Novel Statistical Methods for Data with Missing Values

Chen, Hua

Abstract

Missing covariate values are common in studies of disease risk factors and in many other biomedical studies. Complete-case analysis, which is routinely used, suffers from bias as well as loss of efficiency. Current advanced statistical methods for analyzing such data see limited use in practice because of concerns about robustness, difficulty of implementation, or both. This project aims to develop new statistical methods for modeling missing covariates in regression models, so that inference on regression parameters is robust, efficient, and easy to implement. The objective will be reached in four steps. (1) A general semiparametric odds ratio model is proposed for complex missing data problems. The proposed model makes the likelihood approach commonly used in practice more robust, more flexible, and easier to apply. (2) The likelihood method for regression with missing data is further robustified in three ways: when missing patterns are relatively simple, smoothing spline models for the odds ratio function are proposed; when missing patterns are complex, the likelihood estimator is modified to be doubly robust and locally efficient; and a framework is proposed for sensitivity analysis under general missing data mechanisms. (3) For problems with a large number of covariates subject to missing values, model selection procedures based on imputed complete data under the semiparametric covariate model are studied. Such procedures can be very helpful in studying risk factors for health events, for example in identifying risk factors for bone fracture from a set of potential risk factors subject to missing values. (4) For all the missing data problems under consideration, software implementing the resulting methods will be developed and disseminated.
The proposed research, when completed, will make analyses of biomedical data with missing covariate values more accessible to researchers in many applied fields and thus promote efficient use of valuable data, such as those from HIV and cancer studies.
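
The abstract's claim that complete-case analysis can be biased may be illustrated with a small simulation. The sketch below is not part of the project or its software. The linear model, the sample size, the missingness probabilities, and the function names `ols_slope` and `simulate` are all assumptions chosen for illustration. The covariate is deleted with a probability that depends on the observed outcome, and the least-squares slope from the complete cases is compared with the slope from the full data.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Return (full-data slope, complete-case slope, number of complete cases).

    Assumed data-generating model: y = 1 + 2*x + e, with x and e standard
    normal. Assumed missingness: x is missing with probability 0.9 when
    y > 1 and 0.1 otherwise, so missingness depends only on the observed y.
    """
    rng = random.Random(seed)
    xs, ys, observed = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        p_missing = 0.9 if y > 1.0 else 0.1
        xs.append(x)
        ys.append(y)
        observed.append(rng.random() >= p_missing)

    full_slope = ols_slope(xs, ys)

    # Complete-case analysis: drop every record whose covariate is missing.
    cc_x = [x for x, keep in zip(xs, observed) if keep]
    cc_y = [y for y, keep in zip(ys, observed) if keep]
    cc_slope = ols_slope(cc_x, cc_y)
    return full_slope, cc_slope, len(cc_x)


if __name__ == "__main__":
    full_slope, cc_slope, n_cc = simulate()
    print("full-data slope:    ", round(full_slope, 3))
    print("complete-case slope:", round(cc_slope, 3))
    print("complete cases kept:", n_cc)
```

Under these assumptions the full-data slope should be close to the true value of 2, while the complete-case slope is pulled toward zero. About half of the records are also discarded, which is the efficiency loss the abstract mentions.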

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 1R01CA106355-01A2
Application #: 6986543
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Tiwari, Ram C

Project Start: 2005-06-01
Project End: 2008-05-31
Budget Start: 2005-06-01
Budget End: 2006-05-31
Support Year: 1
Fiscal Year: 2005
Total Cost: $155,386

Institution

Name: University of Illinois at Chicago
Department: Public Health & Prev Medicine
Type: Schools of Public Health
DUNS #: 098987217

City: Chicago
State: IL
Country: United States
Zip Code: 60612

Related projects


NIH 2007 R01 CA	Novel Statistical Methods for Data with Missing Values Chen, Hua Yun / University of Illinois at Chicago	$153,880
NIH 2006 R01 CA	Novel Statistical Methods for Data with Missing Values Chen, Hua Yun / University of Illinois at Chicago	$157,185
NIH 2005 R01 CA	Novel Statistical Methods for Data with Missing Values Chen, Hua Yun / University of Illinois at Chicago	$155,386

Publications

Chen, Hua Yun (2011) Representations of efficient score for coarse data problems based on Neumann series expansion. Ann Inst Stat Math 63:497-509

Chen, Hua Yun; Xie, Hui; Qian, Yi (2011) Multiple imputation for missing values through conditional semiparametric odds ratio models. Biometrics 67:799-809

Chen, Hua Yun (2010) On L convergence of Neumann series approximation in missing data problems. Stat Probab Lett 80:864-873

Chen, Hua Yun (2010) Compatibility of conditionally specified models. Stat Probab Lett 80:670-677

Chen, Hua Yun (2009) Estimation and inference based on Neumann series approximation to locally efficient score in missing data problems. Scand Stat Theory Appl 36:713-734

Chen, Hua Yun; Gao, Shasha (2009) Estimation of average treatment effect with incompletely observed longitudinal data: application to a smoking cessation study. Stat Med 28:2451-2472

Chen, Hua Yun (2007) A semiparametric odds ratio model for measuring association. Biometrics 63:413-421
