There has been growing interest in developing innovative statistical models for dependent data. The project aims to address the following issues that arise in modeling correlated data: extra zeros in count data, non-linear functional relationships, and measurement errors. Frequently it is of interest to make marginal inference about trends and the effects of explanatory variables on correlated counts. Failure to account for the extra zeros may result in biased parameter estimates and misleading inferences. The investigator proposes a generalization of the standard zero-inflated regression model to the correlated data case, allowing for a Heckman-type selection process. Measurement error models address estimation in situations where the model variables are observed subject to measurement error. The second part of the project focuses on developing semiparametric measurement error models for dependent data using two approaches: the first is based on the idea of Monte Carlo corrected scores, and the second is a generalization of the SIMEX (simulation-extrapolation) method to general semiparametric models.
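To make the simulation-extrapolation idea concrete, the following is a minimal sketch of classical SIMEX for the slope of a simple linear regression. It is not the investigator's semiparametric generalization. It assumes independent observations, a known measurement error standard deviation `sigma_u`, and an exact quadratic through three values of the added-noise multiplier rather than a least-squares fit over a finer grid. All function names and simulation settings are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def quadratic_extrapolate(lams, vals, target=-1.0):
    """Evaluate the polynomial through the (lam, val) points at `target`.

    With three points this is the quadratic extrapolant (Lagrange form).
    """
    total = 0.0
    for i, (li, vi) in enumerate(zip(lams, vals)):
        weight = 1.0
        for j, lj in enumerate(lams):
            if j != i:
                weight *= (target - lj) / (li - lj)
        total += weight * vi
    return total


def simex_slope(w, y, sigma_u, lams=(0.0, 1.0, 2.0), n_rep=50, seed=0):
    """SIMEX slope estimate when w = x + u and u ~ N(0, sigma_u^2).

    Simulation step: for each lam, add extra noise with variance
    lam * sigma_u^2 to w and average the naive slope over n_rep replicates.
    Extrapolation step: extrapolate the averaged slopes back to lam = -1,
    which corresponds to no measurement error.
    """
    rng = random.Random(seed)
    mean_slopes = []
    for lam in lams:
        slopes = []
        for _ in range(n_rep):
            w_extra = [wi + (lam ** 0.5) * sigma_u * rng.gauss(0.0, 1.0) for wi in w]
            slopes.append(ols_slope(w_extra, y))
        mean_slopes.append(statistics.fmean(slopes))
    return quadratic_extrapolate(lams, mean_slopes, target=-1.0)


def simulate(n=2000, beta=2.0, sigma_u=0.5, seed=1):
    """Hypothetical data: y = 1 + beta * x + noise, with x observed as w = x + u."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    w = [xi + rng.gauss(0.0, sigma_u) for xi in x]
    return w, y
```

Under these simulation settings the naive slope of `y` on `w` is attenuated toward zero, and `simex_slope(w, y, sigma_u=0.5)` should land closer to the true slope of 2. A quadratic extrapolant generally removes only part of the bias, so the corrected estimate is approximate rather than exactly unbiased.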

Semiparametric regression is concerned with the flexible incorporation of non-linear functional relationships in regression analyses. The investigator is particularly interested in correlated data, which arise in a variety of settings across many areas of application: Biology, Genetics, Bioinformatics, Biostatistics, Medicine, Econometrics, Engineering and Sociology. In many practical situations the attribute or event of interest is rare, or other variables preclude observation of an event; consequently, a higher proportion of the counts may equal zero than standard models imply. Part of the focus of this proposal is to develop semiparametric models for correlated data with extra zeros. When covariates are measured with error, the usual regression estimators are biased, and when the measurement error is substantial, alternative procedures are necessary. Part of the emphasis of the proposal is on developing semiparametric measurement error regression models for dependent data. The statistical methodology to be developed will be disseminated to the statistical community in academia and industry through a series of papers and publicly available programming code. On the educational front, some of the research material in this proposal will be incorporated into courses at the undergraduate and graduate levels. Some projects will also serve as dissertation topics for Ph.D. advisees and will therefore play an important role in the training of future statisticians.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0707106
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2007-07-01
Budget End: 2009-04-30
Support Year:
Fiscal Year: 2007
Total Cost: $91,403
Indirect Cost:

Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850