Due to recent computational and theoretical advances, researchers can now analyze longitudinal and family data by using sophisticated parametric and semiparametric models. Effective application of these models, however, has been hindered by the lack of well-developed diagnostic tools for testing the validity of their assumptions. Scientists have thus faced the prospect of interpreting results from models that have not been validated adequately. The aim of this project is to develop, evaluate, and implement new diagnostic tools for checking assumptions of parametric and semiparametric models for longitudinal and family data, with a specific focus on improving inferences about the covariance structure of these models and their goodness-of-fit to the data sets to which they are applied. The project specifically aims to develop: a local influence approach; new first-order and second-order residual diagnostics for assessing the mean and covariance structure of parametric and semiparametric models; diagnostic tools for assessing empirical likelihood; and score test statistics for selecting random-effects components and for testing parametric functions in semiparametric models. As these methods are developed, they will be evaluated and refined through extensive Monte Carlo simulations and data analysis. The efficacy of the tools developed under this award will be demonstrated by applying them to two data sets from longitudinal family studies. Software programs for implementing all of these tools, once tested, will be made available to the public via the internet.

This project will provide much-needed data-mining tools for application in longitudinal and family studies. These statistical tools will help scientists to choose an "appropriate" model for the data of a given study, thus maximizing the likelihood of drawing correct scientific conclusions. The companion software package will allow these tools to be disseminated widely and applied to a broad range of populations under study within the behavioral, social, and economic sciences. This project will apply these new methods to two large databases to address problems of public health and social importance. Results from these applications will show our methods to be useful tools for studying a wide range of issues, including: identification of risk factors for cancer, assessment of the impact of substance use and environmental factors on the neuropsychological development of children, examination of factors associated with the lifespan of system components, and identification of candidate genes for nicotine dependence and its comorbidity with alcohol use and psychiatric disorders.

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 0550988
Program Officer: Cheryl L. Eavey
Project Start:
Project End:
Budget Start: 2006-04-01
Budget End: 2006-10-31
Support Year:
Fiscal Year: 2005
Total Cost: $135,000
Indirect Cost:
Name: Columbia University
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10027