We propose innovative and cost-effective sampling designs that will enable investigators to collect more informative samples on a fixed budget. We will also develop and evaluate new, efficient statistical methods that realize the gains these designs provide. User-friendly software and algorithms implementing the proposed designs and methods will be developed and disseminated. The proposed designs are generally multi-stage and use a biased sampling scheme in which the main exposure variable is observed with a probability that depends on the outcome variable and auxiliary covariates. The proposed research responds to the need to design more powerful studies, and these designs have been used in current ongoing studies. These studies are collaborations between the PI and researchers at the National Institute of Environmental Health Sciences to study the effects of environmental exposures on cancer and other diseases.
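As a rough illustration of the biased sampling scheme described above, the following sketch simulates a cohort in which the outcome is known for everyone but the exposure is measured with a probability that depends on the outcome. Everything in it is an assumption made for illustration: the linear model, the slope of 0.5, the selection probabilities of 0.9 and 0.1, and the function names `weighted_slope` and `simulate`. Auxiliary covariates are omitted for brevity. The correction shown is plain inverse-probability weighting, a standard textbook approach, and not the estimated empirical likelihood method named in the aims.

```python
import random


def weighted_slope(xs, ys, ws):
    """Slope of the weighted least-squares line of y on x."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return sxy / sxx


def simulate(n=20000, beta=0.5, seed=1):
    """Simulate outcome-dependent selection (hypothetical settings)."""
    rng = random.Random(seed)
    # One standard deviation of Y under the assumed model Y = beta*X + e.
    cutoff = (beta ** 2 + 1.0) ** 0.5
    xs, ys, ps = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # exposure (expensive to measure)
        y = beta * x + rng.gauss(0.0, 1.0)  # outcome (known for everyone)
        # Oversample subjects whose outcome lies in the tails.
        p = 0.9 if abs(y) > cutoff else 0.1
        if rng.random() < p:                # exposure measured only if selected
            xs.append(x)
            ys.append(y)
            ps.append(p)
    # Naive fit ignores the selection; the weighted fit uses weights 1/p.
    naive = weighted_slope(xs, ys, [1.0] * len(xs))
    ipw = weighted_slope(xs, ys, [1.0 / p for p in ps])
    return naive, ipw, len(xs)


naive, ipw, m = simulate()
print(f"selected {m} subjects; naive slope {naive:.2f}, weighted slope {ipw:.2f}")
```

Under these assumed settings the unweighted fit overstates the slope because selection depends on the outcome, while the weighted fit stays close to the simulated value of 0.5. Weighting is only one way to account for the design; likelihood-based methods of the kind the proposal targets are generally intended to use the sampled data more efficiently.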
The specific aims include: (1) Develop an FDS design and a two-stage FDS design for the Norwegian Mother and Child Cohort Study (MoBa) and evaluate an estimated empirical likelihood method. Data from the MoBa Study as well as the Cancer Risk in Uranium Miners Study will be analyzed. (2) Propose a two-phase probability-dependent sampling (PDS) design for the Gulf Oil Spill Long-term Follow Up Study (GuLF) and evaluate linear regression analysis and a linear mixed model under the PDS design. Data from the GuLF Study will be analyzed with these methods. (3) Develop a two-stage longitudinal outcome-dependent sampling (LODS) design for the Generation R Study, with a sampling scheme based on either the baseline response or the sum of all responses. Data from the Generation R Study as well as the Collaborative Perinatal Project (CPP) will be analyzed with these methods. (4) Develop inference procedures for a continuous secondary response in a two-stage ODS design and a two-stage FDS design. Data from the CPP Study, MoBa Study, and Uranium Miners Study will be analyzed. (5) Conduct power studies and determine the sample size allocation that achieves maximum power for a given budget in studies with two-stage FDS and two-phase PDS designs. The strengths and weaknesses of each proposed method will be critically examined through theoretical investigations and simulation studies. The developed software will be made available through publication and a dedicated web page, which will include a "User's Guide" and illustrative data examples showing how to use it. Successful completion of the proposed research will have a significant impact on how future cost-effective biomedical studies are conducted and how data from these studies are efficiently analyzed.

Public Health Relevance

We propose innovative and cost-effective sampling designs that will enable investigators to collect more informative samples on a fixed budget. We will also develop and evaluate new, efficient statistical methods that realize the gains these designs provide. User-friendly software will be developed and disseminated. These designs have been used in current ongoing epidemiological studies to investigate the effects of environmental exposures on cancer and other diseases. Successful completion of the proposed research will have a significant impact on how future biomedical studies are conducted and how data from these studies are analyzed.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Research Project (R01)
Project #: 5R01ES021900-11
Application #: 8727550
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Dilworth, Caroline H
Project Start: 2013-09-01
Project End: 2017-05-31
Budget Start: 2014-06-01
Budget End: 2015-05-31
Support Year: 11
Fiscal Year: 2014
Total Cost: $266,764
Indirect Cost: $88,564

Name: University of North Carolina Chapel Hill
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 608195277
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599

Pan, Yinghao; Cai, Jianwen; Longnecker, Matthew P et al. (2018) Secondary outcome analysis for data from an outcome-dependent sampling design. Stat Med 37:2321-2337
Kim, Soyoung; Zeng, Donglin; Cai, Jianwen (2018) Analysis of multiple survival events in generalized case-cohort designs. Biometrics :
Chen, Xiaolin; Cai, Jianwen (2018) Reweighted estimators for additive hazard model with censoring indicators missing at random. Lifetime Data Anal 24:224-249
Ni, Ai; Cai, Jianwen (2018) Tuning Parameter Selection in Cox Proportional Hazards Model with a Diverging Number of Parameters. Scand Stat Theory Appl 45:557-570
Pan, Yinghao; Cai, Jianwen; Kim, Sangmi et al. (2018) Regression analysis for secondary response variable in a case-cohort study. Biometrics 74:1014-1022
Ni, Ai; Cai, Jianwen (2018) A regularized variable selection procedure in additive hazards model with stratified case-cohort design. Lifetime Data Anal 24:443-463
Zhou, Qingning; Cai, Jianwen; Zhou, Haibo (2018) Outcome-dependent sampling with interval-censored failure time data. Biometrics 74:58-67
Wang, Ting; Wang, Xiaofei; Zhou, Haibo et al. (2018) Auxiliary variable-enriched biomarker-stratified design. Stat Med 37:4610-4635
Burbank, Allison J; Duran, Charity G; Pan, Yinghao et al. (2018) Gamma tocopherol-enriched supplement reduces sputum eosinophilia and endotoxin-induced sputum neutrophilia in volunteers with asthma. J Allergy Clin Immunol 141:1231-1238.e1
Shook-Sa, Bonnie E; Chen, Ding-Geng; Zhou, Haibo (2017) Using Structural Equation Modeling to Assess the Links between Tobacco Smoke Exposure, Volatile Organic Compounds, and Respiratory Function for Adolescents Aged 6 to 18 in the United States. Int J Environ Res Public Health 14:

Showing the most recent 10 out of 33 publications