Outcome-dependent and auxiliary-variable-dependent sampling designs are highly efficient compared to standard random sampling. In an era when budgets are tightening and electronic health and large-scale cohort study data are becoming increasingly available for research, study designs that use existing resources and data to inform sampling schemes can concentrate resources on the most informative subjects and/or times. These targeted sampling study designs (e.g., case-control and case-cohort) are ubiquitous in many areas of medical and public health research; however, there has been relatively little research on designs involving longitudinal data. Longitudinal studies can address important research questions involving within- and among-individual changes in exposures and outcomes over time. The broad goal of this application is to develop a framework for conducting cost-efficient research with longitudinal data. It will include several classes of designs, distinguished by aim; semi-parametric and likelihood-based methods for analysis; and software that will permit preemptive sample size and power calculations and analyses of ascertained data.
Aim 1 extends our earlier work on outcome-dependent sampling (ODS) designs to designs that sample based on multiple longitudinal outcomes. Such multiple ODS (MODS) designs enable low-cost retrospective studies of the pleiotropic effects of one or more expensive-to-ascertain exposure variables. In contrast to Aim 1, Aim 2 designs are prospective. They are conducted with outcome history and auxiliary variable dependent sampling (OHADS) schemes that alter within-subject sampling probabilities dynamically based on data that have been accumulated. A primary feature of these designs is that they allow researchers to weigh exposure and outcome ascertainment costs against the information expected from ascertainment at each time point.
Aim 3 proposes adaptive ODS (AODS) designs that retrospectively sample subjects in waves. After each wave of subjects is collected and the data are summarized, the design is modified based on the goals of the study, which will often include estimation efficiency and robustness to modeling assumptions. In contrast to the other aims, which consider only two-level data (subject and time), Aim 4 considers multi-stage outcome-dependent and auxiliary-variable-dependent sampling (mSODS and mSADS) of hierarchical and hierarchical longitudinal data. In these designs, sampling occurs at multiple levels of the hierarchical data.
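
As a rough illustration of the general idea behind outcome-dependent sampling for longitudinal data, the sketch below classifies each subject by a repeated binary outcome and samples subjects with stratum-specific probabilities, keeping an inverse-probability weight for each sampled subject. It is a toy example under stated assumptions, not the designs or analysis methods proposed in this application: the function names, the three-stratum summary ("none", "some", "all"), the toy cohort, and the sampling probabilities are all hypothetical.

```python
import random


def outcome_stratum(outcomes):
    """Summarize a subject's binary outcome series as 'none', 'some', or 'all'."""
    total = sum(outcomes)
    if total == 0:
        return "none"
    if total == len(outcomes):
        return "all"
    return "some"


def ods_sample(cohort, sampling_probs, seed=0):
    """Sample subjects with a probability that depends on their outcome stratum.

    cohort: dict mapping subject id -> list of 0/1 outcomes over time.
    sampling_probs: dict mapping stratum -> sampling probability
        (hypothetical values chosen by the analyst).
    Returns a dict mapping sampled subject id -> (stratum, weight), where
    weight = 1 / sampling probability.
    """
    rng = random.Random(seed)
    sampled = {}
    for subject_id, outcomes in cohort.items():
        stratum = outcome_stratum(outcomes)
        prob = sampling_probs[stratum]
        # rng.random() lies in [0, 1), so prob = 1.0 always samples
        # and prob = 0.0 never does.
        if rng.random() < prob:
            sampled[subject_id] = (stratum, 1.0 / prob)
    return sampled


# Toy cohort: subjects whose outcome varies over time are always sampled;
# subjects with a constant outcome are sampled 20% of the time.
cohort = {
    "s1": [0, 0, 0, 0],
    "s2": [0, 1, 0, 1],
    "s3": [1, 1, 1, 1],
    "s4": [0, 0, 1, 0],
    "s5": [0, 0, 0, 0],
}
probs = {"none": 0.2, "some": 1.0, "all": 0.2}
sample = ods_sample(cohort, probs, seed=1)
```

Oversampling the "some" stratum concentrates expensive exposure ascertainment on subjects with within-subject outcome variation. The stored weights show one simple way a later analysis could account for the non-representative sample; the semi-parametric and likelihood-based methods named in the abstract are not implemented here.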

Public Health Relevance

This application is a competing renewal that will build upon the research completed during the first funding cycle to develop a broad framework for targeted sampling study designs for longitudinal data. Targeted sampling designs that exploit pre-existing data can be highly cost- and resource-efficient compared to standard designs; however, valid analyses must account for the non-representativeness of the observed sample. While targeted designs are ubiquitous in many epidemiological, medical, and genetic applications, their implementation in longitudinal data settings is rare due to the lack of research methods and analytical tools.

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Research Project (R01)
Project #: 5R01HL094786-06
Application #: 9271993
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Wolz, Michael
Project Start: 2009-09-01
Project End: 2020-04-30
Budget Start: 2017-05-01
Budget End: 2018-04-30
Support Year: 6
Fiscal Year: 2017
Total Cost:
Indirect Cost:
Name: Vanderbilt University Medical Center
Department:
Type:
DUNS #: 079917897
City: Nashville
State: TN
Country: United States
Zip Code: 37232
Haneuse, Sebastien; Rivera-Rodriguez, Claudia (2018) On the Analysis of Case-Control Studies in Cluster-correlated Data Settings. Epidemiology 29:50-57
Schildcrout, Jonathan S; Schisterman, Enrique F; Aldrich, Melinda C et al. (2018) Outcome-related, Auxiliary Variable Sampling Designs for Longitudinal Binary Data. Epidemiology 29:58-66
Schildcrout, Jonathan S; Schisterman, Enrique F; Mercaldo, Nathaniel D et al. (2018) Extending the Case-Control Design to Longitudinal Data: Stratified Sampling Based on Repeated Binary Outcomes. Epidemiology 29:67-75
Rivera-Rodriguez, Claudia L; Resch, Stephen; Haneuse, Sebastien (2018) Quantifying and reducing statistical uncertainty in sample-based health program costing studies in low- and middle-income countries. SAGE Open Med 6:2050312118765602
Gail, Mitchell H; Haneuse, Sebastien (2017) Power and sample size for multivariate logistic modeling of unmatched case-control studies. Stat Methods Med Res :962280217737157
Huang, Alan; Rathouz, Paul J (2017) Orthogonality of the Mean and Error Distribution in Generalized Linear Models. Commun Stat Theory Methods 46:3290-3296
Schildcrout, Jonathan S; Denny, Joshua C; Roden, Dan M (2017) On the Potential of Preemptive Genotyping Towards Preventing Medication-Related Adverse Events: Results from the South Korean National Health Insurance Database. Drug Saf 40:1-2
Schildcrout, Jonathan S; Shi, Yaping; Danciu, Ioana et al. (2016) A prognostic model based on readily available clinical data enriched a pre-emptive pharmacogenetic testing program. J Clin Epidemiol 72:107-15
Schildcrout, Jonathan S; Rathouz, Paul J; Zelnick, Leila R et al. (2015) Biased Sampling Designs to Improve Research Efficiency: Factors Influencing Pulmonary Function Over Time in Children with Asthma. Ann Appl Stat 9:731-753
McDaniel, Lee S; Henderson, Nicholas C; Rathouz, Paul J (2013) Fast Pure R Implementation of GEE: Application of the Matrix Package. R J 5:181-187
