Outcome dependent sampling designs are crucial for capturing outcome-exposure relationships when random sampling fails (e.g., with a rare outcome) or is infeasible (e.g., with a long latency period between exposure and outcome). The most commonly used outcome dependent sampling design is the case-control study, and a number of other outcome dependent sampling designs have emerged that rely on the same principle of sampling based on key features of the outcome distribution. While these designs are implemented effectively over a broad spectrum of scientific settings, they do not appear to be used in analyses involving longitudinal data. Longitudinal studies address important research questions involving relationships that occur over time, yet the outcomes may be rare (e.g., a severe adverse event), or exposure and/or outcome ascertainment may be extremely costly. The overall goal of this application is the development of outcome dependent sampling designs and analysis approaches that can be used in longitudinal studies.
Aim 1 will explore designs where longitudinal response data are readily available (e.g., ongoing cohort studies), but a key exposure (e.g., a biomarker) or adjustment covariate must be ascertained in order to carry out the analysis. Since exposure ascertainment can be costly, researchers should base subject selection on key features of the individual response vectors.
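Selection based on features of the response vector can be illustrated with a minimal Python sketch. It assumes a simple three-stratum rule for binary response vectors (no events, some events, all events) and hypothetical selection probabilities; the function names `stratum` and `ods_sample` are illustrative only and are not taken from the project.

```python
import random


def stratum(responses):
    """Classify a binary (0/1) response vector as 'none', 'some', or 'all' events."""
    total = sum(responses)
    if total == 0:
        return "none"
    if total == len(responses):
        return "all"
    return "some"


def ods_sample(cohort, sampling_probs, seed=0):
    """Select subjects with a probability that depends on their response stratum.

    cohort: dict mapping subject id -> list of 0/1 responses over time.
    sampling_probs: dict mapping stratum name -> selection probability
        (hypothetical values chosen by the investigator).
    Returns a dict mapping each selected subject id to the probability
    with which that subject was selected.
    """
    rng = random.Random(seed)
    selected = {}
    for subject_id, responses in cohort.items():
        p = sampling_probs[stratum(responses)]
        if rng.random() < p:
            selected[subject_id] = p
    return selected


# Hypothetical cohort: only the selected subjects would have the costly
# exposure (e.g., a biomarker) measured.
cohort = {"a": [0, 0, 0], "b": [0, 1, 0], "c": [1, 1, 1], "d": [0, 0, 1]}
chosen = ods_sample(cohort, {"none": 0.1, "some": 1.0, "all": 0.1})
```

In this sketch, subjects whose responses vary over time are always selected, while subjects with constant responses are selected only occasionally. The selection probabilities are kept because an analysis of the resulting sample would need them.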
Aim 2 is distinguished from Aim 1 in that subjects are sampled prior to observing their outcome vectors. Researchers must therefore use available information on individual subjects to select the sample that will maximize the likelihood of estimating parameters efficiently (e.g., with low variance). This involves targeted sampling based on an ancillary covariate or covariate vector believed to be related to the response vector.
Aim 3 differs from the other two aims in that sampling does not occur at the individual subject level. Rather, sampling is at the level of time. When a cohort is followed and exposure or outcome ascertainment costs are high, researchers should sample subjects at the times that are most informative in order to balance improved estimation efficiency against ascertainment costs.
In all three aims, the observed samples do not represent the target populations, so the data cannot be analyzed as if the samples were randomly drawn. Instead, data analysis strategies must be adjusted to acknowledge the study design. This research will examine sampling strategies and analysis approaches that validly and efficiently address important scientific questions. Each aim contains two sub-aims: sub-aim 1 addresses binary response vector data, and sub-aim 2 generalizes the results to other (continuous) response distributions.
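The need to adjust the analysis for the sampling design can be illustrated with inverse probability weighting, a standard correction for unequal selection probabilities. This is a minimal sketch under that assumption and is not necessarily the analysis approach developed in this project; the function name `ipw_mean` and the numbers in the example are hypothetical.

```python
def ipw_mean(values, sampling_probs):
    """Weighted mean in which each observation is weighted by 1 / (its selection probability)."""
    weights = [1.0 / p for p in sampling_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical population of 100 subjects, 10 of whom have the outcome.
# All 10 subjects with the outcome are sampled (probability 1.0), and
# 9 of the 90 subjects without it are sampled (probability 0.1).
values = [1] * 10 + [0] * 9
probs = [1.0] * 10 + [0.1] * 9

naive = sum(values) / len(values)      # ignores the design
weighted = ipw_mean(values, probs)     # acknowledges the design
```

Here the unweighted sample proportion is 10/19, which overstates the population prevalence of 0.10 because subjects with the outcome were oversampled. The weighted estimate recovers 0.10.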

Public Health Relevance

This research will develop outcome dependent sampling study designs and analysis strategies for longitudinal data. Longitudinal studies address important scientific questions involving relationships that occur over time, and outcome dependent sampling designs target the subjects who will be most informative for estimating target parameters. These designs will permit efficient parameter estimation at reduced study costs.

Agency
National Institutes of Health (NIH)
Institute
National Heart, Lung, and Blood Institute (NHLBI)
Type
Research Project (R01)
Project #
5R01HL094786-02
Application #
7922719
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Wolz, Michael
Project Start
2009-09-01
Project End
2013-05-31
Budget Start
2010-06-01
Budget End
2011-05-31
Support Year
2
Fiscal Year
2010
Total Cost
$194,042
Indirect Cost
Name
Vanderbilt University Medical Center
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
004413456
City
Nashville
State
TN
Country
United States
Zip Code
37212
Schildcrout, Jonathan S; Schisterman, Enrique F; Mercaldo, Nathaniel D et al. (2018) Extending the Case-Control Design to Longitudinal Data: Stratified Sampling Based on Repeated Binary Outcomes. Epidemiology 29:67-75
Rivera-Rodriguez, Claudia L; Resch, Stephen; Haneuse, Sebastien (2018) Quantifying and reducing statistical uncertainty in sample-based health program costing studies in low- and middle-income countries. SAGE Open Med 6:2050312118765602
Haneuse, Sebastien; Rivera-Rodriguez, Claudia (2018) On the Analysis of Case-Control Studies in Cluster-correlated Data Settings. Epidemiology 29:50-57
Schildcrout, Jonathan S; Schisterman, Enrique F; Aldrich, Melinda C et al. (2018) Outcome-related, Auxiliary Variable Sampling Designs for Longitudinal Binary Data. Epidemiology 29:58-66
Gail, Mitchell H; Haneuse, Sebastien (2017) Power and sample size for multivariate logistic modeling of unmatched case-control studies. Stat Methods Med Res :962280217737157
Huang, Alan; Rathouz, Paul J (2017) Orthogonality of the Mean and Error Distribution in Generalized Linear Models. Commun Stat Theory Methods 46:3290-3296
Schildcrout, Jonathan S; Denny, Joshua C; Roden, Dan M (2017) On the Potential of Preemptive Genotyping Towards Preventing Medication-Related Adverse Events: Results from the South Korean National Health Insurance Database. Drug Saf 40:1-2
Schildcrout, Jonathan S; Shi, Yaping; Danciu, Ioana et al. (2016) A prognostic model based on readily available clinical data enriched a pre-emptive pharmacogenetic testing program. J Clin Epidemiol 72:107-15
Schildcrout, Jonathan S; Rathouz, Paul J; Zelnick, Leila R et al. (2015) Biased Sampling Designs to Improve Research Efficiency: Factors Influencing Pulmonary Function Over Time in Children with Asthma. Ann Appl Stat 9:731-753
McDaniel, Lee S; Henderson, Nicholas C; Rathouz, Paul J (2013) Fast Pure R Implementation of GEE: Application of the Matrix Package. R J 5:181-187
