Outcome Dependent Sampling Studies of Longitudinal Data: Design and Analysis

Schildcrout, Jonathan

Abstract

Outcome dependent sampling designs are crucial for capturing outcome-exposure relationships when random sampling fails (e.g., with a rare outcome) or is infeasible (e.g., with a long latency period between exposure and outcome). The most commonly used outcome dependent sampling design is the case-control study, and, relying on the principle of sampling based on key features of the outcome distribution, a number of other outcome dependent sampling designs have emerged. While these designs are implemented effectively over a broad spectrum of scientific settings, they do not appear to be used in analyses involving longitudinal data. Longitudinal studies address important research questions involving relationships that occur over time, yet the outcomes may be rare (e.g., a severe adverse event), or exposure and/or outcome ascertainment may be extremely costly. The overall goal for this application is the development of outcome dependent sampling designs and analysis approaches that can be used in longitudinal studies.
Aim 1 will explore designs where longitudinal response data are readily available (e.g., ongoing cohort studies), but a key exposure (e.g., a biomarker) or adjustment covariate must be ascertained in order to carry out the analysis. Since exposure ascertainment can be costly, researchers should base subject selection on key features of the individual response vectors.
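To illustrate the kind of selection rule Aim 1 describes, here is a minimal Python sketch of sampling subjects according to a feature of their binary response vectors. The three strata ("none", "some", "all" events), the function names, and the sampling probabilities are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random

def response_stratum(y):
    """Classify a binary response vector by how many events it contains.

    The three-way split is an illustrative assumption, not taken from the abstract.
    """
    total = sum(y)
    if total == 0:
        return "none"   # no events at any visit
    if total == len(y):
        return "all"    # an event at every visit
    return "some"       # response varies over time

def sample_subjects(responses, probs, seed=0):
    """Select each subject with a probability that depends on its stratum.

    responses: dict mapping subject id -> list of 0/1 outcomes over time
    probs:     dict mapping stratum name -> sampling probability (hypothetical values)
    Returns a dict mapping each selected subject id to its sampling probability,
    which must be retained so that the analysis can account for the design.
    """
    rng = random.Random(seed)
    selected = {}
    for subject_id, y in responses.items():
        p = probs[response_stratum(y)]
        if rng.random() < p:
            selected[subject_id] = p
    return selected
```

In this sketch, subjects whose responses vary over time could be given a higher sampling probability than subjects with constant responses, so that costly exposure ascertainment is concentrated on a chosen subset of response patterns.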
Aim 2 is distinguished from Aim 1 in that subjects are sampled prior to observing their outcome vectors. Researchers must therefore utilize available information on individual subjects in order to select the sample that will maximize the likelihood of estimating parameters efficiently (e.g., with low variance). This involves targeted sampling based on an ancillary covariate or covariate vector believed to be related to the response vector.
Aim 3 differs from the other two aims in that sampling does not occur at the individual subject level. Rather, sampling is at the level of time. When a cohort is followed and exposure or outcome ascertainment costs are high, researchers should sample subjects at times that are most informative in order to balance improved estimation efficiency with ascertainment costs.
In all three aims, the observed samples do not represent the target populations, and so analyses cannot be conducted as if the samples were randomly drawn. Instead, data analysis strategies must be adjusted to acknowledge the study design. This research will examine sampling strategies and analysis approaches that validly and efficiently address important scientific questions. Within each of the aims are two sub-aims, where sub-aim 1 is related to binary response vector data and sub-aim 2 generalizes the results to other (continuous) response distributions.
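The need to acknowledge the study design in the analysis can be illustrated with a small sketch. The abstract does not state which analysis approaches will be developed; inverse probability weighting is used here only as one familiar example of a design-adjusted estimator, and the function name and numbers are hypothetical.

```python
def ipw_mean(values, sampling_probs):
    """Weighted mean in which each sampled value is weighted by 1 / (its sampling probability).

    This is a generic inverse-probability-weighted estimator, shown as an
    assumption for illustration rather than as the method proposed in the grant.
    """
    weights = [1.0 / p for p in sampling_probs]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)

# Hypothetical cohort of 10 subjects: 2 with outcome 1 and 8 with outcome 0,
# so the cohort mean is 0.2. Subjects with outcome 1 are sampled with
# probability 1.0 and subjects with outcome 0 with probability 0.25,
# giving an observed sample of two 1s and two 0s.
sample_values = [1, 1, 0, 0]
sample_probs = [1.0, 1.0, 0.25, 0.25]

naive = sum(sample_values) / len(sample_values)   # ignores the design
adjusted = ipw_mean(sample_values, sample_probs)  # accounts for the design
```

Here the naive sample mean overstates the cohort mean because subjects with the outcome were oversampled, while the weighted estimate recovers it in this idealized example.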

Public Health Relevance

This research will develop outcome dependent sampling study designs and analysis strategies for longitudinal data. Longitudinal studies address important scientific questions involving relationships that occur over time, and outcome dependent sampling designs target subjects who will be most informative for estimating target parameters. These designs will permit efficient parameter estimation at reduced study costs.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Research Project (R01)
Project #: 5R01HL094786-02
Application #: 7922719
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Wolz, Michael

Project Start: 2009-09-01
Project End: 2013-05-31
Budget Start: 2010-06-01
Budget End: 2011-05-31
Support Year: 2
Fiscal Year: 2010
Total Cost: $194,042
Indirect Cost

Institution

Name: Vanderbilt University Medical Center
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 004413456

City: Nashville
State: TN
Country: United States
Zip Code: 37212

Related projects


NIH 2019 R01 HL	Outcome Dependent Sampling of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center
NIH 2018 R01 HL	Outcome Dependent Sampling of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center
NIH 2017 R01 HL	Outcome Dependent Sampling of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center
NIH 2016 R01 HL	Outcome Dependent Sampling of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center
NIH 2012 R01 HL	Outcome Dependent Sampling Studies of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center	$190,694
NIH 2011 R01 HL	Outcome Dependent Sampling Studies of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center	$192,615
NIH 2010 R01 HL	Outcome Dependent Sampling Studies of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center	$194,042
NIH 2009 R01 HL	Outcome Dependent Sampling Studies of Longitudinal Data: Design and Analysis Schildcrout, Jonathan Scott / Vanderbilt University Medical Center	$215,251

Publications

Haneuse, Sebastien; Rivera-Rodriguez, Claudia (2018) On the Analysis of Case-Control Studies in Cluster-correlated Data Settings. Epidemiology 29:50-57

Schildcrout, Jonathan S; Schisterman, Enrique F; Aldrich, Melinda C et al. (2018) Outcome-related, Auxiliary Variable Sampling Designs for Longitudinal Binary Data. Epidemiology 29:58-66

Schildcrout, Jonathan S; Schisterman, Enrique F; Mercaldo, Nathaniel D et al. (2018) Extending the Case-Control Design to Longitudinal Data: Stratified Sampling Based on Repeated Binary Outcomes. Epidemiology 29:67-75

Rivera-Rodriguez, Claudia L; Resch, Stephen; Haneuse, Sebastien (2018) Quantifying and reducing statistical uncertainty in sample-based health program costing studies in low- and middle-income countries. SAGE Open Med 6:2050312118765602

Gail, Mitchell H; Haneuse, Sebastien (2017) Power and sample size for multivariate logistic modeling of unmatched case-control studies. Stat Methods Med Res :962280217737157

Huang, Alan; Rathouz, Paul J (2017) Orthogonality of the Mean and Error Distribution in Generalized Linear Models. Commun Stat Theory Methods 46:3290-3296

Schildcrout, Jonathan S; Denny, Joshua C; Roden, Dan M (2017) On the Potential of Preemptive Genotyping Towards Preventing Medication-Related Adverse Events: Results from the South Korean National Health Insurance Database. Drug Saf 40:1-2

Schildcrout, Jonathan S; Shi, Yaping; Danciu, Ioana et al. (2016) A prognostic model based on readily available clinical data enriched a pre-emptive pharmacogenetic testing program. J Clin Epidemiol 72:107-15

Schildcrout, Jonathan S; Rathouz, Paul J; Zelnick, Leila R et al. (2015) BIASED SAMPLING DESIGNS TO IMPROVE RESEARCH EFFICIENCY: FACTORS INFLUENCING PULMONARY FUNCTION OVER TIME IN CHILDREN WITH ASTHMA. Ann Appl Stat 9:731-753

McDaniel, Lee S; Henderson, Nicholas C; Rathouz, Paul J (2013) Fast Pure R Implementation of GEE: Application of the Matrix Package. R J 5:181-187

Showing the most recent 10 out of 16 publications
