Outcome dependent sampling designs are crucial for capturing outcome-exposure relationships when random sampling fails (e.g., with a rare outcome) or is infeasible (e.g., with a long latency period between exposure and outcome). The most commonly used outcome dependent sampling design is the case-control study, and a number of other outcome dependent sampling designs have emerged that rely on the same principle of sampling based on key features of the outcome distribution. While these designs are implemented effectively over a broad spectrum of scientific settings, they do not appear to be used in analyses involving longitudinal data. Longitudinal studies address important research questions involving relationships that occur over time, yet the outcomes may be rare (e.g., a severe adverse event), or exposure and/or outcome ascertainment may be extremely costly. The overall goal of this application is the development of outcome dependent sampling designs and analysis approaches that can be used in longitudinal studies.
Aim 1 will explore designs where longitudinal response data are readily available (e.g., ongoing cohort studies), but a key exposure (e.g., a biomarker) or adjustment covariate must be ascertained in order to carry out the analysis. Since exposure ascertainment can be costly, researchers should base subject selection on key features of the individual response vectors.
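The Aim 1 idea of selecting subjects on features of their response vectors can be sketched as follows. This is only an illustration under stated assumptions, not the design proposed in this application: the summary feature (whether a subject's binary response vector contains no events, some events, or all events), the simulated cohort, and the sampling probabilities are all hypothetical choices made for the example.

```python
import random

def stratum(y):
    """Classify a binary response vector by its event count: none, some, or all."""
    total = sum(y)
    if total == 0:
        return "none"
    if total == len(y):
        return "all"
    return "some"

def ods_sample(responses, probs, rng):
    """Return indices of sampled subjects; selection probability depends on stratum."""
    return [i for i, y in enumerate(responses) if rng.random() < probs[stratum(y)]]

rng = random.Random(0)

# Hypothetical cohort: 2000 subjects, 5 visits each, rare outcome (about 5% per visit).
cohort = [[int(rng.random() < 0.05) for _ in range(5)] for _ in range(2000)]

# Hypothetical sampling probabilities: keep every subject with at least one event,
# and only a small fraction of subjects who never experience the outcome.
probs = {"none": 0.05, "some": 1.0, "all": 1.0}

sampled = ods_sample(cohort, probs, rng)
print(len(sampled), "of", len(cohort), "subjects selected for exposure ascertainment")
```

Because selection depends on the outcome, the stratum-specific sampling probabilities would need to be retained so that the analysis can acknowledge the design.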
Aim 2 is distinguished from Aim 1 in that subjects are sampled prior to observing their outcome vectors. Researchers must therefore use available information on individual subjects to select the sample that will maximize the likelihood of estimating parameters efficiently (e.g., with low variance). This involves targeted sampling based on an ancillary covariate or covariate vector believed to be related to the response vector.
Aim 3 differs from the other two aims in that sampling does not occur at the individual subject level. Rather, sampling is at the level of time. When a cohort is followed and exposure or outcome ascertainment costs are high, researchers should sample subjects at the times that are most informative in order to balance improved estimation efficiency with ascertainment costs. In all three aims, the observed samples do not represent the target populations, and so analyses cannot be conducted as if the samples were randomly drawn. Instead, data analysis strategies must be adjusted to acknowledge the study design. This research will examine sampling strategies and analysis approaches that validly and efficiently address important scientific questions. Within each of the aims are two sub-aims, where sub-aim 1 is related to binary response vector data and sub-aim 2 generalizes the results to other (continuous) response distributions.
This research will develop outcome dependent sampling study designs and analysis strategies for longitudinal data. Longitudinal studies address important scientific questions involving relationships that occur over time, and outcome dependent sampling designs target subjects who will be most informative for estimating target parameters. These designs will permit efficient parameter estimation at reduced study costs.
|Gail, Mitchell H; Haneuse, Sebastien (2017) Power and sample size for multivariate logistic modeling of unmatched case-control studies. Stat Methods Med Res :962280217737157|
|Huang, Alan; Rathouz, Paul J (2017) Orthogonality of the Mean and Error Distribution in Generalized Linear Models. Commun Stat Theory Methods 46:3290-3296|
|Schildcrout, Jonathan S; Denny, Joshua C; Roden, Dan M (2017) On the Potential of Preemptive Genotyping Towards Preventing Medication-Related Adverse Events: Results from the South Korean National Health Insurance Database. Drug Saf 40:1-2|
|Schildcrout, Jonathan S; Shi, Yaping; Danciu, Ioana et al. (2016) A prognostic model based on readily available clinical data enriched a pre-emptive pharmacogenetic testing program. J Clin Epidemiol 72:107-15|
|Schildcrout, Jonathan S; Rathouz, Paul J; Zelnick, Leila R et al. (2015) BIASED SAMPLING DESIGNS TO IMPROVE RESEARCH EFFICIENCY: FACTORS INFLUENCING PULMONARY FUNCTION OVER TIME IN CHILDREN WITH ASTHMA. Ann Appl Stat 9:731-753|
|McDaniel, Lee S; Henderson, Nicholas C; Rathouz, Paul J (2013) Fast Pure R Implementation of GEE: Application of the Matrix Package. R J 5:181-187|
|Schildcrout, Jonathan S; Garbett, Shawn P; Heagerty, Patrick J (2013) Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics 69:405-16|
|Schildcrout, Jonathan S; Mumford, Sunni L; Chen, Zhen et al. (2012) Outcome-dependent sampling for longitudinal binary response data based on a time-varying auxiliary variable. Stat Med 31:2441-56|
|Huang, Alan; Rathouz, Paul J (2012) Proportional likelihood ratio models for mean regression. Biometrika 99:223-229|
|Schildcrout, Jonathan S; Heagerty, Patrick J (2011) Outcome-dependent sampling from existing cohorts with longitudinal binary response data: study planning and analysis. Biometrics 67:1583-93|