Outcome dependent sampling designs are crucial for capturing outcome-exposure relationships when random sampling fails (e.g., with a rare outcome) or is infeasible (e.g., with a long latency period between exposure and outcome). The most commonly used outcome dependent sampling design is the case-control study, and a number of other outcome dependent sampling designs have emerged that rely on the same principle of sampling based on key features of the outcome distribution. While these designs are implemented effectively across a broad spectrum of scientific settings, they do not appear to be used in analyses involving longitudinal data. Longitudinal studies address important research questions involving relationships that occur over time, yet the outcomes may be rare (e.g., a severe adverse event), or exposure and/or outcome ascertainment may be extremely costly. The overall goal of this application is the development of outcome dependent sampling designs and analysis approaches that can be used in longitudinal studies.
Aim 1 will explore designs where longitudinal response data are readily available (e.g., ongoing cohort studies), but a key exposure (e.g., a biomarker) or adjustment covariate must be ascertained in order to carry out the analysis. Since exposure ascertainment can be costly, researchers should base subject selection on key features of the individual response vectors.
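The selection step described in Aim 1 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each subject's binary response vector is summarized by whether the outcome occurred at none, some, or all of the visits, and that each stratum has its own sampling probability. The function names, the simulated cohort, and the probabilities are hypothetical and are not taken from the proposal.

```python
import random

def stratum(y):
    """Classify a binary response vector as 'none', 'some', or 'all' events."""
    total = sum(y)
    if total == 0:
        return "none"
    if total == len(y):
        return "all"
    return "some"

def ods_sample(responses, probs, rng):
    """Select subjects with a probability that depends on their stratum.

    Returns (subject_id, stratum, sampling_probability) for each selected
    subject, so the design can be accounted for later in the analysis.
    """
    selected = []
    for subject_id, y in responses.items():
        label = stratum(y)
        p = probs[label]
        if rng.random() < p:
            selected.append((subject_id, label, p))
    return selected

# Hypothetical cohort: 1000 subjects, 5 visits each, rare binary outcome.
rng = random.Random(0)
cohort = {i: [int(rng.random() < 0.05) for _ in range(5)] for i in range(1000)}

# Illustrative design: keep every subject with at least one event and
# only a small fraction of subjects with no events.
probs = {"none": 0.05, "some": 1.0, "all": 1.0}
sample = ods_sample(cohort, probs, rng)
```

Only the selected subjects would then have the costly exposure measured. Recording the sampling probability alongside each subject is a design choice made here so that a later analysis can acknowledge how the sample was drawn.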
Aim 2 is distinguished from Aim 1 in that subjects are sampled prior to observing their outcome vectors. Researchers must therefore utilize available information on individual subjects in order to select the sample that will maximize the likelihood of estimating parameters efficiently (e.g., with low variance). This involves targeted sampling based on an ancillary covariate or covariate vector believed to be related to the response vector.
Aim 3 differs from the other two aims in that sampling does not occur at the level of the individual subject. Rather, sampling is at the level of time. When a cohort is followed and exposure or outcome ascertainment costs are high, researchers should sample subjects at the times that are most informative in order to balance improved estimation efficiency against ascertainment costs.
In all three aims, the observed samples do not represent the target populations, so analyses cannot be conducted as if the samples were randomly drawn. Instead, data analysis strategies must be adjusted to acknowledge the study design. This research will examine sampling strategies and analysis approaches that validly and efficiently address important scientific questions. Each aim contains two sub-aims: sub-aim 1 addresses binary response vector data, and sub-aim 2 generalizes the results to other (continuous) response distributions.
This research will develop outcome dependent sampling study designs and analysis strategies for longitudinal data. Longitudinal studies address important scientific questions involving relationships that occur over time, and outcome dependent sampling designs target subjects who will be most informative for estimating target parameters. These designs will permit efficient parameter estimation at reduced study costs.
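As one generic illustration of what it means for an analysis to acknowledge the study design, inverse probability weighting reweights each sampled subject's contribution by the reciprocal of that subject's known sampling probability. This is a standard technique shown here only as an example; the text does not state which analysis approaches the research will adopt. The symbols $S_i$, $\pi_i$, and $U_i$ are notation introduced for this sketch.

```latex
% S_i   : indicator that subject i is selected into the sample
% \pi_i : known probability that subject i is selected under the design
% U_i   : subject i's contribution to an unbiased estimating function
%         for the target parameter \beta under random sampling
\sum_{i=1}^{N} \frac{S_i}{\pi_i}\, U_i(\beta) = 0
```

When $\pi_i$ is the selection probability given the information used for sampling and is bounded away from zero, $E[S_i/\pi_i \mid \text{data}_i] = 1$, so the weighted estimating function keeps the expectation it has under random sampling.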
Schildcrout, Jonathan S; Garbett, Shawn P; Heagerty, Patrick J (2013) Outcome vector dependent sampling with longitudinal continuous response data: stratified sampling based on summary statistics. Biometrics 69:405-16
Schildcrout, Jonathan S; Haneuse, Sebastien; Peterson, Josh F et al. (2011) Analyses of longitudinal, hospital clinical laboratory data with application to blood glucose concentrations. Stat Med 30:3208-20