Understanding the association of putative risk factors with disease incidence or time to clinical events is a central theme in epidemiology and is also frequently studied in other medical research fields. Cohort studies are ideal designs for studying such problems, but they are costly in terms of follow-up time and resources. In survival analysis, sampling prevalent data from living individuals who have experienced a certain initial event before recruitment is a way to reduce cost compared to sampling incident cases. However, the data are subject to lead-time bias, and specialized methodologies are needed for bias correction. Sampling prevalent data is frequently done in practice, but the need for proper analysis is often unrecognized. Moreover, practical but complex prevalent cohort designs for studying disease incidence or case fatality are rarely studied. The research team's long-term goal is to develop a set of practically useful statistical tools to design and analyze prevalent sampled survival data for public health applications. The main objective of the proposed research is to formulate the statistical foundations of several complex prevalent designs and to propose valid statistical methods for making scientific inference. The proposed research is possible because the investigators are leading experts in cross-sectional survival analysis. We specifically aim to 1) develop valid design and analysis tools for prevalent case-control studies by correcting a notorious survival bias in the design and disentangling the estimation of relative disease risk and life expectancy for diseased and non-diseased individuals; 2) propose and evaluate degenerate follow-up designs and prevalent current status designs where the cost of follow-up can be eliminated or greatly reduced; and 3) propose statistical methods to analyze biomarker data with prevalent sampling bias.
The proposed research is innovative because it utilizes the investigators' recent discovery of many useful data structures in prevalent sampling that are not known in the existing biostatistics and epidemiology literature. The proposed research is significant because it is expected to provide practically useful tools for designing future epidemiological studies, to enhance the validity of scientific conclusions based on biased sampling designs, to study clinically and scientifically relevant estimates, to have a wide scope of applications, to give researchers a different avenue for studying important scientific questions when funding is reduced, and to allow funding agencies to maximize the use of public resources for scientific discoveries.

Public Health Relevance

The proposed research is relevant to public health because it directly targets the fundamental concept of the design of medical studies for identifying risk factors for disease incidence and mortality. It is also relevant to the part of NIH's mission that seeks to foster innovative research strategies for studying health problems and to ensure a continued high return on the public investment in research.

National Institutes of Health (NIH)
National Heart, Lung, and Blood Institute (NHLBI)
Research Project (R01)
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Wolz, Michael
University of Washington
Biostatistics & Other Math Sci
Schools of Public Health
United States
Bai, Jiawei; Sun, Yifei; Schrack, Jennifer A et al. (2018) A two-stage model for wearable device data. Biometrics 74:744-752
Chan, Kwun Chuen Gary; Ling, Hok Kan; Sit, Tony et al. (2018) Estimation of a monotone density in s-sample biased sampling models. Ann Stat 46:2125-2152
Sun, Yifei; Chan, Kwun Chuen Gary; Qin, Jing (2018) Simple and fast overidentified rank estimation for right-censored length-biased data and backward recurrence time. Biometrics 74:77-85
Wong, Raymond K W; Chan, Kwun Chuen Gary (2018) Kernel-based covariate functional balancing for observational studies. Biometrika 105:199-213
Cai, Qing; Wang, Mei-Cheng; Chan, Kwun Chuen Gary (2017) Joint modeling of longitudinal, recurrent events and failure time data for survivor's population. Biometrics 73:1150-1160
Huang, Ming-Yueh; Chan, Kwun Chuen Gary (2017) Joint sufficient dimension reduction and estimation of conditional and average treatment effects. Biometrika 104:583-596
Sun, Yifei; Wang, Mei-Cheng (2017) Evaluating utility measurement from recurrent marker processes in the presence of competing terminal events. J Am Stat Assoc 112:745-756
Chan, Kwun Chuen Gary; Wang, Mei-Cheng (2017) Semiparametric modeling and estimation of the terminal behavior of recurrent marker processes before failure events. J Am Stat Assoc 112:351-362
Chan, Kwun Chuen Gary (2017) Acceleration of Expectation-Maximization algorithm for length-biased right-censored data. Lifetime Data Anal 23:102-112
Yee, Laura M; Chan, Kwun Chuen Gary (2017) Nonparametric inference for the joint distribution of recurrent marked variables and recurrent survival time. Lifetime Data Anal 23:207-222

Showing the most recent 10 out of 18 publications