Understanding the association of putative risk factors with disease incidence or time to clinical events is a central theme in epidemiology and is also frequently studied in other medical research fields. Cohort studies are ideal designs for studying those problems, but they are costly in terms of follow-up time and resources. In survival analysis, sampling prevalent data from living individuals who have experienced a certain initial event before recruitment is another way to reduce cost compared to sampling incident cases. However, the data are subject to lead-time bias, and specialized methodologies are needed for bias correction. Sampling prevalent data is frequently done in practice, but the need for proper analysis is often unrecognized. Moreover, practical but complex prevalent cohort designs for studying disease incidence or case fatality are rarely studied. The research team's long-term goal is to develop a set of practically useful statistical tools to design and analyze prevalent sampled survival data for public health applications. The main objective of the proposed research is to formulate the statistical foundations of several complex prevalent designs and to propose valid statistical methods for making scientific inference. The proposed research is possible because the investigators are leading experts in cross-sectional survival analysis. We specifically aim to 1) develop valid design and analysis tools for prevalent case-control studies that correct a notorious survival bias in the design and disentangle the estimation of relative disease risk and life expectancy for diseased and non-diseased individuals; 2) propose and evaluate degenerate follow-up designs and prevalent current status designs where the cost of follow-up can be eliminated or greatly reduced; 3) propose statistical methods to analyze biomarker data with prevalent sampling bias.
The proposed research is innovative because it utilizes the investigators' recent discovery of many useful data structures in prevalent sampling that are not known in the existing biostatistics and epidemiology literature. The proposed research is significant because it is expected to provide practically useful tools for designing future epidemiological studies, to enhance the validity of scientific conclusions based on biased sampling designs, to yield clinically and scientifically relevant estimates, to have a wide scope of applications, to give researchers a different avenue for studying important scientific questions when funding is reduced, and to allow funding agencies to maximize the use of public resources for scientific discoveries.

Public Health Relevance

The proposed research is relevant to public health because it directly targets the fundamental concept of the design of medical studies for identifying risk factors for disease incidence and mortality. It is also relevant to the part of NIH's mission that fosters innovative research strategies for studying health problems and ensures a continued high return on the public investment in research.

National Institutes of Health (NIH)
National Heart, Lung, and Blood Institute (NHLBI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Wolz, Michael
University of Washington
Biostatistics & Other Math Sci
Schools of Public Health
United States
Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng (2016) Nonparametric analysis of bivariate gap time with competing risks. Biometrics 72:780-90
Chan, Kwun Chuen Gary; Yam, Sheung Chi Phillip; Zhang, Zheng (2016) Globally efficient non-parametric inference of average treatment effects by empirical balancing calibration weighting. J R Stat Soc Series B Stat Methodol 78:673-700
Jewell, Nicholas P (2016) Natural history of diseases: Statistical designs and issues. Clin Pharmacol Ther 100:353-61
Chan, Kwun Chuen Gary (2016) Reader reaction: Instrumental variable additive hazards models with exposure-dependent censoring. Biometrics 72:1003-5
Yee, Laura M; Chan, Kwun Chuen Gary (2015) Nonparametric inference for time-dependent incremental cost-effectiveness ratios. Stat Med 34:4057-69
Chan, Kwun Chuen Gary; Qin, Jing (2015) Rank-based testing of equal survivorship based on cross-sectional survival data with or without prospective follow-up. Biostatistics 16:772-84
Yee, Laura M; Chan, Kwun Chuen Gary (2015) Nonparametric inference for the joint distribution of recurrent marked variables and recurrent survival time. Lifetime Data Anal :