In cancer epidemiology cohort studies, the disease of interest is often rare, while the study hypothesis is complex and involves a large number of risk factors. These studies usually require long-term follow-up to accrue an adequate number of cancer events and to elucidate the course of the disease, so assembling data for the entire cohort can be prohibitively expensive. The nested case-control (NCC) design is a popular sampling method, largely because of its cost-effectiveness. In practice, NCC data are commonly analyzed using Cox's proportional hazards (PH) model. A direct consequence of the PH model is that the ratio of hazard functions for different covariate values is assumed to remain constant over the entire follow-up period. Given the long-term observation and the complexity of the relationships explored in large-scale cancer studies, the proportional hazards assumption may easily be violated. Extensions of Cox's model that accommodate time-varying covariate effects are therefore necessary to improve modeling flexibility and critical to elucidating the etiology of cancer. However, methodological development for such flexible models has focused mainly on cohort studies, and their use in NCC studies remains limited. In this project, we propose to study the Cox model with time-varying coefficients to characterize the temporal effects of cancer risk factors in NCC studies.
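As a brief illustration of the contrast drawn above, the two models can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: \lambda_0(t) denotes an unspecified baseline hazard, Z a covariate vector, \beta a constant coefficient vector, and \beta(t) a coefficient function of time.

```latex
% Cox PH model: constant coefficients, so the hazard ratio does not depend on t
\lambda(t \mid Z) = \lambda_0(t)\,\exp\{\beta^{\top} Z\},
\qquad
\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)} = \exp\{\beta^{\top}(z_1 - z_2)\}.

% Cox model with time-varying coefficients: the hazard ratio may change over follow-up
\lambda(t \mid Z) = \lambda_0(t)\,\exp\{\beta(t)^{\top} Z\},
\qquad
\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)} = \exp\{\beta(t)^{\top}(z_1 - z_2)\}.
```

The PH model is the special case in which \beta(t) is constant in t, which is why a test of whether a coefficient function is constant can serve as a check of the proportional hazards assumption for that risk factor.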
In Aim 1, we propose to develop statistical methodologies to estimate the time-varying coefficient functions using a kernel-weighted partial likelihood approach; to construct point-wise and simultaneous confidence intervals for the estimated time-varying coefficients; to test for and identify time-varying effects of specific risk factors; and to investigate the variable selection problem in the Cox model with time-varying coefficients for NCC data. Once the asymptotic properties of the proposed method are established in theory and the inference procedures are validated using extensive Monte Carlo simulation studies, we will implement the proposed approaches in Aim 2, which focuses on real data analyses and software development. The first part of Aim 2 will be accomplished through collaboration with the New York University Women's Health Study (NYUWHS). In the second part, we will develop and contribute an open-source R package that makes the proposed methodologies freely available to applied researchers. Successful completion of the proposed studies will provide a series of advanced statistical inference approaches for elucidating the temporal effects of risk factors on cancer development in NCC studies. These approaches will substantially improve modeling flexibility in the analysis of NCC data and can be used to assess and validate results obtained from other methods. Furthermore, the application to NYUWHS will provide new insights into, and a better understanding of, the effects of potential risk factors in cancer etiology. The fundamental contribution of a freely available software package is that it will translate the advanced statistical methodologies into practically useful and accessible tools.
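The kernel-weighted partial likelihood named in Aim 1 can be sketched, under simplifying assumptions, as a local-constant estimator. This is an illustrative form only and not necessarily the estimator the project develops. All symbols are assumptions introduced here: t_i is the failure time of case i, Z_i its covariate vector, \tilde{R}_i the sampled risk set for case i (the case together with its selected controls), K a kernel function, and h a bandwidth.

```latex
% Kernel weights concentrate on cases whose failure times are close to t
K_h(u) = h^{-1} K(u / h)

% Local-constant kernel-weighted log partial likelihood at time t,
% with each case compared only against its sampled risk set
\ell(b;\, t) = \sum_{i \in \text{cases}} K_h(t_i - t)
  \left[\, b^{\top} Z_i - \log \sum_{j \in \tilde{R}_i} \exp\{ b^{\top} Z_j \} \right]

% Estimate of the coefficient function at time t
\hat{\beta}(t) = \arg\max_{b}\; \ell(b;\, t)
```

Repeating the maximization over a grid of time points traces out an estimated coefficient curve. As h grows, the weights become nearly equal across cases and the estimate approaches the usual constant-coefficient partial likelihood estimate for NCC data.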

Public Health Relevance

PROJECT NARRATIVE: The nested case-control (NCC) design, a cost-effective sampling method commonly used in cancer epidemiologic studies, calls for flexible statistical approaches to evaluate the association between cancer and risk factors. This research project proposes to develop statistical models and inference approaches that accommodate and characterize the temporal effects of cancer risk factors in NCC studies, to provide novel insights into the temporal relation between disease and its risk factors, and to improve our understanding of cancer etiology. Furthermore, contributing freely available software is essential to equip applied investigators with alternative tools to analyze NCC data and to assess, compare, and validate study results.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Small Research Grants (R03)
Study Section
Special Emphasis Panel (ZCA1-SRLB-D (M1))
Program Officer
Verma, Mukesh
New York University
Public Health & Prev Medicine
Schools of Medicine
New York
United States
Liu, Mengling; Lu, Wenbin; Krogh, Vittorio et al. (2013) Estimation and selection of complex covariate effects in pooled nested case-control studies with heterogeneity. Biostatistics 14:682-94
Shang, Shulian; Liu, Mengling; Zeleniuch-Jacquotte, Anne et al. (2013) Partially Linear Single Index Cox Regression Model in Nested Case-Control Studies. Comput Stat Data Anal 67:199-212
Lu, Wenbin; Liu, Mengling (2012) On estimation of linear transformation models with nested case-control sampling. Lifetime Data Anal 18:80-93
Li, Xiaochun; Liu, Mengling; Goldberg, Judith D (2011) A note on monotonicity assumptions for exact unconditional tests in binary matched-pairs designs. Biometrics 67:1666-8
Liu, Mengling; Lu, Wenbin; Tseng, Chi-Hong (2010) Cox regression in nested case-control studies with auxiliary covariates. Biometrics 66:374-81
Liu, Mengling; Lu, Wenbin; Shore, Roy E et al. (2010) Cox regression model with time-varying coefficients in nested case-control studies. Biostatistics 11:693-706