A reliable and precise prognosis is fundamental for successful disease management and treatment selection. More aggressive intervention can be given to patients who are at high risk of early disease onset, while patients who are unlikely to respond to one treatment should be considered for alternative options. With the rapid advancement of technology, a wide range of biological and genomic markers have emerged as potential tools for improving the prediction of disease and treatment outcomes, and may lead to personalized, tailored medicine. New technologies such as DNA sequencing and microarrays are generating detailed data with exponentially increasing dimensionality and complexity. These data present unprecedented opportunities and great challenges for accurately predicting clinical outcomes. To take full advantage of such data, this proposal aims to develop statistical approaches to efficiently construct and evaluate prognostic tools for disease risk assessment and treatment selection.
Specifically, in Aim 1, we will develop accurate risk prediction models by incorporating complex interactive effects via a kernel machine regression framework. We will also provide non-parametric procedures for assessing the predictive performance of the resulting models.
In Aim 2, we propose inference procedures for absolute risks and prediction performance of new markers using two-phase studies.
In Aim 3, we develop systematic procedures for identifying subgroups of patients who may or may not benefit from a new treatment using patient-level baseline marker information.
In Aim 4, we focus on high-dimensional regression and develop regularized resampling methods to construct confidence intervals and hypothesis testing procedures for regression coefficients and the prediction performance of estimated models.
To increase the practical impact of our research, in addition to creating software for public use, we will apply the proposed procedures to predict individual risk of developing (i) rheumatoid arthritis among women using the Nurses' Health Study (NHS); (ii) cardiovascular disease (CVD) among diabetic patients using the NHS and the Health Professionals Follow-up Study; (iii) AIDS-defining events among HIV-infected patients using a large immunogenetic study; and (iv) coronary heart disease (CHD) or stroke using the Women's Health Initiative (WHI) study. We also plan to develop algorithms to identify cases of various autoimmune diseases using electronic medical record (EMR) data from two large hospitals in Boston. The identified cases will be used for subsequent genetic case-control studies of the corresponding diseases. Such algorithms will enable the use of EMR clinical data directly for discovery research. In addition, we will develop treatment selection strategies for HIV-infected patients using randomized AIDS Clinical Trials Group (ACTG) trials and for dietary intervention in preventing CVD using WHI clinical trials. Incorporating genetic profiles and modifiable risk factors, along with biologic markers, into risk models is likely to improve the prediction of clinical outcomes and ultimately lead to personalized medicine.

Public Health Relevance

This research proposal addresses the pressing need for advanced statistical tools that meet the challenges in the current development of prediction models for disease risk and treatment benefit. By providing statistical tools that enable clinical investigators to effectively develop personalized disease management strategies, this proposal will join prior and ongoing research activities toward the goal of finding efficient and cost-effective personalized medicine.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM079330-07
Application #
8719125
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Marcus, Stephen
Project Start
2007-06-01
Project End
2015-06-30
Budget Start
2014-07-01
Budget End
2015-06-30
Support Year
7
Fiscal Year
2014
Total Cost
Indirect Cost
Name
Harvard University
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
City
Boston
State
MA
Country
United States
Zip Code
02115
Liu, Dandan; Cai, Tianxi; Lok, Anna et al. (2018) Nonparametric Maximum Likelihood Estimators of Time-Dependent Accuracy Measures for Survival Outcome Under Two-Stage Sampling Designs. J Am Stat Assoc 113:882-892
Xia, Yin; Cai, Tianxi; Cai, T Tony (2018) Multiple Testing of Submatrices of a Precision Matrix with Applications to Identification of Between Pathway Interactions. J Am Stat Assoc 113:328-339
Sinnott, Jennifer A; Cai, Tianxi (2018) Pathway aggregation for survival prediction via multiple kernel learning. Stat Med 37:2501-2515
Zheng, Yingye; Brown, Marshall; Lok, Anna et al. (2017) Improving Efficiency in Biomarker Incremental Value Evaluation Under Two-Phase Designs. Ann Appl Stat 11:638-654
Maziarz, Marlena; Heagerty, Patrick; Cai, Tianxi et al. (2017) On longitudinal prediction with time-to-event outcome: Comparison of modeling options. Biometrics 73:83-93
Zhou, Qian M; Dai, Wei; Zheng, Yingye et al. (2017) Robust Dynamic Risk Prediction with Longitudinal Studies. Stat Theory Relat Fields 1:159-170
Cai, Tianxi; Cai, T Tony; Zhang, Anru (2016) Structured Matrix Completion with Applications to Genomic Data Integration. J Am Stat Assoc 111:621-633
Zhao, Lihui; Claggett, Brian; Tian, Lu et al. (2016) On the restricted mean survival time curve in survival analysis. Biometrics 72:215-221
Shen, Yuanyuan; Cai, Tianxi (2016) Identifying predictive markers for personalized treatment selection. Biometrics 72:1017-1025
Sinnott, Jennifer A; Cai, Tianxi (2016) Inference for survival prediction under the regularized Cox model. Biostatistics 17:692-707

Showing the most recent 10 out of 58 publications