Nonparametric Variable Selection and Dimension Reduction for Predictive Models of Clinical Response in Pharmacogenomics Research

Whole-genome gene expression data have been used in pharmacogenomics research to correlate patients' gene expression profiles with a drug's efficacy. For many complex diseases, e.g., cancers, it is anticipated that gene expression profiles will provide predictive models, more precise than those based on standard clinical features, to define patient-specific treatment strategies. However, finding gene expression variations that affect drug response is complicated and challenging. Computational difficulties arise because whole-genome gene expression data are high dimensional and their relationships to drug response may be nonlinear. Therefore, one can no longer rely on existing statistical and computational methods to adequately analyze the data. The long-term objective of the proposed project is to develop statistical and computational methods for the analysis of high-dimensional, low-sample-size data and to apply these methods in pharmacogenomics research. The short-term objective is to develop nonparametric variable selection and dimension reduction techniques for predictive models of clinical response based on gene expression data. Three specific aims will be pursued: 1) Develop nonparametric variable selection approaches using LOESS (locally weighted scatterplot smoothing), which does not assume a linear or any other specific form of predictive model for clinical response; 2) Extend Sliced Inverse Regression (SIR) to dimension reduction problems in which the dimension is much larger than the sample size, as is the case in pharmacogenomics; 3) Apply the proposed methods in pharmacogenomics studies whose data are available in Gene Expression Omnibus (GEO) DataSets, www.ncbi.nlm.nih.gov/gds. The proposed variable selection and dimension reduction methods generalize to other regression problems in which the regression functions do not have specific forms and the data are big in the sense of very high-dimensional predictors but relatively low sample size. Software implementing the analysis will be written in the R language and will be fully documented for easy use by the biomedical research community.

Public Health Relevance

The proposed research will provide statistical methods and computational tools for the analysis of large-scale pharmacogenomics data. The project focuses on finding genomic variations that affect drug response and will use this information to develop predictive models of clinical response. It will contribute to the important public health endeavor of designing patient treatment strategies based on each individual's unique genetic makeup.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21GM101504-02
Application #: 8789367
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Friedman, Fred K
Project Start: 2014-01-06
Project End: 2015-12-31
Budget Start: 2015-01-01
Budget End: 2015-12-31
Support Year: 2
Fiscal Year: 2015
Total Cost: $133,134
Indirect Cost: $43,134

Name: Purdue University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 072051394
City: West Lafayette
State: IN
Country: United States
Zip Code: 47907

Liu, Yaowu; Xie, Jun (2018) Powerful Test Based on Conditional Effects for Genome-Wide Screening. Ann Appl Stat 12:567-585
Chen, Donglai; Liu, Chuanhai; Xie, Jun (2016) Multi-locus Test and Correction for Confounding Effects in Genome-Wide Association Studies. Int J Biostat 12:
Zhu, Jingyi; Xie, Jun (2015) Nonparametric Variable Selection for Predictive Models and Subpopulations in Clinical Trials. J Biopharm Stat 25:781-94