Time to onset of chronic diseases such as cancer, cardiovascular disease, and diabetes is expected to be influenced by multiple gene-gene interactions that add to the complexity of the genotype-phenotype mapping. Unfortunately, parametric statistical methods such as Cox regression lack sufficient power to detect high-order gene-gene interactions because of the sparseness of the data. Machine learning methods offer a more powerful alternative but rely on computationally intensive search methods to identify the top models. We propose to develop a powerful and computationally efficient bioinformatics strategy that combines a machine learning algorithm with Cox regression to identify gene-gene and gene-environment interaction models associated with time to onset of chronic disease. Specifically, we first propose to develop a novel Robust Survival Multifactor Dimensionality Reduction method (RS-MDR) for detecting gene-gene interactions in rare variants that influence time to onset of human disease (AIM 1). The power of RS-MDR will be evaluated by comparing it with existing methods in simulation studies. We then propose to change the representation space of the gene-gene interaction models using RS-MDR's constructive induction method and to apply L1-penalized Cox regression to identify a set of interaction models that can predict patients' survival probability (AIM 2). We hypothesize that RS-MDR can effectively identify high-order interaction models and that the combined approach provides a powerful and computationally efficient way to select a set of interaction models. We will use extensive simulations derived from genome-wide association study (GWAS) data to thoroughly evaluate this hypothesis. Next, we will apply the new combined method to detect and characterize gene-gene and gene-environment interactions in GWAS data from large population-based studies of lung cancer and rheumatoid arthritis (AIM 3).
Results from the real data analysis will be used to refine the method. Finally, we will distribute the proposed method as part of an open-source R software package (AIM 4). We anticipate that the proposed method will combine the strengths of parametric and non-parametric methods and enable detection of interaction models that jointly affect time to onset of chronic diseases. This is important because time to onset has more variation than case-control status and may be more clinically relevant. Furthermore, studies of genetic factors predicting time to onset have not been pursued aggressively using GWAS, despite the relevance of this information for the discovery of high-risk variants such as mutations in BRCA1.
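
As a rough illustration of the constructive induction idea mentioned in the abstract, the sketch below collapses a two-SNP genotype grid into a single binary high-risk/low-risk attribute. It is not the RS-MDR method, whose details are not given here. The pooling rule (a genotype cell is "high risk" when its event rate per unit of follow-up time exceeds the overall rate), the function name `constructive_induction`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def constructive_induction(snp_a, snp_b, times, events):
    """Pool two-locus genotype cells into one binary risk attribute.

    Assumed rule (hypothetical, not taken from the abstract): a cell is
    'high risk' if its events per unit of follow-up time exceed the
    overall rate across all subjects.
    """
    cell_events = defaultdict(int)
    cell_time = defaultdict(float)
    for a, b, t, e in zip(snp_a, snp_b, times, events):
        cell_events[(a, b)] += e
        cell_time[(a, b)] += t

    overall_rate = sum(events) / sum(times)
    high_risk = {
        cell
        for cell in cell_time
        if cell_time[cell] > 0
        and cell_events[cell] / cell_time[cell] > overall_rate
    }
    # New one-dimensional attribute: 1 = high-risk cell, 0 = low-risk cell
    risk = [1 if (a, b) in high_risk else 0 for a, b in zip(snp_a, snp_b)]
    return risk, high_risk


# Toy data: genotypes coded 0/1/2, follow-up time, event indicator (1 = onset)
snp1 = [0, 0, 1, 1, 2, 2, 0, 1]
snp2 = [0, 1, 0, 1, 0, 1, 0, 1]
time = [5.0, 8.0, 2.0, 1.0, 9.0, 3.0, 6.0, 2.0]
event = [0, 0, 1, 1, 0, 1, 0, 1]

risk, high_risk_cells = constructive_induction(snp1, snp2, time, event)
print(risk)
print(sorted(high_risk_cells))
```

Under these assumptions, the resulting binary attribute for each candidate SNP pair could then serve as a covariate in an L1-penalized Cox regression, which would select a sparse set of interaction attributes.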

Public Health Relevance

This project will provide a general method for genome data analysis that can potentially be applied to any specific disease. We focus the application of this method on an investigation into the role of genetic polymorphisms in lung cancer and rheumatoid arthritis. This will increase our knowledge of the basic biology and may lead to screening and targeted therapy.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
1R01LM012012-01A1
Application #
8962812
Study Section
Special Emphasis Panel (ZLM1)
Program Officer
Ye, Jane
Project Start
2015-07-01
Project End
2018-06-30
Budget Start
2015-07-01
Budget End
2016-06-30
Support Year
1
Fiscal Year
2015
Total Cost
Indirect Cost
Name
Dartmouth College
Department
Family Medicine
Type
Schools of Medicine
DUNS #
041027822
City
Hanover
State
NH
Country
United States
Zip Code

Publications
Ji, Xuemei; Bossé, Yohan; Landi, Maria Teresa et al. (2018) Identification of susceptibility pathways for the role of chromosome 15q25.1 in modifying lung cancer risk. Nat Commun 9:3221
Gauderman, W James; Mukherjee, Bhramar; Aschard, Hugues et al. (2017) Update on the State of the Science for Analytical Methods for Gene-Environment Interactions. Am J Epidemiol 186:762-770
Demidenko, Eugene (2017) Exact and Approximate Statistical Inference for Nonlinear Regression and the Estimating Equation Approach. Scand Stat Theory Appl 44:636-665
Anderson, Allison P; Babu, Gautam; Swan, Jacob G et al. (2017) Ocular changes over 60 min in supine and prone postures. J Appl Physiol (1985) 123:415-423
Rodgers, Kyla R; Gui, Jiang; Dinulos, Mary Beth P et al. (2017) Ehlers-Danlos syndrome hypermobility type is associated with rheumatic diseases. Sci Rep 7:39636
Moyer, Benjamin J; Rojas, Itzel Y; Kerley-Hamilton, Joanna S et al. (2017) Obesity and fatty liver are prevented by inhibition of the aryl hydrocarbon receptor in both female and male mice. Nutr Res 44:38-50
Lee, Jai Woo; Punshon, Tracy; Moen, Erika L et al. (2017) Penalized estimation of sparse concentration matrices based on prior knowledge with applications to placenta elemental data. Comput Biol Chem 71:219-223
Yan, Shaofeng; Holderness, Britt M; Li, Zhongze et al. (2016) Epithelial-Mesenchymal Expression Phenotype of Primary Melanoma and Matched Metastases and Relationship with Overall Survival. Anticancer Res 36:6449-6456
Chou, Richard C; Kane, Michael; Ghimire, Sanjay et al. (2016) Treatment for Rheumatoid Arthritis and Risk of Alzheimer's Disease: A Nested Case-Control Analysis. CNS Drugs 30:1111-1120
Maro, Isaac I; Fellows, Abigail M; Clavier, Odile H et al. (2016) Auditory Impairments in HIV-Infected Children. Ear Hear 37:443-51

Showing the most recent 10 out of 11 publications