Susceptibility to complex diseases is determined by the coordinated function of multiple genetic variants and environmental factors interacting in a composite and potentially nonlinear manner. The genome-wide views provided by advanced single nucleotide polymorphism (SNP) arrays present an opportunity to discover previously unrecognized genomic patterns, and the ability to recognize such complex features of genetic architecture has important implications for the use of genome-wide association studies to discover genetic determinants of health and disease from massive genomic data. The focus of this application is the development and validation of effective computational statistics approaches to detect complex interaction effects of multi-locus SNPs, which could be useful for classification and prediction of disease or dysfunction and provide novel insights into the pathogenesis of complex phenotypes. Based on very promising preliminary studies, the specific aims of this R21 application are: (1) to refine and evaluate the recently developed significant conditional association (SCA) criterion and heuristic combinatorial interaction growing (HCIG) search strategy (specifically designed to discover complex interaction effects of multi-locus, jointly predictive SNPs), to test and validate them on realistic simulations based on real SNP data, and to compare them with a panel of the most relevant existing methods; and (2) to apply the SCA-HCIG method to the real SNP data of the NIAMS-funded FMS cohort in relation to metabolic syndrome, etc., and to develop SNP-marker-based classification/prediction models, assessed by their prediction power and the initial biological plausibility of the implicated SNP subsets.
This proposal represents a unique cross-disciplinary collaboration focusing on the development of new analytical methods to more effectively identify interacting susceptibility SNPs and environmental factors that can be used to determine individual risk for a specific disease and to estimate prognosis and response to treatment. The results could also suggest novel preventive interventions and therapeutic targets, reduce the burden of disease, and accelerate the realization of truly personalized medicine.

Public Health Relevance

Wang, Yue (Joseph), Ph.D.
Machine learning to identify predictive SNPs and complex interaction effects
Project Narrative
Susceptibility to complex diseases is determined by the coordinated function of multiple genetic and environmental factors. The identified interacting susceptibility SNPs and risk factors can be used to determine individual risk for a specific disease and to estimate prognosis and response to treatment. The results could also suggest novel preventive interventions and therapeutic targets, reduce the burden of disease, and accelerate the realization of truly personalized medicine.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21GM085665-02
Application #
7922025
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Lyster, Peter
Project Start
2009-09-01
Project End
2012-08-31
Budget Start
2010-09-01
Budget End
2012-08-31
Support Year
2
Fiscal Year
2010
Total Cost
$226,422
Indirect Cost
Name
Virginia Polytechnic Institute and State University
Department
Engineering (All Types)
Type
Schools of Engineering
DUNS #
003137015
City
Blacksburg
State
VA
Country
United States
Zip Code
24061
Yuan, Xiguo; Yu, Guoqiang; Hou, Xuchu et al. (2012) Genome-wide identification of significant aberrations in cancer genome. BMC Genomics 13:342
de Assis, Sonia; Warri, Anni; Cruz, M Idalia et al. (2012) High-fat or ethinyl-oestradiol intake during pregnancy increases mammary cancer risk in several generations of offspring. Nat Commun 3:1053
Yuan, Xiguo; Miller, David J; Zhang, Junying et al. (2012) An overview of population genetic data simulation. J Comput Biol 19:42-54
Yuan, Xiguo; Zhang, Junying; Zhang, Shengli et al. (2012) Comparative analysis of methods for identifying recurrent copy number alterations in cancer. PLoS One 7:e52516
Yuan, Xiguo; Zhang, Junying; Yang, Liying et al. (2012) TAGCNA: a method to identify significant consensus events of copy number alterations in cancer. PLoS One 7:e41082
Zhang, Bai; Tian, Ye; Jin, Lu et al. (2011) DDN: a caBIG® analytical tool for differential network analysis. Bioinformatics 27:1036-8
Chen, Li; Yu, Guoqiang; Langefeld, Carl D et al. (2011) Comparative analysis of methods for detecting interacting loci. BMC Genomics 12:344
Yuan, Xiguo; Zhang, Junying; Wang, Yue (2011) Simulating linkage disequilibrium structures in a human population for SNP association studies. Biochem Genet 49:395-409
Yuan, Xiguo; Zhang, Junying; Wang, Yue (2010) Probability theory-based SNP association study method for identifying susceptibility loci and genetic disease models in human case-control data. IEEE Trans Nanobioscience 9:232-41
Miller, David J; Zhang, Yanxin; Yu, Guoqiang et al. (2009) An algorithm for learning maximum entropy probability models of disease risk that efficiently searches and sparingly encodes multilocus genomic interactions. Bioinformatics 25:2478-85

Showing the most recent 10 out of 11 publications