The focus of this application is the development and validation of new computational approaches to identify complex interactions among genetic and environmental factors (features) which could be used to help identify individuals at high risk for a specific disease or dysfunction, and provide novel insights into the pathophysiology of the conditions in question.
The Specific Aims of the application are: (1) to adapt a variety of statistical machine learning methods to the analysis of simulated high-density genome scan and environmental exposure data, and to evaluate their ability to identify SNPs and environmental factors that are jointly predictive of a binary trait; (2) to apply the described feature selection and model building techniques to the genome-wide SNP genotype data collected from two NHLBI-funded genome-wide association studies: (a) the SNPs and Atherosclerosis (SEA) study, predicting premature atherosclerosis, and (b) the Cholesterol and Pharmacogenetics of Statins (CAPS) Study, predicting LDL cholesterol; (3) to develop a study-specific, publicly accessible website to help disseminate the methods and results of the project; and (4) to support the NIH-wide Genes and Environment Initiative (GEI). This proposal represents a unique collaboration focused on developing new methods to more effectively identify interacting genetic and environmental factors that account for variation in risk for common cardiovascular and other disease phenotypes. If the risk is determined in part by a gene-environment interaction, the preventive intervention could include altering the environmental exposure. Furthermore, determining specific genetic and/or environmental factors that jointly influence risk may reveal new biologic pathways that would be appropriate targets for novel therapeutic interventions. Together, improved risk stratification and new pathophysiologic insights would be expected to reduce the burden of disease and accelerate the realization of true personalized medicine.
Relevance of this research to public health: This project aims to develop new approaches to identify the relationship between genetic and environmental factors, which could then be used to identify people at high risk for a disease. Determining the specific genetic and/or environmental factors that influence a person's risk of disease may help doctors reduce that risk and may reveal new treatments.
Yuan, Xiguo; Miller, David J; Zhang, Junying et al. (2012) An overview of population genetic data simulation. J Comput Biol 19:42-54
Guy, Richard T; Santago, Peter; Langefeld, Carl D (2012) Bootstrap aggregating of alternating decision trees to detect sets of SNPs that associate with disease. Genet Epidemiol 36:99-106
Chen, Li; Yu, Guoqiang; Langefeld, Carl D et al. (2011) Comparative analysis of methods for detecting interacting loci. BMC Genomics 12:344
Yuan, Xiguo; Zhang, Junying; Wang, Yue (2011) Simulating linkage disequilibrium structures in a human population for SNP association studies. Biochem Genet 49:395-409
Yuan, Xiguo; Zhang, Junying; Wang, Yue (2010) Probability theory-based SNP association study method for identifying susceptibility loci and genetic disease models in human case-control data. IEEE Trans Nanobioscience 9:232-41