Many common diseases arise from complex interactions among multiple genetic and environmental factors. Genome-wide association studies (GWAS) comprehensively compare common genetic variants in affected and control populations to identify variants that are potentially associated with disease. In recent years, GWAS have successfully identified susceptibility genes for many diseases. However, researchers recognize many limitations of GWAS in characterizing the genetic bases of complex diseases, including reduced statistical power due to small sample sizes, the inadequacy of considering individual variants separately when the goal is to capture the interplay between multiple factors, modest success in predicting individual disease risk, and a lack of insight into the biological and functional mechanisms that relate identified variants to the disease. This project aims to enhance GWAS by using protein-protein interaction (PPI) networks as an integrative framework to interpret the outcome of GWAS within a functional context. PPI networks characterize the physical and functional interactions among proteins; thus, they are useful in understanding the functional relationships between multiple genetic factors. This project will facilitate effective use of PPI networks to identify the functional relationships among genetic factors implicated in GWAS by developing efficient computational algorithms that integrate multiple sources of "omic" data. In particular, to enhance the relatively weak association signals captured by GWAS, we will combine association scores of individual proteins to identify interacting groups of proteins that exhibit a stronger association signal when considered together. We will also search for combinations of multiple genetic factors that are associated with disease by confining the search space to known physical and functional interactions.
We will also score identified groups of proteins in terms of their collective differential expression in the disease, with a view to gaining insights into the relationship between genetic differences and dysregulation of gene expression. We will extensively test the proposed algorithms on a variety of diseases, using large case-control datasets obtained from public databases (the Wellcome Trust Case-Control Consortium and the database of Genotypes and Phenotypes), as well as from our collaborators. In particular, we will extend our existing collaborations with the Candidate Gene Association Resource (CARe) project, which includes 40,000 individuals, and validate our algorithms through functional gene association with cardiovascular phenotypes of importance in the CARe project. This research will result in novel computational tools that reliably connect genomic data to function and disease phenotypes. These tools will drive focused and effective mechanistic studies of complex diseases (including clinical studies and studies in model organisms) by our collaborators and the wider biomedical science community, ultimately providing diagnostic and prognostic biomarkers and mechanistic insight to inform clinical studies more comprehensively and effectively than existing GWAS.
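
The idea of combining per-protein association scores over interacting groups can be illustrated with a small sketch. It is not the project's algorithm: it assumes each protein carries a z-score, uses a Stouffer-style aggregate (sum of z-scores divided by the square root of the group size), and grows a connected group greedily from a seed. The function names, the `max_size` parameter, and the toy scores and network are all hypothetical.

```python
import math


def combined_score(z, group):
    """Stouffer-style aggregate: sum of z-scores over sqrt(group size)."""
    return sum(z[p] for p in group) / math.sqrt(len(group))


def grow_group(seed, z, ppi, max_size=5):
    """Greedily grow a connected group around `seed` while the score improves.

    z   : dict mapping protein -> association z-score (assumed input)
    ppi : dict mapping protein -> list of interacting proteins (assumed input)
    """
    group = {seed}
    best = combined_score(z, group)
    while len(group) < max_size:
        # Only proteins adjacent to the current group are candidates,
        # so the search stays confined to known interactions.
        frontier = {
            q for p in group for q in ppi.get(p, ())
            if q not in group and q in z
        }
        if not frontier:
            break
        # sorted() makes tie-breaking deterministic.
        cand = max(sorted(frontier),
                   key=lambda q: combined_score(z, group | {q}))
        score = combined_score(z, group | {cand})
        if score <= best:
            break  # no neighbor strengthens the combined signal
        group.add(cand)
        best = score
    return group, best


# Toy example (invented values): A, B and C are individually modest.
z = {"A": 1.8, "B": 1.6, "C": 1.7, "D": -0.3, "E": 2.5}
ppi = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"],
       "D": ["A", "E"], "E": ["D"]}
group, score = grow_group("A", z, ppi)
print(sorted(group), round(score, 3))
```

In this toy network the group A, B, C reaches a combined score of about 2.94 even though each member scores below 2 on its own. The greedy search never reaches E, because the only path to it passes through the low-scoring D; this is a known weakness of greedy expansion and one reason a real method would need a more careful search and a significance test.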

Public Health Relevance

Genome-wide association studies (GWAS) aim to uncover the genetic bases of diseases by comprehensively screening the human genome to discover genetic variants that are associated with the disease. This project aims to enhance these studies by developing computational algorithms that will facilitate interpretation of such genetic variants in the context of their biological function and relationship to each other. These algorithms will be tested on a wide variety of diseases and also applied to obstructive sleep apnea and related diseases to shed light on the complex relationships between genetic factors that contribute to these diseases.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
1R01LM011247-01A1
Application #
8373161
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2012-08-01
Project End
2016-07-31
Budget Start
2012-08-01
Budget End
2013-07-31
Support Year
1
Fiscal Year
2012
Total Cost
$363,000
Indirect Cost
$115,443
Name
Case Western Reserve University
Department
Engineering (All Types)
Type
Schools of Engineering
DUNS #
077758407
City
Cleveland
State
OH
Country
United States
Zip Code
44106
Maxwell, Sean; Chance, Mark R; Koyutürk, Mehmet (2017) Linearity of network proximity measures: implications for set-based queries and significance testing. Bioinformatics 33:1354-1361
Whiting, Kathleen; Liu, Larry Y; Koyutürk, Mehmet et al. (2017) Network map of adverse health effects among victims of intimate partner violence. Pac Symp Biocomput 22:324-335
Savel, Daniel; LaFramboise, Thomas; Grama, Ananth et al. (2017) Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree. IEEE/ACM Trans Comput Biol Bioinform 14:1378-1388
Stanfield, Zachary; Coşkun, Mustafa; Koyutürk, Mehmet (2017) Drug Response Prediction as a Link Prediction Problem. Sci Rep 7:40321
Cowman, Tyler; Koyutürk, Mehmet (2017) Prioritizing tests of epistasis through hierarchical representation of genomic redundancies. Nucleic Acids Res 45:e131
Gill, Mandev S; Tung Ho, Lam Si; Baele, Guy et al. (2017) A Relaxed Directional Random Walk Model for Phylogenetic Trait Evolution. Syst Biol 66:299-319
Maxwell, Sean; Chance, Mark R; Koyutürk, Mehmet (2016) Linearity of Network Proximity Measures: Implications for Set-based Queries and Significance Testing. Bioinformatics :
Brubaker, Douglas; Liu, Yu; Wang, Junye et al. (2016) Finding lost genes in GWAS via integrative-omics analysis reveals novel sub-networks associated with preterm birth. Hum Mol Genet 25:5254-5264
Ni, Jingchao; Koyutürk, Mehmet; Tong, Hanghang et al. (2016) Disease gene prioritization by integrating tissue-specific molecular networks using a robust multi-network model. BMC Bioinformatics 17:453
Ayati, Marzieh; Koyutürk, Mehmet (2016) PoCos: Population Covering Locus Sets for Risk Assessment in Complex Diseases. PLoS Comput Biol 12:e1005195
