Many common diseases arise from complex interactions among multiple genetic and environmental factors. Genome-wide association studies (GWAS) comprehensively compare common genetic variants in affected and control populations to identify variants that are potentially associated with disease. In recent years, GWAS have successfully identified susceptibility genes for many diseases. However, researchers recognize many limitations of GWAS in characterizing the genetic bases of complex diseases, including reduced statistical power due to small sample size, the inadequacy of considering individual variants separately when the goal is to capture the interplay between multiple factors, modest success in predicting individual risk for disease, and a lack of insight into the biological and functional mechanisms that relate identified variants to the disease. This project aims to enhance GWAS by using protein-protein interaction (PPI) networks as an integrative framework to interpret the outcome of GWAS within a functional context. PPI networks characterize the physical and functional interactions among proteins; thus, they are useful in understanding the functional relationships between multiple genetic factors. This project will facilitate effective use of PPI networks to identify the functional relationships among genetic factors implicated in GWAS by developing efficient computational algorithms that integrate multiple sources of "omic" data. In particular, to enhance the relatively weak association signals captured by GWAS, we will combine association scores of individual proteins to identify interacting groups of proteins that exhibit a stronger association signal when considered together. We will also search for combinations of multiple genetic factors that are associated with disease by confining the search space to known physical and functional interactions.
We will also score identified groups of proteins in terms of their collective differential expression in the disease, with a view to gaining insights into the relationship between genetic differences and dysregulation of gene expression. We will extensively test the proposed algorithms on a variety of diseases, using large case-control datasets obtained from public databases (the Wellcome Trust Case-Control Consortium and the database of Genotypes and Phenotypes), as well as from our collaborators. In particular, we will extend our existing collaborations with the Candidate Gene Association Resource (CARe) project, which includes 40,000 individuals, and validate our algorithms through functional gene association with cardiovascular phenotypes of importance in the CARe project. This research will result in novel computational tools that reliably connect genomic data to function and disease phenotypes. These tools will drive focused and effective mechanistic studies of complex diseases (including clinical studies and studies in model organisms) by our collaborators and the wider biomedical science community, ultimately providing diagnostic and prognostic biomarkers and mechanistic insight to inform clinical studies more comprehensively and effectively than existing GWAS.
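The idea of combining per-protein association scores over interacting proteins can be illustrated with a small sketch. The abstract does not specify the scoring function or the search strategy, so everything below is an assumption made for illustration: per-protein scores are treated as z-scores, they are aggregated with a Stouffer-style sum divided by the square root of the group size, and a connected group is grown greedily from a seed protein. The function names, the toy network, and the toy scores are all hypothetical.

```python
import math

def combined_score(z_scores):
    """Stouffer-style aggregate (assumed here): sum of z-scores over sqrt(n)."""
    return sum(z_scores) / math.sqrt(len(z_scores))

def greedy_subnetwork(seed, neighbors, z):
    """Grow a connected group of proteins from `seed`.

    At each step, add the neighboring protein that most increases the
    combined score; stop when no neighbor increases it.
    """
    module = {seed}
    best = combined_score([z[seed]])
    while True:
        # Proteins adjacent to the current group that have a score.
        frontier = {
            n
            for p in module
            for n in neighbors.get(p, ())
            if n not in module and n in z
        }
        candidate, candidate_score = None, best
        for n in sorted(frontier):
            s = combined_score([z[p] for p in module] + [z[n]])
            if s > candidate_score:
                candidate, candidate_score = n, s
        if candidate is None:
            return module, best
        module.add(candidate)
        best = candidate_score

# Hypothetical toy PPI network (adjacency lists) and per-protein z-scores.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
z = {"A": 2.0, "B": 1.8, "C": -0.5, "D": 1.5}

module, score = greedy_subnetwork("A", neighbors, z)
print(sorted(module), round(score, 3))
```

In this toy example no single protein scores above 2.0, yet the connected group A, B, D reaches a combined score of about 3.06, while the weakly associated neighbor C is left out. Restricting candidates to network neighbors is also what keeps the search space small relative to testing arbitrary combinations of proteins.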

Public Health Relevance

Genome-wide association studies (GWAS) aim to uncover the genetic bases of diseases by comprehensively screening the human genome to discover genetic variants that are associated with the disease. This project aims to enhance these studies by developing computational algorithms that will facilitate interpretation of such genetic variants in the context of their biological function and relationship to each other. These algorithms will be tested on a wide variety of diseases and also applied to obstructive sleep apnea and related diseases to shed light on the complex relationships between genetic factors that contribute to these diseases.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Case Western Reserve University
Engineering (All Types)
Schools of Engineering
United States
Liu, Yu; Koyutürk, Mehmet; Maxwell, Sean et al. (2014) Discovery of common sequences absent in the human reference genome using pooled samples from next generation sequencing. BMC Genomics 15:685
Liu, Yu; Chance, Mark R (2013) Pathway analyses and understanding disease associations. Curr Genet Med Rep 1: