Over the past decade, data from genome-wide association studies (GWAS) have been used to analyze the heritability of human diseases and complex traits, improving understanding of their genetic architecture, including the genetic variants associated with disease and their effect sizes. Heritability enrichment analyses have revealed underlying biological mechanisms, such as regulatory elements involved in disease pathways. This research proposes statistical methods and analyses to formally assess the contribution of biological pathways, gene networks, and re-prioritized functional annotations to the genetic architecture of 94 human diseases and complex traits. A better understanding of disease architecture will improve genome-wide association and fine-mapping studies and provide robust inference of disease etiology and mechanisms.
The first aim is to determine the contribution of disease-associated gene pathways to disease heritability. It is well known that a human disease is rarely caused by an abnormality in a single gene. Large-scale pathway analysis will elucidate disease pathways by constructing functional annotations and applying an existing polygenic method to partition heritability. Furthermore, pathway analyses of candidate genes identified from whole-exome sequencing (WES) data will inform new biological inference. The resulting set of enriched pathway-trait pairs will support independent validation of previously reported associations and the identification of novel associations enriched for trait heritability.
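As a minimal sketch of the quantity behind a heritability enrichment analysis, the snippet below computes enrichment for one annotation as the proportion of heritability it explains divided by the proportion of SNPs it contains. This is the commonly used definition, not necessarily the exact estimator of the polygenic method referred to above; the function name and all numbers are hypothetical and chosen only for illustration.

```python
def heritability_enrichment(h2_annotation, h2_total, snps_annotation, snps_total):
    """Enrichment = (share of heritability in annotation) / (share of SNPs in annotation).

    Inputs are assumed to be already-estimated quantities; this function does not
    estimate heritability itself.
    """
    if h2_total <= 0 or snps_total <= 0 or snps_annotation <= 0:
        raise ValueError("h2_total, snps_total and snps_annotation must be positive")
    prop_h2 = h2_annotation / h2_total        # proportion of heritability explained
    prop_snps = snps_annotation / snps_total  # proportion of SNPs in the annotation
    return prop_h2 / prop_snps


# Hypothetical example: a pathway annotation covering 5% of SNPs
# that explains 20% of the estimated SNP heritability.
example = heritability_enrichment(
    h2_annotation=0.06, h2_total=0.30,
    snps_annotation=50_000, snps_total=1_000_000,
)
print(f"enrichment is approximately {example:.2f}")
```

A value above 1 indicates that the annotation explains more heritability than expected from its size alone; in practice the estimate would be reported with a standard error and a significance test.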
The second aim is to determine the contribution of gene networks to disease heritability across 94 diseases, focusing on hub genes defined by a broad set of network connectivity metrics. This question has been only partially answered, in work limited to a few network connectivity metrics across 42 diseases. Comprehensive analysis of gene sets informed by published gene networks will open a new direction of integrating gene networks to infer biological mechanisms of human diseases. In addition, a visualization method for large-scale gene networks will greatly enhance research translation, serving as a useful resource for studying gene-gene interactions relevant to potential drugs and therapeutic targets.
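To illustrate what a hub gene set could look like, the sketch below ranks genes by degree (number of distinct network neighbors) and keeps the top fraction. Degree is only one possible connectivity metric, and the text above does not say which metrics or cutoffs are used, so the function names, the 20% cutoff, and the toy edge list are all assumptions for illustration.

```python
from collections import defaultdict


def degree_by_gene(edges):
    """Count distinct neighbors per gene in an undirected network."""
    neighbors = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # ignore self-loops
        neighbors[gene_a].add(gene_b)
        neighbors[gene_b].add(gene_a)
    return {gene: len(adjacent) for gene, adjacent in neighbors.items()}


def top_hub_genes(edges, fraction=0.1):
    """Return the top `fraction` of genes by degree (at least one gene if any exist)."""
    degree = degree_by_gene(edges)
    # Sort by descending degree, breaking ties alphabetically for reproducibility.
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    n_hubs = max(1, round(len(ranked) * fraction))
    return ranked[:n_hubs]


# Toy network with hypothetical gene labels.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
hubs = top_hub_genes(edges, fraction=0.2)
print(hubs)
```

A gene set produced this way could then serve as a binary annotation in a heritability enrichment analysis; other metrics, such as betweenness or closeness centrality, would slot in by replacing the ranking step.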
The third aim is to de-noise and re-prioritize Mendelian annotations to infer candidate genes. Despite considerable progress in prioritizing rare variants that impact Mendelian diseases, little is known about the utility of these Mendelian pathogenicity scores for common disease. I will conduct heritability enrichment analyses to assess how informative Mendelian disease pathogenicity scores are for common disease. To further improve their informativeness, I propose a machine learning framework for variant reprioritization. Ultimately, this project will produce (1) a detailed understanding of the contribution of disease-associated biological pathways to disease heritability, (2) a set of functional annotations informed by gene networks that are significantly enriched for disease heritability, and (3) a new method to prioritize variants and identify candidate genes based on diverse genomic features.

Public Health Relevance

Understanding human genetic variation, from single-nucleotide polymorphisms (SNPs) to mutations, offers great potential for uncovering underlying disease mechanisms by identifying new disease-associated genes, and for discovering potential drug targets and therapeutics. The proposed studies aim to understand the role of this genetic variation in trait heritability, leveraging disease-associated biological pathways, gene networks, and functional annotations. Together with a new method to prioritize variants and genes, they will quantify the contribution of gene pathways, gene sets informed by gene networks, and Mendelian pathogenicity scores to disease heritability.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Predoctoral Individual National Research Service Award (F31)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Gatlin, Tina L
Massachusetts Institute of Technology
Engineering (All Types)
Biomed Engr/Col Engr/Engr Sta
United States