To understand the genetic basis of human diseases, over 1,800 genome-wide association studies (GWAS) have been conducted to identify genetic variants associated with common diseases and disease-related traits. However, since ~93% of GWAS variants are found in noncoding regions of the human genome, pinpointing causal variants or even genes can be difficult. Often, even the relevant pathways, tissues, and cell types are unknown, which makes it harder still to predict the functional consequences of identified variants and to set up appropriate experimental systems for validating these predictions. To address these challenges, our lab has recently developed an approach, DEPICT (Data-driven Expression-Prioritized Integration for Complex Traits), that combines data from gene expression, protein-protein interactions, mouse knockout phenotypes, and pathways/gene sets to prioritize important genes, pathways, and tissues/cell types from GWAS results for any disease or trait. While DEPICT performs better than several existing methods that consider only single data types, it does not yet include any epigenetic information. The recent generation of epigenomic maps in many human cell types facilitates the annotation of the noncoding human genome and can be very valuable for GWAS analysis. Therefore, in this project, we will use epigenetic information to complement and improve existing GWAS prioritization approaches such as DEPICT. Specifically, using regulatory element annotations and RNA sequencing datasets from the Roadmap Epigenomics project, we will both develop a new integrative method and modify the original DEPICT implementation to include epigenetic data. We will validate and test both methods on GWAS data from the Genetic Investigation of Anthropometric Traits (GIANT) consortium, which include association results for height, body mass index (BMI), and waist-to-hip ratio adjusted for BMI.
Successful completion of this project will provide an innovative and powerful tool that integrates epigenomic data with other data types to prioritize tissues/cell types, genes, TF motifs, and pathways from GWAS results. The tool will be useful for studying a wide variety of complex traits and diseases and can help to fulfill the potential of GWAS to infer biology and advance biomedicine.

Public Health Relevance

In this project, we will implement and improve computational methods to prioritize important genes, pathways, and cell types for human diseases and traits that have been investigated using a genetic approach called genome-wide association studies. We will use new datasets from the Roadmap Epigenomics project that identify specific parts of the genome as important for turning nearby genes on and off. Successful completion of this project will provide an innovative and powerful tool that incorporates epigenomic data with other types of data and that will be useful for studying a wide variety of human traits and diseases. This tool can help to fulfill the potential of genetic studies to infer the biology of disease and thereby advance biomedicine.

Agency
National Institutes of Health (NIH)
Institute
National Heart, Lung, and Blood Institute (NHLBI)
Type
Predoctoral Individual National Research Service Award (F31)
Project #
5F31HL126581-03
Application #
9211360
Study Section
Special Emphasis Panel (ZRG1-F08-B (20)L)
Program Officer
Huang, Li-Shin
Project Start
2015-02-01
Project End
2018-01-31
Budget Start
2017-02-01
Budget End
2018-01-31
Support Year
3
Fiscal Year
2017
Total Cost
$31,113
Indirect Cost
Name
Harvard Medical School
Department
Biology
Type
Schools of Medicine
DUNS #
047006379
City
Boston
State
MA
Country
United States
Zip Code
02115
Esko, Tõnu; Hirschhorn, Joel N; Feldman, Henry A et al. (2017) Metabolomic profiles as reliable biomarkers of dietary composition. Am J Clin Nutr 105:547-554