Despite the successes of genome-wide association studies (GWAS), important challenges still limit their impact on human biology and medicine, especially for non-coding variants, which remain poorly understood. In this proposal, we exploit recent advances (many pioneered by our group) to overcome these challenges and gain a systematic understanding of the role of non-coding variants in human disease and complex traits. First, we develop new statistical methods that use high-resolution regulatory annotations to predict disease-relevant tissues, chromatin states, and regulatory motifs, and to prioritize non-coding variants more likely to have regulatory effects within regions of genetic association, using epigenomic state information, comparative genomic information, and regulatory motif analysis (Aim 1). Second, we develop new Bayesian methods for linking regulatory regions to their upstream regulators and downstream target genes by integrating genetic information across all associated regions in the context of regulatory networks that link regulators and regulatory regions using their correlated activity, regulatory motifs, and expression quantitative trait locus (eQTL) information (Aim 2). Third, we validate our methods and predictions using massively parallel enhancer assays to test the effects of a large number of regulatory variants in isolation; using genome editing technologies to test the effects of regulatory variants in their endogenous context; and using cellular phenotypes and animal models to test the physiological effects of regulatory variants at the cellular and organismal levels (Aim 3). We use the results to refine our computational methods and models. Although our experimental validations are performed for only a small number of traits and cell types that are amenable to such studies, our methods are general and will be applied to all genetic studies available through ongoing collaborations and public catalogs.

Public Health Relevance

Most human variants associated with disease are non-coding and largely uncharacterized, making it a high priority to understand the cell types in which they act and their mechanisms of action. To address this challenge, we propose to develop methods to systematically study the regulatory impact of genetic variation by integrating genetic information from genome-wide association studies with genome annotations of regulatory elements across diverse tissues and cell types, regulatory motifs, and cellular circuits. We will systematically validate our predictions using next-generation tools for massively parallel assays and genome editing, in order to test the effect of non-coding variants on regulatory activity, gene expression, and molecular phenotypes associated with diabetes, heart disease, cancer, and neuropsychiatric disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
7R01HG008155-04
Application #
9616350
Study Section
Special Emphasis Panel (ZHG1)
Program Officer
Brooks, Lisa
Project Start
2017-11-01
Project End
2019-07-31
Budget Start
2017-11-01
Budget End
2019-07-31
Support Year
4
Fiscal Year
2017
Name
Massachusetts Institute of Technology
DUNS #
001425594
City
Cambridge
State
MA
Country
United States
Zip Code
02142
Onuchic, Vitor; Lurie, Eugene; Carrero, Ivenise et al. (2018) Allele-specific epigenome maps reveal sequence-dependent stochastic switching at regulatory loci. Science 361:
Wang, Yang; Li, Yue; Yue, Minghui et al. (2018) N6-methyladenosine RNA modification regulates embryonic neural stem cell self-renewal through histone modifications. Nat Neurosci 21:195-206
Ernst, Jason; Melnikov, Alexandre; Zhang, Xiaolan et al. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34:1180-1190
Smith, J Gustav; Felix, Janine F; Morrison, Alanna C et al. (2016) Discovery of Genetic Variation on Chromosome 5q22 Associated with Mortality in Heart Failure. PLoS Genet 12:e1006034
Wang, Xinchen; Tucker, Nathan R; Rizki, Gizem et al. (2016) Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures. Elife 5:
Marbach, Daniel; Lamparter, David; Quon, Gerald et al. (2016) Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat Methods 13:366-70
Ward, Lucas D; Kellis, Manolis (2016) HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 44:D877-81
Barrera, Luis A; Vedenko, Anastasia; Kurland, Jesse V et al. (2016) Survey of variation in human transcription factors reveals prevalent DNA binding changes. Science 351:1450-1454
Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399
Claussnitzer, Melina; Dankel, Simon N; Kim, Kyoung-Han et al. (2015) FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med 373:895-907