While accurate annotations of protein-coding regions in the human genome have been available for many years, the annotation and interpretation of regulatory sequences have lagged far behind. This is because, in contrast to protein-coding sequences, the "rules" that govern links from genome sequence to regulatory function are fuzzy, complex, and highly context-specific. Our limited understanding of regulatory regions presents a fundamental challenge for the identification and interpretation of disease variation, especially in the context of personal genome interpretation. Work from ENCODE and other groups has started to close this gap through experimental work, including high-resolution maps of regulatory sites in a variety of cell types, and through modeling of the cell-type-specific mappings from genome sequence to regulatory function. In our funded project so far, we have developed new computational tools to understand gene regulation and how it may be affected by genetic variation, as well as new methods for high-throughput validation. In the Supplement year, we propose to extend this work with additional projects in this area, including work on zinc finger proteins; connections between genetic variation, RNA expression, and GWAS; and, finally, high-throughput CRISPR-based validation experiments.

Public Health Relevance

The purpose of this project is to develop powerful new computational methods to understand and predict the identity and function of gene regulatory sequences in diverse cell types. We will use these new methods to help us interpret common and rare genetic variation, and to identify variants that may contribute to disease. Outputs from the project will include new methods, software, and functional validation data.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 3U01HG009431-04S1
Application #: 10241018
Study Section:
Program Officer: Morris, Stephanie A
Project Start: 2017-02-01
Project End: 2022-01-31
Budget Start: 2021-02-01
Budget End: 2022-01-31
Support Year: 4
Fiscal Year: 2021
Total Cost:
Indirect Cost:

Name: Stanford University
Department: Genetics
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Liu, Boxiang; Pjanic, Milos; Wang, Ting et al. (2018) Genetic Regulatory Mechanisms of Smooth Muscle Cells Map to Coronary Artery Disease Risk Loci. Am J Hum Genet 103:377-388
Li, Yang I; Knowles, David A; Humphrey, Jack et al. (2018) Annotation-free quantification of RNA splicing using LeafCutter. Nat Genet 50:151-158
Yamamoto, Ryo; Wilkinson, Adam C; Ooehara, Jun et al. (2018) Large-Scale Clonal Analysis Resolves Aging of the Mouse Hematopoietic Stem Cell Compartment. Cell Stem Cell 22:600-607.e4
Knowles, David A; Burrows, Courtney K; Blischak, John D et al. (2018) Determining the genetic basis of anthracycline-cardiotoxicity by molecular response QTL mapping in induced cardiomyocytes. Elife 7:
Ursu, Oana; Boley, Nathan; Taranova, Maryna et al. (2018) GenomeDISCO: a concordance score for chromosome conformation capture experiments using random walks on contact map graphs. Bioinformatics 34:2701-2707
Harpak, Arbel; Lan, Xun; Gao, Ziyue et al. (2017) Frequent nonallelic gene conversion on the human lineage and its effect on the divergence of gene duplicates. Proc Natl Acad Sci U S A 114:12779-12784