Integrated Genome Discovery at Single Base Pair Resolution

Gifford, David

Abstract

We propose to produce computationally predicted and experimentally improved single-base-pair resolution maps of genome regulatory elements and their higher-level architectures with ENCODE consortium data. To accomplish this goal, we will pursue four Aims:
Aim 1 will discover genome regulatory elements at single base pair resolution by simultaneously modeling ChIP-seq data, DNase-seq data, and genome sequence to discover where regulators bind to the genome along with explanatory DNA sequence motifs;
Aim 2 will use integrative analysis to learn probabilistic models of enhancer grammars that include symbol spacing models;
Aim 3 will develop active learning methods to precisely design synthetic enhancer sequences to construct Enhancer Grammar Activity Models (EGAMs) that explain the consequences of different forms of enhancer grammar on gene regulation, and will also learn regulatory factors that are associated with unlinked motifs;
Aim 4 will discover regulatory networks that describe how chromatin and gene expression states are established based on regulator activity, and relate human disease-associated genomic variation to potential disease mechanisms. The results of our Aims will be validated with both experimental and computational studies.

Public Health Relevance

We will develop and use new methods to understand the language of the genome: the words and sentences of symbols that describe how cells function in both health and disease. Because the language is complicated, we will use new experimental methods to write and test thousands of genomic sentences for function in a dish. Our ultimate goal is to improve human health by understanding how disease-related changes in our genome cause things to go wrong.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01HG007037-02
Application #: 8546274
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M2))
Program Officer: Pazin, Michael J

Project Start: 2012-09-17
Project End: 2015-06-30
Budget Start: 2013-07-01
Budget End: 2014-06-30
Support Year: 2
Fiscal Year: 2013
Total Cost: $609,880
Indirect Cost: $189,106

Institution

Name: Massachusetts Institute of Technology
Department
Type: Organized Research Units
DUNS #: 001425594

City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Related projects


NIH 2015 U01 HG	Integrated Genome Discovery at Single Base Pair Resolution Gifford, David K. / Massachusetts Institute of Technology	$466,827
NIH 2014 U01 HG	Integrated Genome Discovery at Single Base Pair Resolution Gifford, David K. / Massachusetts Institute of Technology
NIH 2013 U01 HG	Integrated Genome Discovery at Single Base Pair Resolution Gifford, David K. / Massachusetts Institute of Technology	$609,880
NIH 2012 U01 HG	Integrated Genome Discovery at Single Base Pair Resolution Gifford, David K. / Massachusetts Institute of Technology	$483,798

Publications

Guo, Yuchun; Tian, Kevin; Zeng, Haoyang et al. (2018) A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction. Genome Res 28:891-900

Guo, Yuchun; Gifford, David K (2017) Modular combinatorial binding among human trans-acting factors reveals direct and indirect factor binding. BMC Genomics 18:45

Zeng, Haoyang; Edwards, Matthew D; Guo, Yuchun et al. (2017) Accurate eQTL prioritization with an ensemble-based framework. Hum Mutat 38:1259-1265

Rajagopal, Nisha; Srinivasan, Sharanya; Kooshesh, Kameron et al. (2016) High-throughput mapping of regulatory DNA. Nat Biotechnol 34:167-74

Gymrek, Melissa; Willems, Thomas; Guilmatre, Audrey et al. (2016) Abundant contribution of short tandem repeats to gene expression variation in humans. Nat Genet 48:22-9

Zeng, Haoyang; Edwards, Matthew D; Liu, Ge et al. (2016) Convolutional neural network architectures for predicting DNA-protein binding. Bioinformatics 32:i121-i127

Arbab, Mandana; Sherwood, Richard I (2016) Self-Cloning CRISPR. Curr Protoc Stem Cell Biol 38:5B.5.1-5B.5.16

Hashimoto, Tatsunori; Sherwood, Richard I; Kang, Daniel D et al. (2016) A synergistic DNA logic predicts genome-wide chromatin accessibility. Genome Res 26:1430-1440

Zeng, Haoyang; Hashimoto, Tatsunori; Kang, Daniel D et al. (2016) GERV: a statistical method for generative evaluation of regulatory variants for transcription factor binding. Bioinformatics 32:490-6

Barkal, Amira A; Srinivasan, Sharanya; Hashimoto, Tatsunori et al. (2016) Cas9 Functionally Opens Chromatin. PLoS One 11:e0152683

Showing the most recent 10 out of 18 publications
