The goal of the ENCODE Project is to provide the biomedical community with a complete and biologically interpretable annotation of the human genome. This means discovering and mapping all parts of all genes, including exons, introns, promoters and cis-regulatory sequences. In previous phases of the ENCODE Project, the applicants of this proposal developed and applied robust, high-throughput, genome-wide methods for determining transcription factor occupancy, assessing DNA methylation, identifying RNA transcripts, and experimentally testing candidate regulatory elements and mutations. The experience from those phases, the resulting technology and analysis platforms, and the applicants' existing, highly productive infrastructure together form the basis of this response to NHGRI's RFA-HG-11-024 ("Expanding the Encyclopedia of DNA Elements (ENCODE) in the Human and Model Organisms"). This application presents an ambitious proposal to expand the biological dimensions of ENCODE by measuring occupancy for essentially all transcription factors and by producing transcriptomes from hundreds of very specific cell types, and even single cells. The specific plan is to: 1) determine genome-wide occupancy for all transcription factors and major cofactors with high resolution in two or more cell types; 2) map and quantify all messenger RNA transcripts, microRNAs and other non-ribosomal RNAs in more than 300 well-defined, uncultured cell types; 3) map DNA methylation state genome-wide at nucleotide resolution in more than 300 cell types; and 4) apply a high-throughput transient transfection assay system to test the impact of ~2,000 candidate regulatory elements on gene regulation. All experimental work in this project will be evaluated by appropriate quality metrics, and after quality control, all data will be rapidly deposited in public, freely accessible genome databases.
In addition, computational analyses, including evaluation of comparative and population genomics data, will be integrated with the experimental production to help ensure quality and to capture information in forms useful to biologists, genomicists, and medical researchers. Completion of these Specific Aims will enable biomedical researchers to better and more rapidly understand the consequences of mutations in genomic disorders, including cancer, cardiovascular disease, and almost all common diseases, and therefore to more fully realize the potential of genomics to impact human health.

Public Health Relevance

Interpreting the human genome sequence remains a daunting challenge. We cannot yet recognize genes and their regulatory elements based solely on primary DNA sequence. This ambitious ENCODE Project proposal leverages new advances in genomic technologies to radically improve the depth, breadth, and analysis of functional element annotations of the human genome, accelerating the impact on human health by tying the effects of mutations in functional elements to the misregulation of gene expression in disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 3U54HG006998-02S1
Application #: 8709029
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M1))
Program Officer: Feingold, Elise A
Project Start: 2012-09-21
Project End: 2016-07-31
Budget Start: 2013-08-01
Budget End: 2014-07-31
Support Year: 2
Fiscal Year: 2013
Total Cost: $360,178
Indirect Cost: $123,219
Name: Hudson-Alpha Institute for Biotechnology
Department:
Type:
DUNS #: 780007410
City: Huntsville
State: AL
Country: United States
Zip Code: 35806
Gasper, William C; Marinov, Georgi K; Pauli-Behn, Florencia et al. (2014) Fully automated high-throughput chromatin immunoprecipitation for ChIP-seq: identifying ChIP-quality p300 monoclonal antibodies. Sci Rep 4:5152
Conesa, Ana; Mortazavi, Ali (2014) The common ground of genomics and systems biology. BMC Syst Biol 8 Suppl 2:S1
Kellis, Manolis; Wold, Barbara; Snyder, Michael P et al. (2014) Defining functional DNA elements in the human genome. Proc Natl Acad Sci U S A 111:6131-8
Blanc, Valerie; Park, Eddie; Schaefer, Sabine et al. (2014) Genome-wide identification and functional analysis of Apobec-1-mediated C-to-U RNA editing in mouse small intestine and liver. Genome Biol 15:R79
Marinov, Georgi K; Williams, Brian A; McCue, Ken et al. (2014) From single-cell to cell-pool transcriptomes: stochasticity in gene expression and RNA splicing. Genome Res 24:496-510
Marinov, Georgi K; Wang, Yun E; Chan, David et al. (2014) Evidence for site-specific occupancy of the mitochondrial genome by nuclear transcription factors. PLoS One 9:e84713
Marinov, Georgi K; Kundaje, Anshul; Park, Peter J et al. (2014) Large-scale quality analysis of published ChIP-seq data. G3 (Bethesda) 4:209-23
Blobel, Gerd A; Hardison, Ross C (2013) A cluster to remember. Cell 154:718-20
Wang, Yun E; Marinov, Georgi K; Wold, Barbara J et al. (2013) Genome-wide analysis reveals coating of the mitochondrial genome by TFAM. PLoS One 8:e74513
Gertz, Jason; Savic, Daniel; Varley, Katherine E et al. (2013) Distinct properties of cell-type-specific and shared transcription factor binding sites. Mol Cell 52:25-36

Showing the most recent 10 out of 11 publications