The goal of the ENCODE Project is to provide the biomedical community with a complete and biologically interpretable annotation of the human genome. This means discovering and mapping all parts of all genes, including exons, introns, promoters and cis-regulatory sequences. In previous phases of the ENCODE Project, the applicants of this proposal developed and applied robust, high-throughput, genome-wide methods for determining transcription factor occupancy, assessing DNA methylation, identifying RNA transcripts, and experimentally testing candidate regulatory elements and mutations. The combination of experience from the previous phases with the resulting technology and analysis platforms and the applicants' existing, highly productive infrastructure forms the basis of this response to NHGRI's RFA-HG-11-024 ("Expanding the Encyclopedia of DNA Elements (ENCODE) in the Human and Model Organisms"). This application presents an ambitious proposal to expand the biological dimensions of ENCODE by measuring occupancy for essentially all transcription factors and by producing transcriptomes from hundreds of very specific cell types, and even single cells. The specific plan is to: 1) determine genome-wide occupancy for all transcription factors and major cofactors with high resolution in two or more cell types; 2) map and quantify all messenger RNA transcripts, microRNAs and other non-ribosomal RNAs in more than 300 well-defined, uncultured cell types; 3) map DNA methylation state genome-wide at nucleotide resolution in more than 300 cell types; and 4) apply a high-throughput transient transfection assay system to test the impact of ~2,000 candidate regulatory elements on gene regulation. All experimental work in this project will be evaluated by appropriate quality metrics, and after quality control, all data will be rapidly deposited in public, freely accessible genome databases.
In addition, computational analyses, including evaluation of comparative and population genomics data, will be integrated with the experimental production to help ensure quality and to capture information in forms useful to biologists, genomicists, and medical researchers. Completion of these Specific Aims will enable biomedical researchers to better and more rapidly understand the consequences of mutations in genomic disorders, including cancer, cardiovascular disease, and almost all common diseases and, therefore, to more fully realize the potential of genomics to impact human health.

Public Health Relevance

Interpreting the human genome sequence remains a daunting challenge. We cannot yet recognize genes and their regulatory elements based solely on primary DNA sequence. This ambitious ENCODE Project proposal leverages new advances in genomic technologies to radically improve the depth, breadth, and analysis of functional element annotations of the human genome, accelerating the impact on human health by tying the effects of mutations in functional elements to the misregulation of gene expression in disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZHG1-HGR-M (M1))
Program Officer
Feingold, Elise A
Hudson-Alpha Institute for Biotechnology
United States
Gasper, William C; Marinov, Georgi K; Pauli-Behn, Florencia et al. (2014) Fully automated high-throughput chromatin immunoprecipitation for ChIP-seq: identifying ChIP-quality p300 monoclonal antibodies. Sci Rep 4:5152
Conesa, Ana; Mortazavi, Ali (2014) The common ground of genomics and systems biology. BMC Syst Biol 8 Suppl 2:S1
Kellis, Manolis; Wold, Barbara; Snyder, Michael P et al. (2014) Defining functional DNA elements in the human genome. Proc Natl Acad Sci U S A 111:6131-8
Blanc, Valerie; Park, Eddie; Schaefer, Sabine et al. (2014) Genome-wide identification and functional analysis of Apobec-1-mediated C-to-U RNA editing in mouse small intestine and liver. Genome Biol 15:R79
Marinov, Georgi K; Williams, Brian A; McCue, Ken et al. (2014) From single-cell to cell-pool transcriptomes: stochasticity in gene expression and RNA splicing. Genome Res 24:496-510
Marinov, Georgi K; Wang, Yun E; Chan, David et al. (2014) Evidence for site-specific occupancy of the mitochondrial genome by nuclear transcription factors. PLoS One 9:e84713
Marinov, Georgi K; Kundaje, Anshul; Park, Peter J et al. (2014) Large-scale quality analysis of published ChIP-seq data. G3 (Bethesda) 4:209-23
Blobel, Gerd A; Hardison, Ross C (2013) A cluster to remember. Cell 154:718-20
Wang, Yun E; Marinov, Georgi K; Wold, Barbara J et al. (2013) Genome-wide analysis reveals coating of the mitochondrial genome by TFAM. PLoS One 8:e74513
Gertz, Jason; Savic, Daniel; Varley, Katherine E et al. (2013) Distinct properties of cell-type-specific and shared transcription factor binding sites. Mol Cell 52:25-36

Showing the most recent 10 out of 11 publications