The objective of the Encyclopedia of DNA Elements (ENCODE) Project is to provide a complete inventory of all functional elements in the human genome using high-throughput experiments as well as computational methods. This proposal aims to create the ENCODE Data Analysis Center (EDAC, or the DAC), consisting of a multi-disciplinary group of leading scientists who will respond to directions from the Analysis Working Group (AWG) of ENCODE and thus integrate data generated by all groups in the ENCODE Consortium in an unbiased manner. These analyses will substantially augment the value of the ENCODE data by integrating diverse data types. The DAC members are leaders in their respective fields of bioinformatics, computational machine learning, algorithm development, and statistical theory and application to genomic data (Zhiping Weng, Manolis Kellis, Mark Gerstein, Mark Daly, Roderic Guigo, Shirley Liu, Rafael Irizarry, and William Noble). They have a strong track record of delivering collaborative analysis in the context of the ENCODE and modENCODE Projects, in which this group of researchers was responsible for much of the analysis and the majority of the figures and tables in the ENCODE and modENCODE papers. The proposed DAC will pursue goals summarized as the following seven aims:
Aim 1. To work with the AWG to define and prioritize integrative analyses of ENCODE data;
Aim 2. To provide shared computational guidelines and infrastructure for data processing, common analysis tasks, and data exchange;
Aim 3. To facilitate and carry out data integration for element-specific analyses;
Aim 4. To facilitate and carry out exploratory data analyses across elements;
Aim 5. To facilitate and carry out comparative analyses across human, mouse, fly, and worm;
Aim 6. To facilitate integration with the genome-wide association studies community and disease datasets; and
Aim 7. To facilitate writing Consortium papers and assist in evaluating ENCODE data.

Public Health Relevance

The Encyclopedia of DNA Elements (ENCODE) Project is a coordinated effort to apply high-throughput, cost-efficient approaches to generate a comprehensive catalog of functional elements in the human genome. This proposal establishes a data analysis center to support, facilitate, and enhance integrative analyses of the ENCODE Consortium, with the ultimate goal of helping the scientific and medical communities interpret the human genome and use it to understand human biology and improve human health.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 1U41HG007000-01
Application #: 8402447
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M3))
Program Officer: Pazin, Michael J
Project Start: 2012-09-21
Project End: 2016-07-31
Budget Start: 2012-09-21
Budget End: 2013-07-31
Support Year: 1
Fiscal Year: 2012
Total Cost: $2,460,045
Indirect Cost: $329,830

Name: University of Massachusetts Medical School Worcester
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 603847393
City: Worcester
State: MA
Country: United States
Zip Code: 01655

Muir, Paul; Li, Shantao; Lou, Shaoke et al. (2016) The real cost of sequencing: scaling computation to keep pace with data generation. Genome Biol 17:53
Wang, Su; Zang, Chongzhi; Xiao, Tengfei et al. (2016) Modeling cis-regulation with a compendium of genome-wide histone H3K27ac profiles. Genome Res 26:1417-1429
Jungreis, Irwin; Chan, Clara S; Waterhouse, Robert M et al. (2016) Evolutionary Dynamics of Abundant Stop Codon Readthrough. Mol Biol Evol 33:3108-3132
Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Santoyo-Lopez, Javier et al. (2016) Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). Nat Commun 7:12339
Ernst, Jason; Melnikov, Alexandre; Zhang, Xiaolan et al. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34:1180-1190
Zang, Chongzhi; Wang, Tao; Deng, Ke et al. (2016) High-dimensional genomic data bias correction and data integration using MANCIE. Nat Commun 7:11305
Han, Bo W; Wang, Wei; Li, Chengjian et al. (2015) Noncoding RNA. piRNA-guided transposon cleavage initiates Zucchini-dependent, phased piRNA production. Science 348:817-21
Ernst, Jason; Kellis, Manolis (2015) Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat Biotechnol 33:364-76
Ay, Ferhat; Noble, William S (2015) Analysis methods for studying the 3D architecture of the genome. Genome Biol 16:183
Gao, Ming; Thomson, Travis C; Creed, T Michael et al. (2015) Glycolytic enzymes localize to ribonucleoprotein granules in Drosophila germ cells, bind Tudor and protect from transposable elements. EMBO Rep 16:379-86

Showing the most recent 10 out of 38 publications