The goals of the ENCODE Data Coordinating Center (DCC) component of the ENCODE Database Coordination and Analysis Center are to support the ENCODE Consortium by defining and establishing pipelines that connect all participants to the data and by creating avenues of access that distribute these data to the greater biological research community. The ENCODE Consortium brings together laboratories that generate complex data types via experimental assays with laboratories that integrate these unique data using computational analyses to discover how chromosomal elements function together to define the human cell. The DCC enhances the data created by these laboratories by creating structured pipelines for the verification and validation of all submitted data and by providing processes for documenting the metadata that describe each biological sample and assay method. To facilitate access to all the data created by the previous ENCODE projects, as well as data from the modENCODE project and any other large data collections determined to be appropriate for incorporation, the DCC will construct a state-of-the-art data storage repository called the Big Data Hub. The DCC will design and develop new software to enhance the data submission and processing pipeline, the organization of and access to metadata, and the Big Data Hub. In addition, we will create the ENCODE Portal, which will be the primary entry point to the wealth of experimentally determined information as well as the results of computational analyses. The Portal will integrate these data resources and make them available via enhanced search and browsing capabilities. Tools will be implemented to aid discovery by both experienced bioinformaticians and naive laboratory staff. The DCC will evolve into a substantial service organization, allowing biomedical research to take full advantage of the ENCODE results.
To this end, the DCC will provide documentation via many media, including written documentation, video tutorials, webinars, and meeting presentations. The DCC, DAC, and AWG will be tightly woven together to create the EDCAC.

Public Health Relevance

The relevance of this work for public health is that the comprehensive determination of functional elements encoded by the human genome is essential for understanding the nature of human health and the treatment of disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 5U41HG006992-03
Application #: 8724542
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M3))
Program Officer: Feingold, Elise A
Project Start: 2012-09-21
Project End: 2016-07-31
Budget Start: 2014-08-01
Budget End: 2015-07-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $2,648,073
Indirect Cost: $887,794

Name: Stanford University
Department: Genetics
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Gabdank, Idan; Chan, Esther T; Davidson, Jean M et al. (2018) Prevention of data duplication for high throughput sequencing repositories. Database (Oxford) 2018:
Davis, Carrie A; Hitz, Benjamin C; Sloan, Cricket A et al. (2018) The Encyclopedia of DNA elements (ENCODE): data portal update. Nucleic Acids Res 46:D794-D801
Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R et al. (2017) SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata. PLoS One 12:e0175310
Speir, Matthew L; Zweig, Ann S; Rosenbloom, Kate R et al. (2016) The UCSC Genome Browser database: 2016 update. Nucleic Acids Res 44:D717-25
Sloan, Cricket A; Chan, Esther T; Davidson, Jean M et al. (2016) ENCODE data at the ENCODE portal. Nucleic Acids Res 44:D726-32
Hong, Eurie L; Sloan, Cricket A; Chan, Esther T et al. (2016) Principles of metadata organization at the ENCODE data coordination center. Database (Oxford) 2016:
Malladi, Venkat S; Erickson, Drew T; Podduturi, Nikhil R et al. (2015) Ontology application and use at the ENCODE DCC. Database (Oxford) 2015:
Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P et al. (2015) The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 43:D670-81
Karolchik, Donna; Barber, Galt P; Casper, Jonathan et al. (2014) The UCSC Genome Browser database: 2014 update. Nucleic Acids Res 42:D764-70
Green, Richard E; Braun, Edward L; Armstrong, Joel et al. (2014) Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science 346:1254449
