The goals of the ENCODE Data Coordinating Center (DCC) component of the ENCODE Database Coordination and Analysis Center are to support the ENCODE Consortium by defining and establishing pipelines that connect all participants to the data and by creating avenues of access that distribute these data to the greater biological research community. The ENCODE Consortium brings together laboratories that generate complex data types via experimental assays with laboratories that integrate these unique data using computational analyses to discover how chromosomal elements function together to define the human cell. The DCC's participation enhances the data created by these laboratories by creating structured pipelines for the verification and validation of all submitted data and by providing processes for documenting the metadata that describe each biological sample and assay method. To facilitate access to all the data created by the previous ENCODE projects, as well as data from the modENCODE project and any other large data collections determined to be appropriate for incorporation, the DCC will construct a state-of-the-art data storage repository called the Big Data Hub. The DCC will design and develop new software to enhance the data submission and processing pipeline, the organization of and access to metadata, and the Big Data Hub. In addition, we will create the ENCODE Portal, which will be the primary entry point to the wealth of experimentally determined information as well as the results of computational analyses. The Portal will integrate these data resources and make them available via enhanced search and browsing capabilities. Tools will be implemented to aid discovery by both experienced bioinformaticians and naive laboratory staff. The DCC will evolve into a substantial service organization allowing biomedical research to take full advantage of the ENCODE results. To this end, the DCC will provide documentation via many media, including written documentation, video tutorials, webinars, and meeting presentations. The DCC, DAC, and AWG will be tightly woven together to create the EDCAC.

Public Health Relevance

The relevance of this work for public health is that the comprehensive determination of functional elements encoded by the human genome is essential for understanding the nature of human health and the treatment of disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 3U41HG006992-02S1
Application #: 8720219
Study Section: Special Emphasis Panel (ZHG1-HGR-M (M3))
Program Officer: Feingold, Elise A
Project Start: 2012-09-21
Project End: 2016-07-31
Budget Start: 2013-08-01
Budget End: 2014-07-31
Support Year: 2
Fiscal Year: 2013
Total Cost: $660,000
Indirect Cost: $123,254

Name: Stanford University
Department: Genetics
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Green, Richard E; Braun, Edward L; Armstrong, Joel et al. (2014) Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science 346:1254449
Karolchik, Donna; Barber, Galt P; Casper, Jonathan et al. (2014) The UCSC Genome Browser database: 2014 update. Nucleic Acids Res 42:D764-70