The proposed Center for Big Data in Translational Genomics will research, create, and, importantly, help coordinate developments that make the analysis of genomics assays routine and inexpensive, both individually and within large cohorts. Given its foundational nature, the Center will provide tools to other NIH Big Data to Knowledge (BD2K) centers. Conversely, the Center will integrate tools and information from other centers, particularly in the context of the pilot projects, which, though centered on omics, span biomedicine. The Center has a successful platform for supporting global collaborative research. Through this platform it will provide a solution for distributing data, metadata, and analyses to the larger consortium of centers so that they can interact effectively. This will simplify the coordination of cross-center analysis and drive the development of common metadata standards across different areas of biomedicine. Metadata is a critical but often neglected data type, and it is the essential glue for effective collaboration. In addition, because continuous benchmarking to identify and track best-of-breed software tools is a primary aim of the Center, and because Center team members have extensive experience in running open challenges, the Center will provide the consortium with a common platform for running periodic challenges that carry specific rewards and recognition. This will help popularize important challenges across the consortium, attract new development teams, and intensify effort in areas where tool development is most critically needed. Competitions also serve as outstanding training venues.

Public Health Relevance

Big data will impact many areas of biomedicine; it is therefore vital that BD2K centers coordinate their activities to maximize benefit and identify areas where fundamental platforms can be shared. This Center proposes using common platforms for scientific collaboration and for managing metadata, and using competitions to spur community involvement in critical tasks.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 1U54HG007990-01
Application #: 8932073
Study Section: Special Emphasis Panel (ZRG1-BST-R (52))
Program Officer: Brooks, Lisa
Project Start:
Project End:
Budget Start: 2014-07-01
Budget End: 2015-06-30
Support Year: 1
Fiscal Year: 2014
Total Cost: $189,107
Indirect Cost: $19,080

Name: University of California Santa Cruz
Department:
Type:
DUNS #: 125084723
City: Santa Cruz
State: CA
Country: United States
Zip Code: 95064

Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Jain, Miten; Olsen, Hugh E; Turner, Daniel J et al. (2018) Linear assembly of a human centromere on the Y chromosome. Nat Biotechnol 36:321-323
Garrison, Erik; Sirén, Jouni; Novak, Adam M et al. (2018) Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol 36:875-879
Ellrott, Kyle; Bailey, Matthew H; Saksena, Gordon et al. (2018) Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines. Cell Syst 6:271-281.e7
Fiddes, Ian T; Armstrong, Joel; Diekhans, Mark et al. (2018) Comparative Annotation Toolkit (CAT)-simultaneous clade and personal genome annotation. Genome Res 28:1029-1038
Paten, Benedict; Eizenga, Jordan M; Rosen, Yohei M et al. (2018) Superbubbles, Ultrabubbles, and Cacti. J Comput Biol 25:649-663
Tyson, John R; O'Neil, Nigel J; Jain, Miten et al. (2018) MinION-based long-read sequencing and assembly extends the Caenorhabditis elegans reference genome. Genome Res 28:266-274
Jain, Miten; Koren, Sergey; Miga, Karen H et al. (2018) Nanopore sequencing and assembly of a human genome with ultra-long reads. Nat Biotechnol 36:338-345
Computational Pan-Genomics Consortium (2018) Computational pan-genomics: status, promises and challenges. Brief Bioinform 19:118-135
Kolmogorov, Mikhail; Armstrong, Joel; Raney, Brian J et al. (2018) Chromosome assembly of large and complex genomes using multiple references. Genome Res 28:1720-1732

Showing the most recent 10 out of 76 publications