The proposed Center for Big Data in Translational Genomics will research, create and, importantly, help coordinate developments that make the analysis of genomics assays routine and inexpensive, both individually and within large cohorts. Given its foundational nature, the Center will provide tools to other NIH Big Data to Knowledge (BD2K) centers. Conversely, the Center will integrate tools and information from other centers, particularly within the context of the pilot projects that, though centered on omics, span biomedicine. The Center has a successful platform for supporting global collaborative research. Through this platform it will provide a solution for distributing data, metadata, and analysis to the larger consortium of centers so that they may interact effectively. This will simplify the coordination of cross-center analysis and drive the development of common metadata standards in different areas of biomedicine. Metadata is a critical but often-neglected data type, and is essential glue for effective collaboration. In addition, because continuous benchmarking to ascertain and track best-of-breed software tools is a primary aim of the Center, and because Center team members have extensive experience in running open challenges, the Center will provide the consortium with a common platform for running periodic challenges that carry specific rewards and recognition. This will help popularize important challenges across the consortium, attract new development teams, and intensify efforts in areas where tool development is most critically needed. Competitions also serve as outstanding training venues.

Public Health Relevance

Big data will impact many areas of biomedicine; as such, it is vital that BD2K centers coordinate activities to maximise benefit and identify areas where fundamental platforms can be shared. This Center proposes using common platforms for scientific collaborations and methods for managing metadata, and using competitions to spur community involvement in critical tasks.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Specialized Center--Cooperative Agreements (U54)
Study Section: Special Emphasis Panel (ZRG1-BST-R (52))
Program Officer: Brooks, Lisa
University of California Santa Cruz
Santa Cruz
United States
Kozanitis, Christos; Patterson, David A (2016) GenAp: a distributed SQL interface for genomic data. BMC Bioinformatics 17:63
Gordon, David; Huddleston, John; Chaisson, Mark J P et al. (2016) Long-read sequence assembly of the gorilla genome. Science 352:aae0344
Haeussler, Maximilian; Schönig, Kai; Eckert, Hélène et al. (2016) Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol 17:148
Speir, Matthew L; Zweig, Ann S; Rosenbloom, Kate R et al. (2016) The UCSC Genome Browser database: 2016 update. Nucleic Acids Res 44:D717-25
Jain, Miten; Olsen, Hugh E; Paten, Benedict et al. (2016) The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol 17:239
Yang, Shan; Cline, Melissa; Zhang, Can et al. (2016) Data sharing and reproducible clinical genetic testing: successes and challenges. Pac Symp Biocomput 22:166-176
Ip, Camilla L C; Loose, Matthew; Tyson, John R et al. (2015) MinION Analysis and Reference Consortium: Phase 1 data release and analysis. F1000Res 4:1075
Novak, Adam M; Rosen, Yohei; Haussler, David et al. (2015) Canonical, stable, general mapping using context schemes. Bioinformatics 31:3569-76
Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi et al. (2015) The Matchmaker Exchange: a platform for rare disease gene discovery. Hum Mutat 36:915-21
Paten, Benedict; Diekhans, Mark; Druker, Brian J et al. (2015) The NIH BD2K center for big data in translational genomics. J Am Med Inform Assoc 22:1143-7