Biomedical and healthcare data sharing efforts are currently impaired by a lack of (1) proper incentives and sharing tools for data producers, (2) practical frameworks for data standardization and indexing, and (3) effective data discovery mechanisms. BioCADDIE is a consortium of data producers, curators, publishers, and consumers who will work together to develop practical, sustainable solutions to the problem of biomedical and healthcare data discovery. Through task forces and corresponding pilot projects addressing the barriers enumerated above, we will promote open discussion of why millions of dollars are currently spent generating data that remain captive at their origin or are shared in a suboptimal way merely to comply with mandates from funding agencies and scientific journals. We will promote the development of incentives, policies, and tools for data sharing and data discovery. We will engage researchers, clinicians, patients, and the broader community in an open dialogue focused on the pros and cons of biomedical and clinical data sharing.

BioCADDIE's specific aims are to:
(1) Organize task forces with representatives from communities that have an interest in data production, dissemination, and utilization. We will organize an annual symposium, workshops, and Internet-based discussions among biomedical and clinical researchers, professional societies, journal publishers, funding agencies, clinicians, patients, and information scientists on the best sustainable practices for making data easily discoverable by different types of users.
(2) Promote the development of realistic, minimal, user-friendly metadata specifications and annotations for biomedical and healthcare data collections, and corresponding tools for automated indexing, so that users will be able to locate data that are relevant to their specific free-text searches.
(3) Incubate new technologies by funding highly innovative, high-risk pilot research projects that enable the development of novel data discovery and indexing engines, and have them tested by our diverse community of stakeholders. We describe only a small number of seed pilot projects in this proposal because BioCADDIE will solicit proposals for new pilot projects every year and select them through a review process involving the various stakeholder communities.

Public Health Relevance

Biomedical research and healthcare data are not fully utilized, in part because of a lack of incentives and tools for sharing these data in a way that makes it possible to reproduce results and make new discoveries. We will build a consortium of data producers, data disseminators, and data consumers (including patients) to develop tools and processes for easy discovery of and access to data.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Allergy and Infectious Diseases (NIAID)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 3U24AI117966-03S1
Application #: 9269344
Study Section: Special Emphasis Panel (ZRG1-IMST-L (52)R)
Program Officer: Lin, Dawei
Project Start: 2014-09-29
Project End: 2017-08-31
Budget Start: 2016-09-15
Budget End: 2017-08-31
Support Year: 3
Fiscal Year: 2016
Total Cost: $810,655
Indirect Cost: $184,341

Name: University of California San Diego
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 804355790
City: La Jolla
State: CA
Country: United States
Zip Code: 92093

Wimalaratne, Sarala M; Juty, Nick; Kunze, John et al. (2018) Uniform resolution of compact identifiers for biomedical data. Sci Data 5:180029
Chen, Xiaoling; Gururaj, Anupama E; Ozyurt, Burak et al. (2018) DataMed - an open source discovery index for finding biomedical datasets. J Am Med Inform Assoc :
Wright, Theodore B; Ball, David; Hersh, William (2017) Query expansion using MeSH terms for dataset retrieval: OHSU at the bioCADDIE 2016 dataset retrieval challenge. Database (Oxford) 2017:
Perez-Riverol, Yasset; Bai, Mingze; da Veiga Leprevost, Felipe et al. (2017) Discovering and linking public omics data sets using the Omics Discovery Index. Nat Biotechnol 35:406-409
Sansone, Susanna-Assunta; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe et al. (2017) DATS, the data tag suite to enable discoverability of datasets. Sci Data 4:170059
Dixit, Ram; Rogith, Deevakar; Narayana, Vidya et al. (2017) User needs analysis and usability assessment of DataMed - a biomedical data discovery index. J Am Med Inform Assoc :
Scerri, Antony; Kuriakose, John; Deshmane, Amit Ajit et al. (2017) Elsevier's approach to the bioCADDIE 2016 Dataset Retrieval Challenge. Database (Oxford) 2017:
Cohen, Trevor; Roberts, Kirk; Gururaj, Anupama E et al. (2017) A publicly available benchmark for biomedical dataset retrieval: the reference standard for the 2016 bioCADDIE dataset retrieval challenge. Database (Oxford) 2017:
Zong, Nansu; Lee, Sungin; Ahn, Jinhyun et al. (2017) Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships. Comput Biol Med 87:217-229
Ohno-Machado, Lucila; Sansone, Susanna-Assunta; Alter, George et al. (2017) Finding useful data across multiple biomedical data repositories using DataMed. Nat Genet 49:816-819