The biomedical research enterprise is highly productive, generating new knowledge at an unprecedented pace. However, as a community, we do a relatively poor job of organizing and managing that knowledge so that it is maximally useful for the design and interpretation of other experiments. Scientific research is most efficient when new hypotheses are informed by the totality of past findings and when scientific knowledge is Findable, Accessible, Interoperable, and Reusable (FAIR). Unfortunately, the vast majority of research is published only as free-text, unstructured journal articles, which makes the findings very difficult to integrate and compute upon. This proposal describes the use of crowdsourcing to address this challenge in biomedical knowledge management. It specifically proposes to leverage Wikidata, which has the goal of creating a comprehensive knowledge base that both humans and computers can read and edit. Wikidata is run by the same organization that runs Wikipedia, and like its sister project, it employs the principle of crowdsourcing to tackle a grand challenge in information management. Both Wikipedia and Wikidata invite and empower the community at large to collaboratively add, edit, and refine content. In this proposal, we continue our work to create the world's largest open and FAIR knowledge base of biomedical information within Wikidata. This proposal includes three Specific Aims. First, we will improve both the quantity and quality of biomedical information in Wikidata. Quantity will be increased by loading several key biomedical vocabularies and ontologies, and data quality will be made more rigorous by the introduction of formal and computable data models. Second, we will facilitate and incentivize contributions of data by third-party data contributors.
This Aim will be achieved by extending our Python programming library for reading from and writing to Wikidata, and by creating automated reports that notify resource providers when new relevant content is added or edited. Third, we will encourage contributions from domain experts using targeted incentives. Specifically, this Aim will develop interfaces to Wikidata that provide integrated data reports that are otherwise unavailable, and will extend the Gene Wiki Reviews series of invited reviews, which rewards contributions with traditional metrics of academic achievement. Finally, underlying these three Specific Aims will be a Driving Biological Project focusing on infectious disease research, which will ensure that the tools and resources developed have practical benefit to discovery-oriented research projects.
This proposal addresses the challenge of making all biomedical knowledge Findable, Accessible, Interoperable, and Reusable (FAIR). This work builds on Wikidata, a community-maintained knowledge base that can be read and edited by both humans and computers.