The overall goal of this project is to promote the accessibility and dissemination of biomedical information so that the research community can better leverage existing knowledge. Science is most efficient when hypotheses are based on the entirety of knowledge available to date. Unfortunately, up-to-date and comprehensive access to relevant knowledge is rarely achieved. This proposal puts a particular emphasis on illuminating biomedical "dark data." By analogy to the dark matter that is unaccounted for in the universe, dark data is defined by being unseen or underutilized by the scientific community. In this project, we will continue to strengthen our widely used applications, BioGPS and MyGene.info, and develop two new applications: BioThings and BioReel. Collectively, these applications aim to make dark data resources Findable, Accessible, Interoperable, and Reusable (FAIR).

BioGPS and BioReel are designed for non-computational scientists. BioGPS (http://biogps.org) is a gene portal for aggregating information on human genes and proteins. It illuminates dark data by creating a simple platform to discover and access gene-centric websites. BioGPS users can benefit each other by sharing the specific resources they have discovered and how they use or rate them. BioReel will be developed as a tool that periodically monitors relevant resources for researchers and notifies them when the knowledge about their genes of interest has been updated (e.g., a new dataset is available, or a gene is annotated in a new pathway).

MyGene.info and BioThings are designed for bioinformatics developers, who often face source data that is fragmented in both content and format. As a result, almost every bioinformatician has to repeat a significant amount of data wrangling. We developed MyGene.info to integrate gene and protein annotation data into a simple, high-performance web Application Programming Interface (API). It illuminates dark data on gene and protein annotations by pre-integrating over 200 annotation types in a standardized format. In this proposal, we will continue to expand MyGene.info to include additional highly requested annotations, both from a major data repository and from smaller domain-specific data sources. In addition, we will generalize the infrastructure and the software pattern underlying the MyGene.info project into a generic API framework called the "BioThings SDK". Two new APIs will be built using this framework, focusing on drugs/chemicals and diseases respectively, where data fragmentation across resources is equally a problem.
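As a rough illustration of what a single web API for pre-integrated gene annotations offers a developer, the sketch below builds and runs a gene query in Python. It is not taken from the proposal: the endpoint (the public MyGene.info v3 query URL), the default field list, and the helper names `build_query_url` and `fetch_hits` are assumptions chosen for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public v3 query endpoint of MyGene.info (not stated in the abstract).
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_query_url(term, fields=("symbol", "name", "entrezgene"), species="human"):
    """Return a query URL for a search term (hypothetical helper)."""
    params = {"q": term, "fields": ",".join(fields), "species": species}
    return MYGENE_QUERY_URL + "?" + urlencode(params)


def fetch_hits(term, **kwargs):
    """Run the query over HTTP and return the list of matching gene records.

    Requires network access; the "hits" key is assumed from the service's
    JSON response format.
    """
    with urlopen(build_query_url(term, **kwargs), timeout=30) as response:
        payload = json.load(response)
    return payload.get("hits", [])
```

Calling `fetch_hits("symbol:CDK2")` would return one record per matching gene, each limited to the requested fields, so the caller does not have to download and parse the underlying source files.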

Public Health Relevance

BioGPS is a gene annotation portal that is widely used in the biomedical research community. This web application provides researchers with integrated access to biomedical knowledge resources. This proposal will extend our support from genes to drugs and diseases, and will also build a new application called BioReel that helps researchers stay up to date on the latest knowledge relevant to their studies.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM083924-11
Application #: 9685915
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Ravichandran, Veerasamy
Project Start: 2008-08-01
Project End: 2022-03-31
Budget Start: 2019-04-01
Budget End: 2020-03-31
Support Year: 11
Fiscal Year: 2019
Total Cost:
Indirect Cost:
Name: Scripps Research Institute
Department:
Type:
DUNS #: 781613492
City: La Jolla
State: CA
Country: United States
Zip Code: 92037
Putman, Tim E; Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra et al. (2016) Centralizing content and distributing labor: a community model for curating the very long tail of microbial genomes. Database (Oxford) 2016:
Xin, Jiwen; Mark, Adam; Afrasiabi, Cyrus et al. (2016) High-performance web services for querying gene and variant annotation. Genome Biol 17:91
Wu, Chunlei; Jin, Xuefeng; Tsueng, Ginger et al. (2016) BioGPS: building your own mash-up of gene annotations and expression profiles. Nucleic Acids Res 44:D313-6
Nicolas, Emmanuelle; Golemis, Erica A; Arora, Sanjeevani (2016) POLD1: Central mediator of DNA replication and repair, and implication in cancer and other pathologies. Gene 590:128-41
Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Mitraka, Elvira et al. (2016) Wikidata as a semantic framework for the Gene Wiki initiative. Database (Oxford) 2016:
Khare, Ritu; Good, Benjamin M; Leaman, Robert et al. (2016) Crowdsourcing in biomedicine: challenges and opportunities. Brief Bioinform 17:23-32
Song, Wei; Wang, Hao; Wu, Qingyu (2015) Atrial natriuretic peptide in cardiovascular biology and disease (NPPA). Gene 569:1-6
Zuehlke, Abbey D; Beebe, Kristin; Neckers, Len et al. (2015) Regulation and function of the human HSP90AA1 gene. Gene 570:8-16
Deneka, Alexander; Korobeynikov, Vladislav; Golemis, Erica A (2015) Embryonal Fyn-associated substrate (EFS) and CASS4: The lesser-known CAS protein family members. Gene 570:25-35
Dörfel, Max J; Lyon, Gholson J (2015) The biological functions of Naa10 - From amino-terminal acetylation to human disease. Gene 567:103-31

Showing the most recent 10 out of 26 publications