Complex protein and genetic interaction networks determine the properties of all biological systems and underlie human development, health and disease. Decades of biochemical, genetic and molecular biological experiments have identified myriad molecular processes that underpin specific biological functions, as documented in the primary biomedical literature. Recent technological innovations combined with complete genome sequence information have enabled a host of high-throughput (HTP) methods to generate protein and genetic interaction data on an unprecedented scale. Because human interaction networks are often directly analogous to networks in tractable model organisms, it is essential that the hundreds of thousands of biological interactions discovered across the major model organisms, as well as humans, are archived in a well-annotated manner that is amenable to rigorous analysis and computation. To capture, integrate, and interrogate this wealth of data from both the literature and HTP datasets, we developed the BioGRID database as an open repository for protein and genetic interactions (www.thebiogrid.org). BioGRID is widely used by the biological and biomedical research community, averaging over 16,555 unique visitors per month in 2015. Using the search and visualization tools in BioGRID, these users explore the 971,027 total interactions that our curators have traced directly to experimental data in 45,603 publications. In addition, the unique datasets in BioGRID are disseminated widely by a host of partner databases, meta-databases, and applications. Here, we propose to markedly enhance the data content, the database architecture, and the user interface of BioGRID. We will expand the amount and types of data available through BioGRID, with a focus on interactions of central biological processes that are frequently perturbed in human disease.
We will use new ontologies to systematically capture new data types, including CRISPR-based genetic interactions, structured phenotypes across all species, chemical and drug interactions, and post-translational modifications. Text-mining algorithms will be incorporated into the curation pipeline to enhance curation rates, and thereby substantially expand the coverage of the database. User access to the large datasets in BioGRID will be facilitated by data-rich interfaces, user-defined search and display parameters, and multiple methods of visualization. All software will continue to be open source and engineered toward compatibility and complementarity with other academic database and software development efforts. The BioGRID will provide interaction data and software tools to model organism databases and other interested parties without restriction. The BioGRID resource will enable the biomedical research community to access validated biological interaction datasets across model organisms and humans for hypothesis generation and network analysis, and thereby further the general mission of the NIH.

Public Health Relevance

The BioGRID database is a comprehensive resource that provides protein and genetic interaction data for the major model organism species and humans, along with user-oriented tools to explore this information. The BioGRID facilitates better understanding of human disease by enabling inference of gene and protein function through network context and the computational comparison of these gene and protein networks in human health and disease to analogous networks mapped in model organisms. The large amounts of data in the BioGRID are freely provided to many other databases and users, thus facilitating both fundamental and translational research.

Agency
National Institutes of Health (NIH)
Institute
Office of The Director, National Institutes of Health (OD)
Type
Research Project (R01)
Project #
2R01OD010929-10A1
Application #
9177106
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Watson, Harold L
Project Start
2007-05-15
Project End
2021-05-31
Budget Start
2016-07-01
Budget End
2017-05-31
Support Year
10
Fiscal Year
2016
Total Cost
Indirect Cost
Name
Sinai Health System
Department
Type
DUNS #
208808949
City
Toronto
State
ON
Country
Canada
Zip Code
M5 1X5
Bertomeu, Thierry; Coulombe-Huntington, Jasmin; Chatr-Aryamontri, Andrew et al. (2018) A High-Resolution Genome-Wide CRISPR/Cas9 Viability Screen Reveals Structural Features and Contextual Diversity of the Human Cell-Essential Proteome. Mol Cell Biol 38:
Chatr-Aryamontri, Andrew; Oughtred, Rose; Boucher, Lorrie et al. (2017) The BioGRID interaction database: 2017 update. Nucleic Acids Res 45:D369-D379
Courcelles, Mathieu; Coulombe-Huntington, Jasmin; Cossette, Émilie et al. (2017) CLMSVault: A Software Suite for Protein Cross-Linking Mass-Spectrometry Data Analysis and Visualization. J Proteome Res 16:2645-2652
Schapira, Matthieu; Tyers, Mike; Torrent, Maricel et al. (2017) WD40 repeat domain proteins: a novel target class? Nat Rev Drug Discov 16:773-786
Kanshin, Evgeny; Giguère, Sébastien; Jing, Cheng et al. (2017) Machine Learning of Global Phosphoproteomic Profiles Enables Discrimination of Direct versus Indirect Kinase Substrates. Mol Cell Proteomics 16:786-798
Islamaj Dogan, Rezarta; Kim, Sun; Chatr-Aryamontri, Andrew et al. (2017) The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions. Database (Oxford) 2017:
Kim, Sun; Islamaj Doğan, Rezarta; Chatr-Aryamontri, Andrew et al. (2016) BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID. Database (Oxford) 2016:
Oughtred, Rose; Chatr-Aryamontri, Andrew; Breitkreutz, Bobby-Joe et al. (2016) BioGRID: A Resource for Studying Biological Interactions in Yeast. Cold Spring Harb Protoc 2016:pdb.top080754
Liu, Guomin; Knight, James D R; Zhang, Jian Ping et al. (2016) Data Independent Acquisition analysis in ProHits 4.0. J Proteomics 149:64-68
Wildenhain, Jan; Spitzer, Michaela; Dolma, Sonam et al. (2016) Systematic chemical-genetic and chemical-chemical interaction datasets for prediction of compound synergism. Sci Data 3:160095

Showing the most recent 10 out of 29 publications