Complex protein and genetic interaction networks determine the properties of all biological systems and underlie human development, health and disease. Decades of biochemical, genetic and molecular biological experiments have identified myriad molecular processes that underpin specific biological functions, as documented in the primary biomedical literature. Recent technological innovations combined with complete genome sequence information have enabled various high-throughput (HTP) methods to generate protein and genetic interaction data on an unprecedented scale. Because human interaction networks are very frequently directly analogous to networks in tractable model organisms, it is essential that the hundreds of thousands of biological interactions that comprise these networks across all major model organisms are archived in a well-annotated manner that is amenable to rigorous analysis. To capture, integrate and interrogate this wealth of data from both the literature and HTP datasets, we developed the BioGRID database as an open repository for protein and genetic interactions (www.thebiogrid.org). BioGRID is widely used by the biological and biomedical research community, averaging more than 6,500 unique visitors per month, who explore the more than 330,000 interactions in BioGRID using the database search functions and visualization tools. In addition, the unique datasets in BioGRID are disseminated widely by a host of partner databases and applications. Here, we propose to markedly enhance the data content, the database architecture and the user interface of BioGRID. We will extend our comprehensive curation approach to the main metazoan model organisms and humans, with an overall emphasis on conserved interaction networks that are implicated in human disease.
As part of this curation effort, we will capture important attributes associated with protein and genetic interactions, including post-translational modifications, quantitative interaction data, allele information and interaction directionality. User access to these large datasets will be facilitated by data-rich interfaces, user-defined search and display parameters, customized user accounts and dynamically embedded network visualization datasets. All software will be open source and engineered toward compatibility and complementarity with other academic database and software development efforts. The BioGRID will continue to provide its interaction data in their entirety, together with its software tools, to the model organism databases and other interested parties without restriction. The BioGRID platform will thus enable the biomedical and life sciences communities to access fully comprehensive datasets across multiple model organisms for hypothesis generation and network analysis.

Public Health Relevance

Through this work, we will provide a comprehensive database of protein and genetic interaction networks for multiple model organisms and humans, along with the requisite resources and tools to explore this information. This database, called the BioGRID, will lead to a better understanding of human disease by allowing the comparison of gene and protein functions in human health and disease to those in model organisms. The large amounts of data in the BioGRID will be provided to many other databases and users without restriction and will thereby facilitate both basic understanding and early phases of drug discovery.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Research Project (R01)
Project #: 2R01RR024031-05
Application #: 8041946
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Watson, Harold L
Project Start: 2007-05-15
Project End: 2016-01-31
Budget Start: 2011-02-10
Budget End: 2012-01-31
Support Year: 5
Fiscal Year: 2011
Total Cost: $895,794
Indirect Cost:

Name: MT Sinai Hosp-Samuel Lunenfeld Research Institute
Department:
Type:
DUNS #: 208808949
City: Toronto
State: ON
Country: Canada
Zip Code: M5 3-L9
Li, Yongsheng; Sahni, Nidhi; Pancsa, Rita et al. (2017) Revealing the Determinants of Widespread Alternative Splicing Perturbation in Cancer. Cell Rep 21:798-812
Kanshin, Evgeny; Giguère, Sébastien; Jing, Cheng et al. (2017) Machine Learning of Global Phosphoproteomic Profiles Enables Discrimination of Direct versus Indirect Kinase Substrates. Mol Cell Proteomics 16:786-798
Wildenhain, Jan; Spitzer, Michaela; Dolma, Sonam et al. (2016) Systematic chemical-genetic and chemical-chemical interaction datasets for prediction of compound synergism. Sci Data 3:160095
Oughtred, Rose; Chatr-aryamontri, Andrew; Breitkreutz, Bobby-Joe et al. (2016) Use of the BioGRID Database for Analysis of Yeast Protein and Genetic Interactions. Cold Spring Harb Protoc 2016:pdb.prot088880
Oughtred, Rose; Chatr-aryamontri, Andrew; Breitkreutz, Bobby-Joe et al. (2016) BioGRID: A Resource for Studying Biological Interactions in Yeast. Cold Spring Harb Protoc 2016:pdb.top080754
Liu, Guomin; Knight, James D R; Zhang, Jian Ping et al. (2016) Data Independent Acquisition analysis in ProHits 4.0. J Proteomics 149:64-68
Wildenhain, Jan; Spitzer, Michaela; Dolma, Sonam et al. (2015) Prediction of Synergism from Chemical-Genetic Interactions by Machine Learning. Cell Syst 1:383-95
Torii, Manabu; Li, Gang; Li, Zhiwen et al. (2014) RLIMS-P: an online text-mining tool for literature-based extraction of protein phosphorylation information. Database (Oxford) 2014:
Chatr-Aryamontri, Andrew; Breitkreutz, Bobby-Joe; Heinicke, Sven et al. (2013) The BioGRID interaction database: 2013 update. Nucleic Acids Res 41:D816-23
Sadowski, Ivan; Breitkreutz, Bobby-Joe; Stark, Chris et al. (2013) The PhosphoGRID Saccharomyces cerevisiae protein phosphorylation site database: version 2.0 update. Database (Oxford) 2013:bat026

Showing the most recent 10 out of 24 publications