Decades of experiments have produced vast amounts of data and identified a multitude of molecular processes that underlie specific biological functions directly relevant to human health. However, the potential of these data to inform our understanding of human health and disease has not yet been fully realized, because publications report results in natural language that is not easily identifiable or computable. To capture and interrogate this wealth of data from the literature, we developed BioGRID, an open repository for molecular interactions. BioGRID is a widely used resource, averaging more than 6,500 unique visitors per month, who explore the >360,000 interactions in the database with custom search and visualization tools. In addition, BioGRID data sets are the source of interaction information for a host of partner databases. An analogous challenge exists in the description of models of human disease. While much information is available from years of research in powerful models of human disease, including yeast, nematode, fly, zebrafish and mouse models, the relationship of these models to each other and to human disease has not been systematically organized. In this and other proposals connected through the Linking Animal Models to Human Disease Initiative (LAMHDI), we will undertake a systematic, coordinated effort to expand the BioGRID database through curation of pivotal new data compendia, application of sophisticated new methods for data integration, organization of data into predicted networks, and, critically, linkage of networks between model systems and human disease processes. Our curation effort will comprehensively annotate RNAi phenotype data and chemical genetic data, which are crucial for accurate models of human disease and for therapeutic intervention in disease, respectively. We will apply data analysis techniques to integrate these and other data across species, linking human diseases with all relevant models in order to predict new features of human disease.
We will also develop software tools to give the research community ready access to all of these results. Thus, we will enable the biomedical community to access fully comprehensive, integrated datasets across multiple models for hypothesis generation and analysis of human diseases.
(provided by applicant): We will collect a unique and extensive set of protein and gene interactions from models that are relevant to human disease, as well as their interactions with chemicals (drugs) and their effect on specific functions. These data will allow the prediction of new disease network functions using specialized algorithms, which will lead to a better understanding of human disease and facilitate the discovery of new drugs.