Computational prediction of gene function (or phenotype) can reduce the scale of an experimental problem by focusing attention on a subset of possible experiments. Current function annotation databases, such as the Saccharomyces Genome Database (SGD) with its annotation of genes with Gene Ontology (GO) functions, are critically important resources but were not designed to host computational predictions. Although SGD and several other annotation databases label predictions as such, they provide no measures of confidence. The need exists for quantitative predictions, as distinct from qualitative "somebody said so" predictions. Probabilistic scoring systems, in which the score communicates the probability that a prediction is correct, are likely to be the most useful. We will generate probabilistic predictions by developing probabilistic models for predicting function and phenotype. For reasons of data availability, we use S. cerevisiae and C. elegans as model systems. We will also generate probabilistic models to predict protein and genetic interactions. We will exploit probabilistic networks of protein and genetic interaction in several ways. We will apply ideas from communication theory (2-terminal network reliability) to predict new members of protein complexes from probabilistic protein networks. We will develop computational methods to guide efficient discovery of genetic interactions in S. cerevisiae, as a model for guiding future high-throughput studies in metazoans. We will exploit probabilistic synthetic lethal interaction networks to identify drug mechanisms of action. We will disseminate predictions to the broader biomedical community. We propose a distributed quantitative prediction resource inspired by the DAS system of distributed genome annotation. We will adapt previously developed interfaces for browsing, searching, and retrieving probabilistic annotations to enhance their utility.
In Aim 1, we develop, apply, and validate methods for predicting function, phenotype, and physical and genetic interaction in S. cerevisiae and C. elegans.
In Aim 2, we exploit probabilistic networks of protein and genetic interaction in S. cerevisiae to elucidate network structure, to guide functional genomic experiments, and to reveal drug mechanisms of action.
In Aim 3, we disseminate probabilistic predictions within a simple, generic, distributed software framework for sharing and browsing quantitative predictions.
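
To make the 2-terminal network reliability idea concrete, the sketch below estimates the probability that two proteins are connected in a network whose edges each exist independently with a given probability. The abstract does not say how reliability would be computed, so the Monte Carlo sampling used here, the function name `two_terminal_reliability`, and the toy protein labels and edge probabilities are all illustrative assumptions rather than the project's method.

```python
import random
from collections import defaultdict


def two_terminal_reliability(edges, source, target, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(source and target are connected).

    edges: iterable of (u, v, p), where p is the assumed probability that
    the undirected interaction u-v is real; edges are treated as independent.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample one possible network: keep each edge with probability p.
        adj = defaultdict(list)
        for u, v, p in edges:
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
        # Depth-first search from source; count the sample if target is reached.
        stack = [source]
        seen = {source}
        while stack:
            node = stack.pop()
            if node == target:
                hits += 1
                break
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return hits / n_samples


# Hypothetical three-protein network (labels and probabilities are made up).
example_edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.5)]
estimate = two_terminal_reliability(example_edges, "A", "C")
```

For this toy network the exact value is 1 - (1 - 0.5)(1 - 0.9 × 0.8) = 0.86, so the estimate should fall close to that. Under this reading, a candidate protein whose estimated reliability to known members of a complex is high would be ranked as a likely new member; exact algorithms or other estimators could be substituted for the sampling step.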