The Gene Ontology Consortium produces a controlled vocabulary for annotating gene function, which has been adopted by many organism-specific gene annotation databases. A shared vocabulary makes it possible to predict gene function from partial annotation: if two attributes are strongly correlated in a database, then the presence of one attribute is evidence for the presence of the other. Recent ideas from machine learning, such as dependency networks, may allow more complicated interdependencies between genes and their attributes to be modeled efficiently, which should in turn enable better predictions. Cross-validation will be used to assess the performance of these models against linear models and against baseline models in which attributes are assumed to be independent. This approach will also be integrated with a probabilistic model of annotation transfer based on sequence similarity.
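The correlation argument above can be made concrete with a small sketch. The gene names, attribute labels, and helper functions below are hypothetical and chosen only for illustration; the code compares an independence baseline (the overall frequency of an attribute) with a simple co-occurrence estimate of the same attribute given another. It is not a dependency network, only the pairwise idea that such models generalize.

```python
# Hypothetical toy annotation table: gene -> set of GO-style attribute labels.
# All names and annotations here are invented for illustration.
annotations = {
    "geneA": {"kinase activity", "phosphorylation"},
    "geneB": {"kinase activity", "phosphorylation"},
    "geneC": {"kinase activity", "phosphorylation", "nucleus"},
    "geneD": {"nucleus"},
    "geneE": {"nucleus", "DNA binding"},
    "geneF": {"DNA binding"},
}


def prior(attr, table):
    """Independence baseline: fraction of genes carrying attr."""
    return sum(attr in attrs for attrs in table.values()) / len(table)


def conditional(target, given, table):
    """Estimate P(target | given) from co-occurrence counts."""
    with_given = [attrs for attrs in table.values() if given in attrs]
    if not with_given:
        # No gene carries `given`, so fall back to the baseline.
        return prior(target, table)
    return sum(target in attrs for attrs in with_given) / len(with_given)


p_base = prior("phosphorylation", annotations)
p_cond = conditional("phosphorylation", "kinase activity", annotations)
print(f"baseline: {p_base:.2f}, given 'kinase activity': {p_cond:.2f}")
```

In this toy table, knowing that a gene is annotated with "kinase activity" raises the estimate for "phosphorylation" well above its baseline frequency, which is the kind of evidence the proposal aims to exploit. A realistic evaluation would hold out annotations and score predictions by cross-validation rather than estimating and testing on the same genes.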