The Gene Ontology Consortium produces a controlled vocabulary for annotating gene function, which many organism-specific gene annotation databases have adopted. A shared vocabulary makes it possible to predict gene function from partial annotation: if two attributes are strongly correlated in a database, then the presence of one is evidence for the presence of the other. Recent ideas from machine learning, such as dependency networks, may allow more complicated interdependencies between genes and their attributes to be modeled efficiently, which should enable better predictions. Cross-validation will be used to assess the performance of these models against linear models and baseline models in which attributes are assumed to be independent. This approach will also be integrated with a probabilistic model of annotation transfer based on sequence similarity.
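As a rough illustration of the simplest case described in the abstract, the sketch below predicts one attribute from a single correlated attribute and uses leave-one-out cross-validation to compare that predictor with an independence baseline. It is not the dependency-network model the project proposes. The gene names, attribute names, toy annotations, function names, and the add-one smoothing are all assumptions made for the example.

```python
from math import log

# Hypothetical toy annotations: each gene maps to the set of attributes it carries.
ANNOTATIONS = {
    "geneA": {"kinase", "phosphorylation"},
    "geneB": {"kinase", "phosphorylation"},
    "geneC": {"kinase", "phosphorylation"},
    "geneD": {"transport"},
    "geneE": {"transport", "membrane"},
    "geneF": {"kinase"},
}


def smoothed_rate(hits, total, alpha=1.0):
    """Add-alpha smoothed frequency, so estimates are never exactly 0 or 1."""
    return (hits + alpha) / (total + 2 * alpha)


def baseline_prob(train, target):
    """Independence baseline: P(target), ignoring every other attribute."""
    hits = sum(target in attrs for attrs in train.values())
    return smoothed_rate(hits, len(train))


def conditional_prob(train, target, evidence, present):
    """Pairwise model: P(target | evidence attribute is present or absent)."""
    matching = [a for a in train.values() if (evidence in a) == present]
    hits = sum(target in attrs for attrs in matching)
    return smoothed_rate(hits, len(matching))


def leave_one_out_log_loss(annotations, target, evidence=None):
    """Mean negative log-likelihood of the held-out gene's true label."""
    total = 0.0
    for held_out, attrs in annotations.items():
        train = {g: a for g, a in annotations.items() if g != held_out}
        if evidence is None:
            p = baseline_prob(train, target)
        else:
            p = conditional_prob(train, target, evidence, evidence in attrs)
        total -= log(p if target in attrs else 1.0 - p)
    return total / len(annotations)


base_loss = leave_one_out_log_loss(ANNOTATIONS, "phosphorylation")
pair_loss = leave_one_out_log_loss(ANNOTATIONS, "phosphorylation", "kinase")
print(f"independence baseline: {base_loss:.3f}")
print(f"conditioned on kinase: {pair_loss:.3f}")
```

On this toy data the conditioned predictor has the lower held-out loss, which is the kind of comparison the abstract proposes to make at database scale with richer models.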

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Postdoctoral Individual National Research Service Award (F32)
Project #: 5F32HG002552-02
Application #: 6626288
Study Section: Genome Study Section (GNM)
Program Officer: Graham, Bettie
Project Start: 2002-05-02
Project End:
Budget Start: 2003-05-02
Budget End: 2004-05-01
Support Year: 2
Fiscal Year: 2003
Total Cost: $46,420
Indirect Cost:

Name: Harvard University
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 047006379
City: Boston
State: MA
Country: United States
Zip Code: 02115
Sheng, Ning (2009) Aqua[2-(2-pyridylmethyliminomethyl)phenolato]nickel(II) nitrate monohydrate. Acta Crystallogr Sect E Struct Rep Online 65:m1348
Tian, Weidong; Zhang, Lan V; Tasan, Murat et al. (2008) Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function. Genome Biol 9 Suppl 1:S7