Biomedical Ontology and Tools for Database Curation

Crangle, Colleen

Abstract

This proposal describes a new tool for text data mining-a biomedical language ontology and integrated natural-language-processing methods. Our long-term goal is to provide resources for biomedical knowledge discovery from text. Our immediate goal is to provide a knowledge discovery tool for the curation of organism databases such as the Genome Database (SGD). The proposed research not only serves the research needs of the SGD community, it also helps the broader biomedical community exploit the strengths of the comparative approach to biological research. The hypothesis of this proposal is that knowledge discovery from biomedical text requires a knowledge base that integrates both genomic and linguistic information. This hypothesis is based on two observations: (a) the language of biomedicine, like all natural language, is complex in structure and morphology (the basic units of meaning) and poses problems of synonymy (several terms having the same meaning), polysemy (a term having more than one meaning), hypernymy (one term being more general than another), hyponymy (one term being more specific than another), denotation (what a term refers to in contrast to what it means), and denotation and description (different ways of referring to the same thing); and (b) important biomedical knowledge sources, such as the Gene Ontology (GO), are expressed in natural language.
The specific aims of the proposed project are to: 1. Extend an existing biomedical language ontology to include genomic and linguistic data from SGD; 2. Use this ontology to discover, in full-text articles made available by SGD, information about the molecular function of yeast gene products that can be inferred from direct experimental assays; 3. Evaluate the effectiveness of the new tool and methods by comparing its results to those of the SGD curators for gene products that have GO functional annotations with evidence code IDA (Inferred from Direct Assay).

Funding Agency

Agency: National Institute of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #: 1R43HG003600-01
Application #: 6885487
Study Section: Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer: Bonazzi, Vivien

Project Start: 2005-03-11
Project End: 2006-09-30
Budget Start: 2005-03-11
Budget End: 2006-09-30
Support Year: 1
Fiscal Year: 2005
Total Cost: $99,250
Indirect Cost

Biomedical Ontology and Tools for Database Curation
Crangle, Colleen Elizabeth
Converspeech, LLC, Palo Alto, CA, United States

Abstract

Funding Agency

Institution

Comments

Recent in Grantomics:

Recently viewed grants:

Recently added grants:

Abstract

Funding Agency

Institution

Comments