Critical Assessment of Information Extraction in Biology

Hirschman, Lynette

Abstract

Biological databases provide a rich source of training data for statistical and machine learning approaches to text mining; they also provide expert-curated, "gold standard" data for evaluating system performance. The strategy is to focus on problems of importance to working biologists, such as overcoming the curation bottleneck for biological literature, providing better mappings between biological ontologies and text, and giving biologists better access to textual information both in the literature and in curated databases. This proposal focuses on developing mechanisms to promote progress in applying text mining to problems of biological significance. The short-term focus is to continue organizing BioCreAtIvE: Critical Assessment for Information Extraction in Biology. The long-term focus is to improve text mining tools that support expert curators in the cost-effective acquisition of information for biological databases, and to improve access to biological information through shared semantics (ontologies), with particular emphasis on interactive tools and the extraction of complex relations, such as host-pathogen or ecosystem interactions. The specific tasks proposed here are 1) running the Gene Normalization task for BioCreAtIvE II (to take place in 2006-2007) and analyzing and disseminating the data and results of BioCreAtIvE II; 2) providing input into the creation of a Roadmap for BioCreAtIvE; and 3) defining new evaluation tasks to meet the needs of a wider range of biological curators, including an evaluation of interactive curation tools, done in conjunction with the RegCreative Jamboree, and methods for the representation and capture of complex biological relations, such as host-pathogen and ecosystem interactions, in conjunction with standards consortia.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0640153
Program Officer: Sylvia J. Spengler

Budget Start: 2006-10-01
Budget End: 2008-09-30
Fiscal Year: 2006
Total Cost: $296,174

Institution: Mitre Corporation, McLean, VA, United States