For most model systems in biology, a large body of literature describes what is known about the system. A fundamental problem faced by Model Organism Databases (MODs) is how to process this literature to extract the necessary information with minimal time and cost. Software can facilitate several aspects of literature curation by assisting curators in retrieving, prioritizing, curating, and tracking articles. The Arabidopsis Information Resource (TAIR) and the Rat Genome Database (RGD) have each developed software components to support their in-house curation efforts, and we propose to extend and integrate these components to build a powerful, portable, and generic literature curation system. The components of this system are: (1) PubFetch, a robust, extensible module for searching and retrieving articles from literature databases; (2) PubSearch, a robust, adaptable curation module to store, search, annotate, and prioritize articles for MOD curation; and (3) PubTrack, a literature tracking module to track and report the status of article curation. Our long-term goal is to develop a set of systematic procedures and tools for integrating knowledge from the confined context of a research article into the dynamic, broad context of a model organism database. By building on the existing work at both sites and collaborating on an integrated literature curation solution, we will maximize the utility and flexibility of the resulting system. The system will be central to the curation efforts of the collaborating databases and will be usable, as a whole or in parts, by other existing or emerging MODs.
Yoo, Danny; Xu, Iris; Berardini, Tanya Z et al. (2006) PubSearch and PubFetch: a simple management system for semiautomated retrieval and annotation of biological information from the literature. Curr Protoc Bioinformatics Chapter 9:Unit 9.7