The biological sciences produce a complex research literature equivalent to thousands of megabytes every year. Research papers contain extensive, detailed information about procedures that is not captured in keywords or tables. This project develops new techniques for storing and querying this data and knowledge. It combines the ongoing work of the Biological Knowledge Laboratory, which is amassing and codifying biology papers, with new technology for managing knowledge and databases. The focus is on two kinds of information: the logical text structure (title, abstract, sections, paragraphs) and the content of Materials and Methods sections (chemicals used, genetic strains, equipment, procedures). The text structure can be treated as a schema in an object-oriented data model. The Materials and Methods database is a complex knowledge base involving time-ordered events, attributes that are computed rather than stored, and references to methods in other papers, among other features. The techniques and concepts of object-oriented databases are used for both types of information, and innovative indexing schemes are developed to aid search. This new object-oriented database technology advances the understanding of the structure of scientific knowledge as data and gives research scientists new tools for accessing the complex knowledge contained in the research literature.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9117030
Budget Start: 1992-08-01
Budget End: 1997-01-31
Fiscal Year: 1991
Total Cost: $680,578
Name: Northeastern University
City: Boston
State: MA
Country: United States
Zip Code: 02115