The major goal of this project is to develop and evaluate innovative approaches to information retrieval (IR) in the biomedical domain. Building on the initial work done by the Principal Investigator (PI) with the SAPHIRE Project and taking advantage of the efforts of the Unified Medical Language System (UMLS) Project, we aim to design and test new methods for automated indexing and retrieval. The underlying thesis of the SAPHIRE approach to IR is that information representation should move from the level of terms to that of concepts. Terms, such as MeSH entries, are surface string representations of underlying concepts. A problem with using them to represent concepts is that they cannot account for the different ways a concept may be expressed in medical texts or information system queries. SAPHIRE is a first step in the direction of concept-based IR, and we plan to investigate several enhancements to this approach. The major goal will be achieved through six separate but interrelated tasks:

1. Develop methodology for evaluation of IR systems in laboratory and clinical settings.
2. Assess the utility of computational linguistic approaches to concept discovery in text using constrained natural language processing and knowledge base construction.
3. Refine strategies for automated indexing of a wide variety of textual material, including abstracts, full text of articles, textbooks, and hypertext.
4. Explore different user interfaces, aiming to allow optimal retrieval for both novice and expert users.
5. Assess the use of semantic relationships between concepts in indexing and retrieval.
6. Integrate the SAPHIRE approach with other programs, such as the CODEX system and Explorer-2, and scale up to large text collections.

In the course of the project we will create an IR system that will help meet the information needs of busy health care providers. Such a system should offer a wide variety of content, quality indexing that represents that content accurately, and retrieval capability that is fast and easy to use. In this grant, we propose to iteratively build an IR system that uses concept-based probabilistic indexing and retrieval, and to evaluate it at each step in laboratory as well as real-world settings.
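The distinction between terms and concepts drawn in the abstract can be illustrated with a minimal sketch. This is not the SAPHIRE algorithm: the lexicon, the concept IDs (`C1`, `C2`), the sample documents, and the function names are all hypothetical, and the matching is simple exact-phrase lookup rather than the probabilistic indexing the project proposes. It only shows how mapping different surface strings to one concept lets a query retrieve documents that use a different wording.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon: several surface strings map to one concept ID.
# The IDs and synonym lists are invented for illustration only.
LEXICON = {
    "myocardial infarction": "C1",
    "heart attack": "C1",
    "hypertension": "C2",
    "high blood pressure": "C2",
}

def find_concepts(text):
    """Return the set of concept IDs whose surface strings occur in text."""
    # Lowercase and keep only alphabetic words so punctuation does not block a match.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "
    return {cid for phrase, cid in LEXICON.items() if f" {phrase} " in padded}

def build_index(docs):
    """Map each concept ID to the set of document IDs that mention it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for cid in find_concepts(text):
            index[cid].add(doc_id)
    return index

def search(index, query):
    """Return IDs of documents sharing at least one concept with the query."""
    hits = set()
    for cid in find_concepts(query):
        hits |= index.get(cid, set())
    return sorted(hits)

# Hypothetical sample documents.
DOCS = {
    "d1": "Aspirin after myocardial infarction.",
    "d2": "Managing high blood pressure in adults.",
    "d3": "Patient education following a heart attack.",
}

INDEX = build_index(DOCS)
print(search(INDEX, "heart attack"))  # prints ['d1', 'd3']
```

A purely term-based match on the string "heart attack" would find only `d3`; the concept lookup also returns `d1`, which uses "myocardial infarction" for the same concept.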

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: First Independent Research Support & Transition (FIRST) Awards (R29)
Project #: 5R29LM005307-02
Application #: 3474526
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Project Start: 1991-06-01
Project End: 1996-05-31
Budget Start: 1992-06-01
Budget End: 1993-05-31
Support Year: 2
Fiscal Year: 1992
Total Cost:
Indirect Cost:
Name: Oregon Health and Science University
Department:
Type: Other Domestic Higher Education
DUNS #: 009584210
City: Portland
State: OR
Country: United States
Zip Code: 97239

Hersh, W R; Brown, K E; Donohoe, L C et al. (1996) CliniWeb: managing clinical information on the World Wide Web. J Am Med Inform Assoc 3:273-80
Hersh, W; Leone, T J (1995) The SAPHIRE server: a new algorithm and implementation. Proc Annu Symp Comput Appl Med Care :858-62
Hersh, W; Hickam, D (1994) Use of a multi-application computer workstation in a clinical setting. Bull Med Libr Assoc 82:382-9
Hersh, W R; Hickam, D H; Haynes, R B et al. (1994) A performance and failure analysis of SAPHIRE with a MEDLINE test collection. J Am Med Inform Assoc 1:51-60
Hersh, W R; Elliot, D L; Hickam, D H et al. (1994) Towards new measures of information retrieval evaluation. Proc Annu Symp Comput Appl Med Care :895-9
Hersh, W R; Hickam, D H (1993) A comparison of two methods for indexing and retrieval from a full-text medical database. Med Decis Making 13:220-6
Hersh, W R; Hickam, D H; Leone, T J (1992) Words, concepts, or both: optimal indexing units for automated information retrieval. Proc Annu Symp Comput Appl Med Care :644-8