The overall goal of this project is to identify regular linguistic patterns within subject terminology in a medical bibliographic database. It addresses a fundamental problem in information retrieval: reconciling and controlling the diversity of linguistic expressions for the same and related concepts in documents and queries. It relates specifically to the Unified Medical Language System (UMLS) effort to identify, extract, and relate variant medical expressions in machine-readable biomedical information resources. The research applies methods from empirical linguistics to the descriptive analysis of patterns in a sample of descriptors and documents in MEDLINE. The guiding principles of the method are that it produce sets of textual elements which are linguistically related, and that the identified patterns be amenable to computational identification, extraction, and manipulation. Evaluation criteria include measures of both the number and the robustness of the identified patterns. The results can be applied to the design of automatic or semi-automatic vocabularies and mapping mechanisms that help users identify and select correct terminology for their search requests.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM005384-02
Application #
2237781
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Project Start
1992-02-01
Project End
1995-01-31
Budget Start
1993-02-01
Budget End
1995-01-31
Support Year
2
Fiscal Year
1993
Total Cost
Indirect Cost
Name
University of Michigan Ann Arbor
Department
Type
Other Domestic Higher Education
DUNS #
791277940
City
Ann Arbor
State
MI
Country
United States
Zip Code
48109