The electronic medical record (EMR) holds great allure for both the medical informatics and health services research communities. In this project, we propose to enhance the capability of EMR systems by creating and evaluating tools to extract clinical vocabularies as well as patient data from narrative text reports. We will apply advanced natural language processing tools from the CLARIT system to both problems. We contend that fast and robust automated text processing methods are the only way that the problems of vocabulary construction and narrative text extraction can be solved.

We will address the clinical vocabulary problem by utilizing the thesaurus extraction techniques already present in the CLARIT system. Using several gigabytes of narrative text, including discharge summaries, progress notes, radiology reports, and other clinical text, we plan to:
1. Identify empirically the terminology used in medicine.
2. Compare the coverage of that terminology in several existing large medical vocabularies: UMLS, SNOMED, and the Medical Entities Dictionary.
3. Discern the semantic characteristics of that terminology, both to give other structured vocabularies a richer substrate of terms and to give us the opportunity to implement a clinical vocabulary schema based on the methods of the MedSORT-II Project.
4. Evaluate how well our tools assist the vocabulary building efforts of ourselves and others.

The narrative extraction problem will be approached differently than in the past. We will build on the efforts of previous investigators but change the perspective by focusing on the development of tools specific to researchers and others who need to extract data from narrative text. This approach will be applied in two domains:
1. Consortium-based research in the use of esophagogastroduodenoscopy (EGD).
2. Practice guidelines implementation in blood product transfusion.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project--Cooperative Agreements (U01)
Project #
1U01LM005879-01
Application #
2238284
Study Section
Special Emphasis Panel (SRC (99))
Project Start
1995-02-01
Project End
1998-01-31
Budget Start
1995-02-01
Budget End
1996-01-31
Support Year
1
Fiscal Year
1995
Total Cost
Indirect Cost
Name
Oregon Health and Science University
Department
Type
Other Domestic Higher Education
DUNS #
009584210
City
Portland
State
OR
Country
United States
Zip Code
97239
Hersh, W R; Donohoe, L C (1998) SAPHIRE International: a tool for cross-language information retrieval. Proc AMIA Symp :673-7
Hersh, W R; Leen, T K; Rehfuss, P S et al. (1998) Automatic prediction of trauma registry procedure codes from emergency room dictations. Medinfo 9 Pt 1:665-9
Hersh, W R; Campbell, E M; Malveau, S E (1997) Assessing the feasibility of large-scale natural language processing in a corpus of ordinary medical records: a lexical analysis. Proc AMIA Annu Fall Symp :580-4
Kreis, C; Gorman, P (1997) Word frequency analysis of dictated clinical data: a user-centered approach to the design of a structured data entry interface. Proc AMIA Annu Fall Symp :724-8
Spackman, K A; Hersh, W R (1996) Recognizing noun phrases in medical discharge summaries: an evaluation of two natural language parsers. Proc AMIA Annu Fall Symp :155-8
Evans, D A; Brownlow, N D; Hersh, W R et al. (1996) Automating concept identification in the electronic medical record: an experiment in extracting dosage information. Proc AMIA Annu Fall Symp :388-92
Ertle, A R; Campbell, E M; Hersh, W R (1996) Automated application of clinical practice guidelines for asthma management. Proc AMIA Annu Fall Symp :552-6
Hersh, W R; Campbell, E H; Evans, D A et al. (1996) Empirical, automated vocabulary discovery using large text corpora and advanced natural language processing tools. Proc AMIA Annu Fall Symp :159-63