(Taken from application abstract): Integrated clinical data bases are the cornerstones of computerized clinical information systems. As these systems evolve to support more and more of the health care delivered in this country, the completeness and accuracy of these data bases will increase in importance. So too will the need to capture the clinical data that they store in a form that computers can analyze and manipulate, a coded form. Much of the information currently captured fails this criterion; it is collected as a variety of free-text medical documents. This natural language data can be interpreted only by human readers. This proposal describes research intended to develop a natural language understanding system specifically aimed at extracting relevant clinical facts from medical free-text. It builds on previous work in two domains. In the domain of radiology an application has been developed that encodes relevant clinical data using a model of the semantics of sentences. This project seeks to extend that work by developing additional contextual models whose focus is the semantics of the entire x-ray report. In the domain of diagnostic coding, previous work has demonstrated a promising approach to encoding free-text admitting diagnoses using a semantic model derived from the work in radiology. This project seeks to extend that work by developing techniques to 1) manage the need for regular training to update knowledge structures in this system and 2) extend the semantic model to assist in recognizing misspellings and variations on accepted abbreviations. As a part of this project, two systems will be developed to explore these two complementary parts of the natural language understanding problem. These systems will undergo a sequence of testing procedures as a part of formative evaluations.
Ultimately, the goal of this project is to further techniques that allow the encoding of medical information captured as free-text into a form appropriate for research, quality assurance, the management of medical enterprises, and direct clinical decision support.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Bean, Carol A
IHC Health Services, Inc.
Salt Lake City
United States
Chapman, W W; Fiszman, M; Chapman, B E et al. (2001) A comparison of classification algorithms to automatically identify chest X-ray reports that support pneumonia. J Biomed Inform 34:4-14
Chapman, W W; Fiszman, M; Frederick, P R et al. (2001) Quantifying the characteristics of unambiguous chest radiography reports in the context of pneumonia. Acad Radiol 8:57-66
Fiszman, M; Chapman, W W; Aronsky, D et al. (2000) Automatic detection of acute bacterial pneumonia from chest X-ray reports. J Am Med Inform Assoc 7:593-604
Fiszman, M; Haug, P J (2000) Using medical language processing to support real-time evaluation of pneumonia guidelines. Proc AMIA Symp :235-9
Chapman, W W; Aronsky, D; Fiszman, M et al. (2000) Contribution of a speech recognition system to a computerized pneumonia guideline in the emergency department. Proc AMIA Symp :131-5
Fiszman, M; Chapman, W W; Evans, S R et al. (1999) Automatic identification of pneumonia related concepts on chest x-ray reports. Proc AMIA Symp :67-71
Chapman, W W; Haug, P J (1999) Comparing expert systems for identifying chest x-ray reports that support pneumonia. Proc AMIA Symp :216-20
Chapman, W W; Haug, P J (1998) Bayesian modeling for linking causally related observations in chest X-ray reports. Proc AMIA Symp :587-91
Fiszman, M; Haug, P J; Frederick, P R (1998) Automatic extraction of PIOPED interpretations from ventilation/perfusion lung scan reports. Proc AMIA Symp :860-4