The long-term aim of this project is to advance medical text processing methods in order to broaden the availability of patient data for automated clinical applications. At present, natural language is the most widespread, convenient, and comprehensive means by which medical personnel report clinical information. A procedure that processes narrative text to extract and codify the clinical information it contains will give automated decision support, quality assurance, and research applications reliable access to a much broader range of patient data than is presently possible. Enhancing the capabilities of automated research, quality assurance, and decision support will have a significant effect on the quality of patient care. It has already been demonstrated that clinical data can be successfully mapped into structured forms. Some text processing systems that have proved effective are applicable only to limited text; others have more sophisticated language capabilities but are not precise or reliable enough for clinical purposes.
A specific aim of this project is to integrate and enhance the positive aspects of several different techniques (pattern matching, semantic-based, semantic and syntactic-based, and statistical) to create a robust and reliable processor that can be incrementally developed and extended within one uniform framework. Initially, the information to be extracted will be limited to that which is useful for a specific application: the decision support component of the Clinical Information System (CIS) at Columbia Presbyterian Medical Center (CPMC). This work is being done in conjunction with CPMC so that the effectiveness of the technique can be realistically evaluated in a clinical setting. Clinical data from radiology reports will be automatically codified and inserted into the clinical patient database at CPMC. This is possible because the schema of the relational patient clinical database is designed to accommodate the type of complex clinical information that is found in natural language. In order to integrate the output of the text processor with the automated applications that use the data, a controlled vocabulary (with codes) for radiology will be developed and incorporated into the Medical Entities Dictionary (MED) at CPMC, which is an object-oriented knowledge base of clinical terms. The output of the text processor will be in a form that is compatible with the definitions of terms in the MED. An automated procedure will match the output forms with the MED forms of clinical terms in order to obtain precise codes for the data. The decision support application, as well as other computerized applications, will have reliable access to the clinical data in the patient database because they will reference only codes corresponding to terms in the MED.
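
The matching step described above, in which an output form from the text processor is compared with the defining forms of vocabulary terms to obtain a code, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual procedure: the slot names, the codes, the `TOY_MED` table, and the `match_code` function are all hypothetical, and the "most specific matching entry wins" rule is an assumed policy.

```python
from typing import Optional

# Hypothetical stand-in for controlled-vocabulary entries. Each term has a
# code and a defining form (slot -> value). All codes, names, and slots are
# invented for illustration; they are not taken from the MED.
TOY_MED = [
    {"code": "R0001", "name": "pleural effusion",
     "form": {"finding": "effusion", "bodyloc": "pleura"}},
    {"code": "R0002", "name": "right lower lobe infiltrate",
     "form": {"finding": "infiltrate", "bodyloc": "lung",
              "region": "right lower lobe"}},
    {"code": "R0003", "name": "pulmonary infiltrate",
     "form": {"finding": "infiltrate", "bodyloc": "lung"}},
]


def match_code(output_form: dict, med: list = TOY_MED) -> Optional[str]:
    """Return the code of the most specific entry whose defining slots
    all appear in output_form with the same values, or None if no entry
    matches. Extra slots in output_form (e.g. certainty) are ignored."""
    best = None
    for entry in med:
        form = entry["form"]
        if all(output_form.get(slot) == value for slot, value in form.items()):
            # Assumed policy: prefer the entry with the most defining slots.
            if best is None or len(form) > len(best["form"]):
                best = entry
    return best["code"] if best else None


# A structured output form such as a text processor might produce
# (hypothetical slots and values).
extracted = {
    "finding": "infiltrate",
    "bodyloc": "lung",
    "region": "right lower lobe",
    "certainty": "high",
}
print(match_code(extracted))
```

Under this assumed design, downstream applications would see only the returned code, never the free text, which mirrors the idea that applications reference only codes corresponding to vocabulary terms.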

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: First Independent Research Support & Transition (FIRST) Awards (R29)
Project #: 5R29LM005397-04
Application #: 2237788
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Project Start: 1992-01-01
Project End: 1996-12-31
Budget Start: 1995-01-01
Budget End: 1995-12-31
Support Year: 4
Fiscal Year: 1995

Name: Queens College
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
City: Flushing
State: NY
Country: United States
Zip Code: 11367

Elkins, J S; Friedman, C; Boden-Albala, B et al. (2000) Coding neuroradiology reports for the Northern Manhattan Stroke Study: a comparison of natural language processing and manual review. Comput Biomed Res 33:1-10
Friedman, C; Hripcsak, G (1999) Natural language processing and its future in medicine. Acad Med 74:890-5
Hripcsak, G; Kuperman, G J; Friedman, C et al. (1999) A reliability study for evaluating information extraction from radiology reports. J Am Med Inform Assoc 6:143-50
Hripcsak, G; Kuperman, G J; Friedman, C (1998) Extracting findings from narrative reports: software transferability and sources of physician disagreement. Methods Inf Med 37:1-7
Friedman, C; Hripcsak, G (1998) Evaluating natural language processors in the clinical domain. Methods Inf Med 37:334-44
Knirsch, C A; Jain, N L; Pablos-Mendez, A et al. (1998) Respiratory isolation of tuberculosis patients using clinical guidelines and an automated clinical decision support system. Infect Control Hosp Epidemiol 19:94-100
Jain, N L; Friedman, C (1997) Identification of findings suspicious for breast cancer based on natural language processing of mammogram reports. Proc AMIA Annu Fall Symp :829-33
Friedman, C (1997) Towards a comprehensive medical language processing system: methods and issues. Proc AMIA Annu Fall Symp :595-9
Jain, N L; Knirsch, C A; Friedman, C et al. (1996) Identification of suspected tuberculosis patients based on natural language processing of chest radiograph reports. Proc AMIA Annu Fall Symp :542-6
Hripcsak, G; Friedman, C; Alderson, P O et al. (1995) Unlocking clinical data from narrative reports: a study of natural language processing. Ann Intern Med 122:681-8

Showing the most recent 10 out of 15 publications