The applicant's long-term career goal is to become an independent investigator in biomedical informatics research. She will dedicate her career to developing and evaluating methodologies for improving the processes and outcomes of healthcare using data locked in textual documents. This career development award will provide her with initial support for achieving her career goals. The applicant has three goals for her career development over the next three years. First, she will compare the performance of different machine learning techniques at detecting patients with a respiratory syndrome. The proposed research will expand the state-of-the-art syndromic surveillance capabilities by integrating findings, symptoms, and diseases described in textual medical records. The product of the first research goal will be a model for respiratory syndromic case detection for monitoring natural and bioterrorism induced respiratory outbreaks. Second, she will apply existing methods and develop new techniques for extracting clinical conditions required for respiratory case detection from emergency department notes, contributing new knowledge to the medical language processing field using sentence and report level models that account for uncertainty, negation, and temporal occurrence. The product of the second goal will be a better understanding of the information required for accurate detection of respiratory related conditions from text and useful tools for automatically extracting that information. Third, she will teach, promote, and facilitate the use of natural language processing in the biomedical informatics field. The product of the third goal will be a graduate class surveying medical language processing methodology and applications and development of general tools sets for researchers who need encoded data from textual patient records. The proposed research will focus on:
Aim 1. Development and evaluation of a respiratory case detection model;
Aim 2. Integration of existing natural language processing tools and development of new methodologies for extracting clinical conditions needed for respiratory case detection from textual records;
and Aim 3. Comparison of existing syndromic detection algorithms that use admit data against the same algorithms using conditions extracted from textual reports.