The broader impact of this Small Business Innovation Research (SBIR) Phase I project will result from improving the quality of healthcare and streamlining its delivery. Accumulated clinical data have become a potentially valuable resource for clinical practice, as Electronic Medical Records (EMRs) contain information on day-to-day patient care. The latest Natural Language Processing (NLP) techniques, applied to EMR data, enable the development of health Intelligent Virtual Assistants (hIVAs) that help healthcare professionals incorporate evidence-based decision support, reduce errors, and improve efficiency. The most promising current NLP approaches remain underdeveloped for the clinical domain because of the lack of high-quality annotated clinical data required for training, testing, and validating machine learning algorithms. Because most EMR data is available as unstructured free text, Artificial Intelligence (AI) software developers struggle to find such annotated texts. The proposed project will inform the production of high-quality hIVAs, ranging from voice-based clinical AI chatbots that assist physicians at the point of care to Question-Answering systems for clinical decision-making.

This Small Business Innovation Research (SBIR) Phase I project addresses the technical challenge of exploiting different combinations of Deep Learning (DL) structures to develop a novel set of annotation tools and an expert adjudication methodology that optimize the development of annotated corpora specifically tailored to the clinical domain. The lack of such standard, annotated data sets is a major bottleneck preventing progress in clinical Information Extraction. Without these corpora, individual Natural Language Processing applications proliferate without the ability to train different algorithms, share and integrate modules, or compare performance. The company is leveraging the latest DL techniques to develop a unique architecture able to identify a comprehensive set of context modifiers within unstructured clinical texts. This approach will boost the semi-automatic annotation of clinical corpora, produce accurate and robust annotated corpora, and reduce corpora production time and cost. The project objectives are: (1) adapting the existing in-house algorithm for automatic clinical text pre-annotation; (2) integrating a hybrid algorithm into a multi-user software platform to obtain a minimum viable semi-automatic annotation product; and (3) conducting a small pilot study to validate the performance of the resulting software platform and a Minimum Viable Product of an annotated corpus for diagnostic imaging reports.
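
To illustrate what pre-annotating context modifiers can look like, the following is a minimal sketch only. It uses a simple rule-based trigger lookup as a stand-in, not the Deep Learning architecture or the in-house algorithm described above, whose details are not given here. The trigger lists, the `PreAnnotation` record, the `pre_annotate` function, and the sample report sentence are all hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical trigger phrases (assumption): a real system would use a much
# larger curated or learned inventory of context modifiers.
NEGATION_TRIGGERS = ("no evidence of", "no", "without", "negative for")
UNCERTAINTY_TRIGGERS = ("possible", "probable", "cannot exclude", "suspicious for")


@dataclass
class PreAnnotation:
    concept: str   # the concept mention that was found
    start: int     # character offset where the mention starts
    end: int       # character offset just past the mention
    modifier: str  # "negated", "uncertain", or "affirmed"


def _has_trigger(text, triggers):
    """Return True if any trigger phrase occurs in text as whole words."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in triggers)


def pre_annotate(text, concepts, window=40):
    """Propose a context modifier for each concept mention.

    Only the text to the left of the mention, within the same sentence and
    at most `window` characters long, is searched for trigger phrases.
    """
    lowered = text.lower()
    annotations = []
    for concept in concepts:
        pattern = r"\b" + re.escape(concept.lower()) + r"\b"
        for match in re.finditer(pattern, lowered):
            left = lowered[:match.start()]
            left = left[left.rfind(".") + 1:]  # stay inside the current sentence
            left = left[-window:]
            if _has_trigger(left, NEGATION_TRIGGERS):
                modifier = "negated"
            elif _has_trigger(left, UNCERTAINTY_TRIGGERS):
                modifier = "uncertain"
            else:
                modifier = "affirmed"
            annotations.append(
                PreAnnotation(concept, match.start(), match.end(), modifier)
            )
    return sorted(annotations, key=lambda a: a.start)


if __name__ == "__main__":
    # Invented sample sentence in the style of a diagnostic imaging report.
    report = "No evidence of pneumothorax. Possible small pleural effusion."
    for ann in pre_annotate(report, ["pneumothorax", "pleural effusion"]):
        print(ann)
```

In a semi-automatic workflow of the kind the abstract describes, proposals like these would be shown to human annotators to accept or correct rather than treated as final labels.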

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-06-01
Budget End: 2021-08-31
Fiscal Year: 2020
Total Cost: $224,961
Name: In Context Reporting Inc
City: Houston
State: TX
Country: United States
Zip Code: 77005