This Small Business Innovation Research (SBIR) Phase I project will use statistical analysis of historical medical records to create families of language models, one for each section of the traditional medical note, and will switch lexicons in and out of the automatic speech recognition (ASR) engine in real time based on the contextual position within the narrative note. Current speech recognition methods train a recognizer on a single, general-purpose medical lexicon, ignoring medical context-specific probabilities. Because the DocTalk engine integrates real-time ASR with natural language processing (NLP), there is an opportunity to use NLP contextual data to modify the ongoing ASR process. This innovative text-structuring method will exploit the statistical variability of the language used in each section of the medical record. It is a unique opportunity to address workflow delay, the largest barrier to a national electronic healthcare infrastructure, with a cloud-based solution that leverages open-source components.
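
As a rough illustration of the proposed mechanism, the Python sketch below shows how a recognizer might swap in a section-specific statistical language model whenever NLP-derived context indicates a new section of the note. The section names, the detect_section callback, and the recognizer interface are illustrative assumptions, not the DocTalk implementation.

    # Minimal sketch (not the DocTalk engine): swap the active language model
    # as the NLP layer detects the narrator's position within the H&P note.
    # All interfaces here are hypothetical placeholders.

    H_AND_P_SECTIONS = [
        "chief_complaint",
        "history_of_present_illness",
        "past_medical_history",
        "medications",
        "physical_exam",
        "assessment_and_plan",
    ]

    class ContextSensitiveASR:
        def __init__(self, recognizer, section_lms, general_lm):
            self.recognizer = recognizer    # underlying ASR engine (placeholder)
            self.section_lms = section_lms  # dict: section name -> language model
            self.general_lm = general_lm    # fallback general medical lexicon

        def transcribe(self, audio_frames, detect_section):
            """Decode audio, switching the active language model whenever the
            NLP context detector reports a new note section."""
            transcript, current_section = [], None
            for frame in audio_frames:
                section = detect_section(transcript)  # NLP context from text so far
                if section != current_section:
                    lm = self.section_lms.get(section, self.general_lm)
                    self.recognizer.set_language_model(lm)  # swap lexicon in real time
                    current_section = section
                transcript.append(self.recognizer.decode(frame))
            return " ".join(transcript)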

The broader impact/commercial potential of this project includes the ability of physicians to capture more usable information, avoid third-party transcription errors, and mitigate workflow delays. The majority of workflow delay in electronic medical records (EMR) comes from the manual operations required to fill structured forms within the record, as opposed to the simple unstructured narratives used in traditional written notes and transcriptions. Successful completion of this innovative proposed program of NLP-enhanced, context-based ASR will provide the accuracy required to deploy an integrated, interactive, intuitive, low-cost data entry system for small-practice primary care physicians and help overcome the largest obstacle to a national electronic healthcare infrastructure.

Project Report

Background

Healthcare quality improvement requires high-quality input data to underlie any intervention. The adoption of electronic health records (EHR) has encouraged healthcare organizations (HCO) to collect data at the point of care, within minutes of the patient encounter. Current methods of capturing data using dropdowns and checkboxes have improved discrete data capture but have degraded the breadth and quality of information about the patient and the medical encounter. Automatic speech recognition (ASR) offers broad content, but adoption has been limited by mixed accuracy and reliability in the healthcare environment. ASR alone has also failed to make documentation at the point of care machine consumable and sufficiently robust to address national quality initiatives, including ICD-10 billing codes, meaningful use quality measures, and accountable care.

Approach

The Phase I effort focused exclusively on the feasibility of improving accuracy to assure adequately robust input content, as downstream processing is highly dependent on input quality. ASR typically leverages only very localized context to improve accuracy. This Phase I project tested an ASR system that incorporates natural language processing (NLP) to add medical context-specific recognition, applying a refined statistical language model (SLM) to the recognition hypotheses of the ASR.

Objectives

The purpose of this Phase I SBIR project was to prove the concept of high-quality, context-based data capture through the combination of context recognition and speech recognition. Specifically, the goal was to increase the accuracy of ASR for History and Physical (H&P) notes by at least 5% through context recognition. To achieve this goal, the following project objectives were established:
- Build tools to allow voice-based capture and processing of information, leveraging NLP-determined context within H&P notes.
- Using these tools, process H&P notes and compare the total word error rate between traditional ASR (tASR) and context-sensitive ASR (csASR).

Methods

Standard transcription materials in the form of H&P notes were created to act as gold-standard canonical documents. These documents were based on real clinical encounter notes that had been de-identified and edited to abstract details. A software tool was developed to allow study subjects to train the ASR to their voices and then dictate the canonical study notes. Thirteen physicians with previous dictation experience were recruited as study subjects. An automated study system then captured the following data sets from each subject:
- Audio training recordings
- Dictations of the 10 H&P notes
These recordings were captured and processed using various software components. Content from three physicians was used for system testing and improvement; content from the remaining 10 physicians was analyzed through the system to compare tASR and csASR word error rate (WER) in each section of the H&P note. Content from the testing set was not included in the final results.

Summary and Conclusions

The csASR developed in this Phase I program achieved a mean accuracy of 78.43%, compared with 57.69% for the tASR (p < 0.0001). This absolute improvement of roughly 21 percentage points meets the goal of improving performance by at least 5%.
The csASR showed improvements over the tASR in most categories of the H&P note, with particular gains in the most language-constrained sections. Because this was a proof of concept, no effort was made to tune the engine in either the tASR or the csASR approach; accuracy in both approaches is therefore lower than in commercial healthcare ASR applications. The pathway to tune the system is known, but tuning and hardening were not demonstrated in Phase I. Further development and refinement in the Phase II program will tune the system and is expected to increase overall accuracy. The observed performance improvement, if achieved within a commercial system, will support robust, accurate content capture to feed the feedback loops required for real-time documentation improvement. The ultimate goal is an integrated, interactive, intuitive, low-cost data entry, extraction, and processing system that provides the content needed for data-driven healthcare, a national effort to improve the quality and reduce the costs of care.
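
The sketch below illustrates the kind of per-section word error rate comparison described in the Methods section, using a standard word-level Levenshtein alignment against the canonical notes. The data structures and function names are assumptions for illustration only and do not reflect the actual study tooling.

    # Minimal sketch of the evaluation step: WER of the traditional ASR (tASR)
    # and the context-sensitive ASR (csASR) against the gold-standard H&P notes,
    # computed per note section. Inputs are plain strings keyed by section name.

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """WER = (substitutions + insertions + deletions) / reference length,
        computed with a word-level Levenshtein alignment."""
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                               dp[i][j - 1] + 1,         # insertion
                               dp[i - 1][j - 1] + cost)  # substitution
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    def compare_by_section(gold_sections, tasr_sections, csasr_sections):
        """Return {section: (tASR WER, csASR WER)} for one dictated note."""
        return {
            name: (word_error_rate(gold, tasr_sections[name]),
                   word_error_rate(gold, csasr_sections[name]))
            for name, gold in gold_sections.items()
        }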

Budget Start: 2012-01-01
Budget End: 2012-12-31
Fiscal Year: 2011
Total Cost: $150,000
Name: Vmt, Inc.
City: Newark
State: DE
Country: United States
Zip Code: 19711