The purpose of this proposal is to develop two strategies, natural language processing (NLP) and automated speech analysis (ASA), to enable automated identification of patients with cognitive impairment (CI), from mild cognitive impairment (MCI) to Alzheimer?s Disease Related Dementias (ADRD) in clinical settings. The number of older adults in the United States with MCI and ADRD is increasing and yet the ability of clinicians and researchers to identify them at scale has advanced little over recent decades and screening with clinical assessments is done inconsistently. Alternative strategies using available data, like analysis of diagnostic codes in the clinical record or insurance claims, have very low sensitivity. NLP and ASA used with machine learning are technologies that could greatly increase ability to detect MCI and ADRD in clinical contexts. NLP automatically converts text in the electronic health record (EHR) into structured concepts suitable for analysis. Thus, clinicians? documentation of signs and symptoms or orders of tests and services that reflect or address cognitive limitations can be efficiently captured, possibly long before the clinician uses an ADRD-related diagnostic code. ASA directly measures cognition by recognizing different features of cognition captured in speech. Extracting features through both NLP and ASA could thus provide a unique measure of cognition and its impact on the individual and their caregivers. Early detection of MCI and ADRD can help researchers identify appropriate patients for research and help clinicians and health systems target patients for preventive care and care coordination. For these reasons, more efficient, highly scalable strategies are needed to identify people with MCI and ADRD.
The Specific Aims of this proposal are to (1) Develop and validate a ML algorithm using features extracted from the EHR with NLP to identify patients with CI, (2) Develop and validate a ML algorithm using features extracted from ASA of audio recordings of patient-provider encounters during routine primary care visits to identify patients with CI, (3) Develop and validate a ML algorithm using both NLP and ASA extracted features to create an integrated CI diagnostic algorithm. We will develop machine learning algorithms using NLP and ASA extracted features trained against neurocognitive assessment data on 800 primary care patients in New York City and validate them in an independent sample of 200 patients in Chicago. In secondary analyses we will train ML algorithms to identify MCI and its subtypes. This project will be the most rigorous development of NLP, ASA, and ML algorithms for CI yet performed, the first to test ASA in primary care settings, and the first to test NLP and ASA feature extraction strategies in combination. The multi-disciplinary team of clinicians, health services researchers, and neurocognitive and data scientists will apply machine learning to develop these highly scalable, automated technologies for identification of MCI and ADRD. 1

Public Health Relevance

The ability of clinicians, health systems and researchers to identify patients with mild cognitive impairment (MCI) and Alzheimer?s Disease Related Dementias (ADRD) is limited. This project will apply machine learning to natural language processing (NLP) of electronic health record data and automated speech analysis (ASA) of patient-doctor conversations during primary care visits to identify patients with MCI and ADRD using automated and scalable procedures. The analytic algorithms will be developed with neurocognitive assessment data on 800 primary care patients in New York City and validated in an independent sample of 200 patients in Chicago. 1

Agency
National Institute of Health (NIH)
Institute
National Institute on Aging (NIA)
Type
Research Project (R01)
Project #
1R01AG066471-01A1
Application #
9998610
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Wagster, Molly V
Project Start
2020-04-15
Project End
2025-03-31
Budget Start
2020-04-15
Budget End
2021-03-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Icahn School of Medicine at Mount Sinai
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
078861598
City
New York
State
NY
Country
United States
Zip Code
10029