This proposal concerns the development and evaluation of computational methods for identifying linguistic manifestations of cognitive change in Alzheimer's disease (AD) dementia from transcribed speech. Such methods are valuable even before disease-modifying treatments become available, because earlier detection can reduce the emotional and financial burden on patients, caregivers, and the healthcare system. Without a clear diagnosis, the cognitive manifestations of dementia can produce uncertainty and hinder planning for future care. Misattributed AD symptoms can lead to social isolation. In addition, it is estimated that early and accurate diagnosis could save $7.9 trillion in medical and care costs. With approximately 30-40% of healthy adults subjectively reporting forgetfulness on a regular basis, there is an urgent need for sensitive, specific, easy-to-use, safe, and cost-effective tools for monitoring AD-specific cognitive markers in individuals concerned about their cognitive function. Language reflects cognitive status, but manual analysis of language data is prohibitively time-consuming. In the proposed research we will develop and evaluate computational methods to identify linguistic biomarkers of AD, using perplexity estimates derived from neural language models trained only on transcripts of the speech of healthy controls. This approach departs from the supervised learning paradigm that characterizes most computational linguistics approaches to identifying AD, and so avoids the danger of overfitting to the characteristics of the participants with dementia represented in the small datasets available for training. Nonetheless, our preliminary research has demonstrated that classification performance based on such perplexity estimates rivals that reported for supervised machine learning models trained on hundreds of manually engineered features.
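To illustrate the idea of a perplexity estimate from a model trained on control speech only, the following is a minimal sketch and not the proposed method. It substitutes a bigram model with add-one smoothing for the neural language models described above, so that the example stays self-contained. The function names and the toy transcripts are hypothetical and introduced here for illustration only.

```python
import math
from collections import Counter


def train_bigram(transcripts):
    """Count unigrams and bigrams over whitespace-tokenized transcripts.

    Assumption: this simple count-based model stands in for a neural
    language model trained on healthy-control transcripts.
    """
    unigrams, bigrams = Counter(), Counter()
    for text in transcripts:
        tokens = ["<s>"] + text.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(text, unigrams, bigrams):
    """Return the per-token perplexity of `text` under the bigram model."""
    tokens = ["<s>"] + text.lower().split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # +1 reserves probability mass for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams at non-zero probability.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n_predictions = len(tokens) - 1
    return math.exp(-log_prob / n_predictions)


# Toy data, invented for illustration; not drawn from any study dataset.
controls = [
    "the boy is taking a cookie from the jar",
    "the mother is washing the dishes at the sink",
]
unigrams, bigrams = train_bigram(controls)
typical = perplexity("the boy is washing the dishes", unigrams, bigrams)
atypical = perplexity("uh the the thing cookie thing there", unigrams, bigrams)
print(f"typical: {typical:.2f}  atypical: {atypical:.2f}")
```

Under this sketch, a transcript that resembles the control training data receives lower perplexity than one that deviates from it, so perplexity can act as an unsupervised score that needs no examples of speech from participants with dementia.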
The proposed research will produce a validated set of methods for detecting AD from transcribed speech. Recent advances in automatic speech recognition give these methods the potential for broad dissemination.
The need to monitor unintended effects of medications has been highlighted by several high-profile events in which fatal side effects of approved drugs were detected after their release to market. In the proposed research, we will develop and evaluate methods to identify biologically plausible adverse drug events using both observational data and knowledge extracted from the biomedical literature. If successful, these methods will provide the means for earlier detection of harmful drug effects, limiting consequent morbidity and mortality.