Losing the capacity to communicate through language has a significant negative impact on a person's autonomy, social interactions, occupation, mental health, and overall quality of life. Many people lose the capacity to speak and write while their thinking remains intact. Inner speech is internally and willfully generated, non-articulated verbal thought (e.g., reading in silence). Changes in the activation patterns of the brain's language-related areas co-occur with inner speech and can be detected with electroencephalography (EEG). Furthermore, although inner speech produces no audible sound or overt articulation, the low-amplitude electrical discharges that co-occur in the articulatory muscles can be detected with electromyography (EMG). The information about ongoing inner speech reflected in these electrophysiological signals (EEG and EMG) can be used to transcribe inner speech into text or voice. Machine learning algorithms have been used for this purpose; however, the resulting systems have low accuracy and/or are constrained to very small vocabularies (~10 words). Furthermore, these systems must be trained anew for each user, which significantly increases individual data-collection time. The development of ready-to-use/minimal-training (fine-tuning) systems requires large training datasets from which algorithms can learn high-level features that transfer between individuals. Unfortunately, no datasets large enough to train such systems are currently available. To tackle these issues, I have assembled a multidisciplinary team of collaborators from Google AI, Yale Linguistics, and Yale Psychiatry to develop a state-of-the-art deep neural network that transcribes inner speech to text from EEG and EMG signals. This system will incorporate some of the latest advances in artificial intelligence and data processing developed by Google AI. It will be designed to transcribe phonemes and thus, in principle, will be able to transcribe any word. Furthermore, we will collect the largest multi-subject (n = 150) electrophysiological (EEG+EMG) inner speech dataset to date (300 hours in total, roughly 120 times larger than existing datasets) to train the first ready-to-use/minimal-training inner speech transcription system. The technology resulting from this study has the potential to radically improve the quality of life of thousands of patients by providing them with a fast method of communicating their verbal thoughts. Furthermore, combined with one of the many text-to-speech systems currently available, it could potentially restore patients' capacity to produce conversational speech.
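To make the proposed pipeline concrete, the sketch below illustrates one plausible way to map concatenated EEG+EMG recordings to phoneme sequences using a connectionist temporal classification (CTC) objective, a common choice when signal frames are not aligned to phoneme labels frame by frame. This is a minimal illustration only: the channel counts, phoneme inventory size, layer sizes, and the PyTorch implementation are assumptions made for exposition, not the architecture the project will actually develop with Google AI.

```python
# Minimal sketch (illustrative only): an EEG+EMG -> phoneme transcriber trained
# with a CTC objective, so the output phoneme sequence need not match the
# number of signal frames. Channel counts, phoneme inventory, and layer sizes
# are hypothetical assumptions, not the project's actual design.
import torch
import torch.nn as nn

N_EEG_CHANNELS = 64    # assumed EEG montage size
N_EMG_CHANNELS = 8     # assumed number of articulatory EMG electrodes
N_PHONEMES = 40        # assumed phoneme inventory (+1 below for the CTC blank)

class InnerSpeechTranscriber(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        in_channels = N_EEG_CHANNELS + N_EMG_CHANNELS
        # Temporal convolutions extract local features and downsample in time.
        self.frontend = nn.Sequential(
            nn.Conv1d(in_channels, hidden, kernel_size=11, stride=4, padding=5),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        # A bidirectional recurrent encoder models longer-range context.
        self.encoder = nn.GRU(hidden, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        # Per-frame phoneme logits, including the CTC blank symbol.
        self.head = nn.Linear(2 * hidden, N_PHONEMES + 1)

    def forward(self, signals):
        # signals: (batch, channels, time) concatenated EEG+EMG samples
        feats = self.frontend(signals)              # (batch, hidden, time')
        feats, _ = self.encoder(feats.transpose(1, 2))
        return self.head(feats)                     # (batch, time', phonemes+1)

# Training step: CTC aligns variable-length phoneme targets to the
# downsampled signal frames without frame-level labels.
model = InnerSpeechTranscriber()
ctc = nn.CTCLoss(blank=N_PHONEMES, zero_infinity=True)

signals = torch.randn(2, N_EEG_CHANNELS + N_EMG_CHANNELS, 2048)  # dummy batch
targets = torch.randint(0, N_PHONEMES, (2, 12))                  # phoneme ids
logits = model(signals)                                          # (B, T', C)
log_probs = logits.log_softmax(-1).transpose(0, 1)               # (T', B, C)
input_lens = torch.full((2,), logits.size(1), dtype=torch.long)
target_lens = torch.full((2,), targets.size(1), dtype=torch.long)
loss = ctc(log_probs, targets, input_lens, target_lens)
loss.backward()
```

At inference time, the per-frame phoneme posteriors would be decoded into text, for example with greedy or beam-search decoding, optionally combined with a language model; any such decoding scheme is likewise an assumption here rather than part of the proposal.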

Public Health Relevance

People who have lost their capacity for verbal communication struggle with isolation, mental illness, and poor quality of life. Artificial intelligence offers an opportunity to translate verbal thoughts into text or synthesized voice and restore verbal communication in affected people. In this study, we introduce a state-of-the-art artificial intelligence system designed to transduce the electrophysiological activity (electroencephalography [EEG] and electromyography [EMG]) accompanying verbal thoughts into text.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21EB029607-01A1
Application #
10058047
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Shabestari, Behrouz
Project Start
2020-09-15
Project End
2023-09-14
Budget Start
2020-09-15
Budget End
2023-09-14
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Yale University
Department
Psychiatry
Type
Schools of Medicine
DUNS #
043207562
City
New Haven
State
CT
Country
United States
Zip Code
06520