This proposal continues the investigation of temporal relation extraction from the clinical narrative of Electronic Medical Records (EMR), funded by the NLM since 2010 (Temporal Histories of Your Medical Events, or THYME). Through our efforts so far, we have established the topic as an active area of research that attracts attention worldwide. Since its inception, the project has pushed the boundaries of this highly challenging task by investigating new computational methods in the context of the latest developments in natural language processing (NLP), machine learning (ML), artificial intelligence (AI), and biomedical informatics (BMI), resulting in 60+ publications and presentations. We have made our best performing methods available to the community as open source, as part of the Apache Clinical Text Analysis and Knowledge Extraction System (cTAKES). In 2015, 2016, 2017, and 2018, we organized an international shared task on the topic (Clinical TempEval) under the umbrella of the highly prestigious SemEval, inviting the international community to work with our THYME data and improve on our results. Clinical TempEval has been highly successful, with many participants each year, resulting in new discoveries and many publications. We have made all our data, along with our gold standard annotations, available to the community through the hNLP Center. The underlying theme of this renewal is novel methods for combining explicit domain knowledge (linguistic, semantic, biomedical ontological, clinical), readily available unlabeled data (health-related social media, EMRs), and modern machine learning techniques (e.g., neural networks) for temporal relation extraction from the EMR clinical narrative. Therefore, our renewal proposes a novel and much needed exploration of this line of research:
Specific Aim 1: Develop computational models for novel rich semantic representations, such as the Abstract Meaning Representations, to encapsulate a single, coherent, full-document graphical representation of meaning for temporal relation extraction.
Specific Aim 2: Develop computational methods to infuse domain knowledge (linguistic, semantic, biomedical ontological, clinical) into modern machine learning techniques such as NNs for temporal relation extraction, through input representations, pre-trained vectors, or architectures.
Specific Aim 3: Develop novel methods for combining labeled and unlabeled data from various sources (EMR, health-related social media, newswire) for temporal relation extraction from the clinical narrative.
Specific Aim 4: Apply the best performing methods for temporal relation extraction developed in SA1-3 to temporally sensitive phenotypes for direct translational sciences studies.
Dissemination efforts will proceed through publications and open source releases into Apache cTAKES.

Public Health Relevance

Temporal relations are of prime importance in biomedicine, as they are intrinsically linked to diseases, signs and symptoms, and treatments. Understanding the timeline of clinically relevant events is key to the next generation of translational research, where generalizing over large amounts of data holds the promise of deciphering biomedical puzzles. The goal of our current proposal is to automatically discover temporal relations from clinical free text and structured EMR data and to create an aggregated patient-level timeline.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1)
Program Officer: Sim, Hua-Chuan
Boston Children's Hospital
United States
Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra et al. (2018) Clinical Natural Language Processing in languages other than English: opportunities and challenges. J Biomed Semantics 9:12
Gonzalez-Hernandez, G; Sarker, A; O'Connor, K et al. (2017) Capturing the Patient's Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. Yearb Med Inform 26:214-227
Miller, Timothy; Dligach, Dmitriy; Bethard, Steven et al. (2017) Towards generalizable entity-centric clinical coreference resolution. J Biomed Inform 69:251-258
Savova, Guergana K; Tseytlin, Eugene; Finan, Sean et al. (2017) DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Cancer Res 77:e115-e118
Lin, Chen; Dligach, Dmitriy; Miller, Timothy A et al. (2016) Multilayered temporal modeling for the clinical domain. J Am Med Inform Assoc 23:387-95
Lin, Chen; Karlson, Elizabeth W; Dligach, Dmitriy et al. (2015) Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. J Am Med Inform Assoc 22:e151-61
Pradhan, Sameer; Elhadad, Noémie; South, Brett R et al. (2015) Evaluating the state of the art in disorder recognition and normalization of the clinical narrative. J Am Med Inform Assoc 22:143-54
Luo, Xiaoqiang; Pradhan, Sameer; Recasens, Marta et al. (2014) An Extension of BLANC to System Mentions. Proc Conf Assoc Comput Linguist Meet 2014:24-29
Styler 4th, William F; Bethard, Steven; Finan, Sean et al. (2014) Temporal Annotation in the Clinical Domain. Trans Assoc Comput Linguist 2:143-154
Pfiffner, Pascal B; Oh, JiWon; Miller, Timothy A et al. (2014) as a data source for semi-automated point-of-care trial eligibility screening. PLoS One 9:e111055

Showing the most recent 10 out of 18 publications