The overarching long-term vision of our research is to create novel technologies for processing clinical free text. We will build upon our ongoing project, Temporal relation discovery for clinical text (R01LM010090), dubbed Temporal Histories of Your Medical Events (THYME), which has focused on methodology for discovering events, temporal expressions, and temporal relations in the clinical text residing in Electronic Health Records (EHR). We developed a comprehensive approach to temporality in clinical text and innovated in computable temporal representations, in methods for temporal relation discovery and their evaluation, and in rendering temporality to end users, resulting in more than 35 papers and presentations. Our dissemination is international and far-reaching, as the best performing methods are released open source as part of the Apache Clinical Text Analysis and Knowledge Extraction System (cTAKES). The methods we developed are now being used in such nationwide initiatives as the Electronic Medical Records and Genomics (eMERGE) network, the Pharmacogenomics Research Network (PGRN), Informatics for Integrating Biology and the Bedside (i2b2), the Patient-Centered Outcomes Research Institute, and the National Cancer Institute's Informatics Technology for Cancer Research (ITCR). Through our participation in organizing major international bakeoffs (CLEF/ShARe 2014, SemEval 2014 Analysis of Clinical Text Task 7, SemEval 2015 Analysis of Clinical Text Task 14, and SemEval 2015 Clinical TempEval Task 6), we further disseminated the THYME resources and challenged the international research community to explore new solutions to the unsolved temporality task. Through all these activities it became clear that computational approaches to temporality still present great challenges and that the usability of their output is still limited. Therefore, we propose to further innovate on methodologies and end user experience.
Specific Aim 1 : Extract enhanced representations and novel features to support deriving timeline information.
Specific Aim 2 : Develop methods to amalgamate individual patient episode timelines into an aggregate patient-level timeline.
Specific Aim 3 : Mine the EHR - the unstructured clinical text and the structured codified information - for full patient-level temporality.
Specific Aim 4 : Develop a comprehensive temporal visualization tool.
Specific Aim 5 : Develop methodology for and perform extrinsic evaluation on a specific use case.
Specific Aim 6 : (1) Evaluate the state of the art in temporal relation discovery by organizing international challenges under the auspices of SemEval; (2) disseminate the results through publications, presentations, and open source code in Apache cTAKES. Functional testing.

Public Health Relevance

Temporal relations are of prime importance in biomedicine as they are intrinsically linked to diseases, signs and symptoms, and treatments. Understanding the timeline of clinically relevant events is key to the next generation of translational research where the importance of generalizing over large amounts of data holds the promise of deciphering biomedical puzzles. The goal of our current proposal is to automatically discover temporal relations from clinical free text and structured EHR data and create an aggregated patient-level timeline.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1)
Program Officer: Sim, Hua-Chuan
Children's Hospital Boston
United States
Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra et al. (2018) Clinical Natural Language Processing in languages other than English: opportunities and challenges. J Biomed Semantics 9:12
Gonzalez-Hernandez, G; Sarker, A; O'Connor, K et al. (2017) Capturing the Patient's Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. Yearb Med Inform 26:214-227
Miller, Timothy; Dligach, Dmitriy; Bethard, Steven et al. (2017) Towards generalizable entity-centric clinical coreference resolution. J Biomed Inform 69:251-258
Savova, Guergana K; Tseytlin, Eugene; Finan, Sean et al. (2017) DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Cancer Res 77:e115-e118
Lin, Chen; Dligach, Dmitriy; Miller, Timothy A et al. (2016) Multilayered temporal modeling for the clinical domain. J Am Med Inform Assoc 23:387-95
Lin, Chen; Karlson, Elizabeth W; Dligach, Dmitriy et al. (2015) Automatic identification of methotrexate-induced liver toxicity in patients with rheumatoid arthritis from the electronic medical record. J Am Med Inform Assoc 22:e151-61
Pradhan, Sameer; Elhadad, Noémie; South, Brett R et al. (2015) Evaluating the state of the art in disorder recognition and normalization of the clinical narrative. J Am Med Inform Assoc 22:143-54
Luo, Xiaoqiang; Pradhan, Sameer; Recasens, Marta et al. (2014) An Extension of BLANC to System Mentions. Proc Conf Assoc Comput Linguist Meet 2014:24-29
Styler 4th, William F; Bethard, Steven; Finan, Sean et al. (2014) Temporal Annotation in the Clinical Domain. Trans Assoc Comput Linguist 2:143-154
Pfiffner, Pascal B; Oh, JiWon; Miller, Timothy A et al. (2014) ClinicalTrials.gov as a data source for semi-automated point-of-care trial eligibility screening. PLoS One 9:e111055

Showing the most recent 10 out of 18 publications