Cancer Deep Phenotype Extraction from Electronic Medical Records

Savova, Guergana

Abstract

Precise phenotype information is needed to advance translational cancer research, particularly to unravel the effects of genetic, epigenetic, and other factors on tumor behavior and responsiveness. Examples of phenotypic variables in cancer include: tumor morphology (e.g. histopathologic diagnosis), co-morbid conditions (e.g. associated immune disease), laboratory findings (e.g. gene amplification status), specific tumor behaviors (e.g. metastasis) and response to treatment (e.g. effect of a chemotherapeutic agent on tumor). Current models for correlating EMR data with -omics data largely ignore the clinical text, which remains one of the most important sources of phenotype information for cancer patients. Unlocking the value of clinical text has the potential to enable new insights about cancer initiation, progression, metastasis, and response to treatment. We propose further collaboration of two mature informatics groups with long histories of developing open-source natural language processing (NLP) software (Apache cTAKES, caTIES and ODIE) to extend existing software with new methods for cancer deep phenotyping. Several aims propose investigation of biomedical information extraction where there has been little or no previous work (e.g. clinical genomic entities and causal discourse). Visualization of extracted data, usability of the software, and dissemination are also emphasized. Three driving oncology projects led by accomplished translational investigators in Breast Cancer, Melanoma, and Ovarian Cancer will drive development of the software. These labs will contribute phenotype variables for extraction, test utility and usability of the software, and provide the setting for an extrinsic evaluation. The proposed research bridges novel methods to automate cancer deep phenotype extraction from clinical text with emerging standards in phenotype knowledge representation and NLP. This work is highly aligned with recent calls in the scientific literature to advance scalable and robust methods of extracting and representing phenotypes for precision medicine and translational research.

Public Health Relevance

We propose research to enhance the ability of researchers to use data from unstructured medical records in their translational cancer research programs. The proposed software platform has the potential to improve public health by contributing new methods for advancing cancer research.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 5U24CA184407-06
Application #: 9477487
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Mariotto, Angela B

Project Start: 2014-05-06
Project End: 2019-04-30
Budget Start: 2018-05-01
Budget End: 2019-04-30
Support Year: 6
Fiscal Year: 2018

Institution

Name: Boston Children's Hospital
DUNS #: 076593722

City: Boston
State: MA
Country: United States

Related projects


NIH 2018 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Savova, Guergana K. / Boston Children's Hospital
NIH 2017 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Jacobson, Rebecca S.; Savova, Guergana K. / University of Pittsburgh
NIH 2017 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Savova, Guergana K. / Boston Children's Hospital
NIH 2016 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Jacobson, Rebecca S.; Savova, Guergana K. / University of Pittsburgh
NIH 2015 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Jacobson, Rebecca S.; Savova, Guergana K. / University of Pittsburgh
NIH 2014 U24 CA	Cancer Deep Phenotype Extraction from Electronic Medical Records Crowley, Rebecca S.; Savova, Guergana K. / University of Pittsburgh	$696,805

Publications

Malty, Andrew M; Jain, Sandeep K; Yang, Peter C et al. (2018) Computerized Approach to Creating a Systematic Ontology of Hematology/Oncology Regimens. JCO Clin Cancer Inform 2018:

Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra et al. (2018) Clinical Natural Language Processing in languages other than English: opportunities and challenges. J Biomed Semantics 9:12

Gonzalez-Hernandez, G; Sarker, A; O'Connor, K et al. (2017) Capturing the Patient's Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. Yearb Med Inform 26:214-227

Castro, Sergio M; Tseytlin, Eugene; Medvedeva, Olga et al. (2017) Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform 69:177-187

Miller, Timothy; Dligach, Dmitriy; Bethard, Steven et al. (2017) Towards generalizable entity-centric clinical coreference resolution. J Biomed Inform 69:251-258

Savova, Guergana K; Tseytlin, Eugene; Finan, Sean et al. (2017) DeepPhe: A Natural Language Processing System for Extracting Cancer Phenotypes from Clinical Records. Cancer Res 77:e115-e118

Hochheiser, Harry; Castine, Melissa; Harris, David et al. (2016) An information model for computable cancer phenotypes. BMC Med Inform Decis Mak 16:121

Lin, Chen; Dligach, Dmitriy; Miller, Timothy A et al. (2016) Multilayered temporal modeling for the clinical domain. J Am Med Inform Assoc 23:387-95

Dligach, Dmitriy; Miller, Timothy; Savova, Guergana K (2015) Semi-supervised Learning for Phenotyping Tasks. AMIA Annu Symp Proc 2015:502-11
