Deep phenotyping in Electronic Health Records for Genomic Medicine

Weng, Chunhua; Wang, Kai

Abstract

The overarching goal of the project is to establish a genomic medicine learning system to accelerate genomic knowledge discovery and application in electronic health records (EHRs). We will integrate deep characteristic phenotypes extracted from EHRs and evolving knowledge of genotype-phenotype associations to optimize the accuracy of variant interpretation and the cost-effectiveness of clinical genome/exome sequencing, and to accelerate the discovery of causal genes by constructing a dynamic genotype-phenotype knowledge network. Prior knowledge of phenotype-gene relationships and phenotypic information about patients can facilitate the identification of disease-causing mutations from thousands of genetic variants in the context of clinical genomic sequencing; however, how best to abstract phenotype information from notes in the EHRs of patients who are diagnosed with or evaluated for monogenic disorders, standardize the computable representation of phenotypes, and utilize it in genomic interpretation remains unclear. Additionally, how to systematically compare phenotypes across diseases to discover new knowledge in human genetics remains a largely untapped area with great promise. To address these challenges, we will develop and validate scalable and portable open-source natural language processing (NLP) methods for automated and accurate abstraction of characteristic phenotype concepts (e.g., "j-shaped sella turcica" and "short stature") from EHR narratives. We will then develop a phenotype-driven scoring system called EHR-Phenolyzer to predict the likely candidate genetic variants associated with the phenotypes for patients who have undergone genomic sequencing and have a high probability of a monogenic condition. On this basis, we will develop a probabilistic disease diagnosis and knowledge discovery system using rich and deep EHR phenotypes, and evaluate these methods for genomic diagnosis and discovery using large-scale clinical exome sequencing data. Ultimately, these methods will support efficient, effective, and scalable genomic diagnostics, and facilitate the implementation of genome-guided precision medicine in clinical practice.
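
As a rough illustration of the two steps described in the abstract (abstracting phenotype concepts from a note, then using phenotype-gene knowledge to prioritize candidates), the following minimal Python sketch may help. It is not the EHR-Phenolyzer implementation: the lexicon, the gene names (GENE_A, GENE_B, GENE_C), the weights, and the function names are all hypothetical placeholders, and the simple substring matching stands in for a real NLP pipeline (it does not handle negation, abbreviations, or spelling variants).

```python
from collections import defaultdict

# Hypothetical lexicon: surface phrase -> normalized phenotype concept.
# The two phrases are the examples quoted in the abstract.
PHENOTYPE_LEXICON = {
    "j-shaped sella turcica": "J-shaped sella turcica",
    "short stature": "Short stature",
}

# Hypothetical phenotype-gene association weights. The gene names and
# numbers are placeholders, not real genotype-phenotype knowledge.
PHENOTYPE_GENE_WEIGHTS = {
    "J-shaped sella turcica": {"GENE_A": 0.9, "GENE_B": 0.2},
    "Short stature": {"GENE_A": 0.4, "GENE_C": 0.7},
}


def extract_phenotypes(note):
    """Return the sorted phenotype concepts whose phrase occurs in the note."""
    text = note.lower()
    found = {
        concept
        for phrase, concept in PHENOTYPE_LEXICON.items()
        if phrase in text
    }
    return sorted(found)


def rank_genes(phenotypes):
    """Sum association weights per gene and rank genes by total score."""
    scores = defaultdict(float)
    for phenotype in phenotypes:
        for gene, weight in PHENOTYPE_GENE_WEIGHTS.get(phenotype, {}).items():
            scores[gene] += weight
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    note = (
        "Patient presents with short stature; "
        "imaging shows a J-shaped sella turcica."
    )
    phenotypes = extract_phenotypes(note)
    for gene, score in rank_genes(phenotypes):
        print(f"{gene}\t{score:.2f}")
```

In this toy setup, a gene supported by several of the patient's phenotypes accumulates a higher score than one supported by a single phenotype, which is the basic intuition behind phenotype-driven prioritization. A real system would map phrases to a standard phenotype vocabulary and draw its weights from curated resources such as the OMIM and ClinVar knowledge mentioned in the Public Health Relevance statement.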

Public Health Relevance

We will develop novel informatics methods to abstract characteristic phenotypes from electronic health records (EHRs) for patients diagnosed with or evaluated for monogenic disorders, enable the interoperability of computable characteristic phenotypes with existing phenotype-genotype association knowledge such as OMIM and ClinVar, and improve the efficiency and effectiveness of genomic diagnostics.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM012895-03
Application #: 9925808
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan

Project Start: 2018-09-17
Project End: 2022-05-31
Budget Start: 2020-06-01
Budget End: 2021-05-31
Support Year: 3
Fiscal Year: 2020
Total Cost
Indirect Cost

Institution

Name: Columbia University (N.Y.)
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 621889815

City: New York
State: NY
Country: United States
Zip Code: 10032

Related projects


NIH 2020 R01 LM	Deep phenotyping in Electronic Health Records for Genomic Medicine Weng, Chunhua; Wang, Kai / Columbia University (N.Y.)
NIH 2019 R01 LM	Deep phenotyping in Electronic Health Records for Genomic Medicine Weng, Chunhua; Wang, Kai / Columbia University (N.Y.)
NIH 2018 R01 LM	Deep phenotyping in Electronic Health Records for Genomic Medicine Weng, Chunhua; Wang, Kai / Columbia University (N.Y.)
