From enrichment to insights

Shah, Nigam

Abstract

Most medical decisions are made without the support of rigorous evidence, in large part because of the cost and complexity of performing randomized trials for most clinical situations. In practice, clinicians must use their judgement, informed by their own experience and the collective experience of their colleagues. The advent of the electronic health record (EHR) enables the modern practitioner to algorithmically check the records of thousands or millions of patients to rapidly find similar cases and compare outcomes. In addition to filling the inferential gap in actionable evidence, these kinds of analyses avoid issues of ethics, practicality, and generalizability that plague randomized clinical trials (RCTs). Unfortunately, identifying patients with the appropriate phenotypes, properly leveraging available data to adjust results, and matching similar patients to reduce confounding remain critical challenges in every study that uses EHR data. Overcoming these challenges to improve the accuracy of observational studies conducted with EHR data is of paramount importance. Studies using EHR data begin by defining a set of patients with specific phenotypes, analogous to amassing a cohort for a clinical trial. This process of electronic phenotyping is typically done via a set of rules defined by experts. Machine learning approaches are increasingly used to complement consensus definitions created by experts, and we propose several advances to validate and improve this practice. We will explore and quantify the effects of feature engineering choices made when transforming the diagnoses, procedures, medications, laboratory tests, and clinical notes in the EHR into a computable feature matrix. Finally, building on recent advances, we plan to characterize the performance of existing methods and develop EHR-specific strategies for patient matching.
Our work is significant because we will take on three challenging problems--electronic phenotyping, feature engineering, and patient matching--that stand in the way of generating insights via EHR data. If we are successful, we will significantly advance our ability to generate insights from the large amounts of health data that are routinely generated as a byproduct of clinical processes.

Public Health Relevance

The advent of the electronic health record (EHR) enables the search of thousands or millions of patients to rapidly find similar cases and compare outcomes. We will develop methods for feature engineering, electronic phenotyping and patient matching from real-world EHR data. If we are successful, we will significantly advance our ability to generate insights from the large amounts of health data that are routinely generated as a byproduct of clinical processes.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 2R01LM011369-05
Application #: 9365759
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2013-09-01
Project End: 2021-08-31
Budget Start: 2017-09-01
Budget End: 2018-08-31
Support Year: 5
Fiscal Year: 2017

Institution

Name: Stanford University
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 009214214

City: Stanford
State: CA
Country: United States
Zip Code: 94304

Related projects


NIH 2020 R01 LM	From enrichment to insights Shah, Nigam / Stanford University
NIH 2019 R01 LM	From enrichment to insights Shah, Nigam / Stanford University
NIH 2018 R01 LM	From enrichment to insights Shah, Nigam / Stanford University
NIH 2017 R01 LM	From enrichment to insights Shah, Nigam / Stanford University
NIH 2016 R01 LM	Methods for generalized ontology terms enrichment analysis Shah, Nigam / Stanford University
NIH 2015 R01 LM	Methods for generalized ontology terms enrichment analysis Shah, Nigam / Stanford University
NIH 2014 R01 LM	Methods for generalized ontology terms enrichment analysis Shah, Nigam / Stanford University
NIH 2013 R01 LM	Methods for generalized ontology terms enrichment analysis Shah, Nigam / Stanford University	$477,224

Publications

Callahan, Alison; Winnenburg, Rainer; Shah, Nigam H (2018) U-Index, a dataset and an impact metric for informatics tools and databases. Sci Data 5:180043

Coulet, Adrien; Shah, Nigam H; Wack, Maxime et al. (2018) Predicting the need for a reduced drug dose, at first prescription. Sci Rep 8:15558

Wang, Liwei; Rastegar-Mojarad, Majid; Ji, Zhiliang et al. (2018) Detecting Pharmacovigilance Signals Combining Electronic Medical Records With Spontaneous Reports: A Case Study of Conventional Disease-Modifying Antirheumatic Drugs for Rheumatoid Arthritis. Front Pharmacol 9:875

Agarwal, Vibhu; Shah, Nigam H (2017) Learning attributes of disease progression from trajectories of sparse lab values. Pac Symp Biocomput 22:184-194

Ravikumar, K E; Rastegar-Mojarad, Majid; Liu, Hongfang (2017) BELMiner: adapting a rule-based relation extraction system to extract biological expression language statements from bio-medical literature evidence sentences. Database (Oxford) 2017:

Oellrich, Anika; Collier, Nigel; Groza, Tudor et al. (2016) The digital revolution in phenotyping. Brief Bioinform 17:819-30

Li, Dingcheng; Wang, Zhen; Wang, Liwei et al. (2016) A Text-Mining Framework for Supporting Systematic Reviews. Am J Inf Manag 1:1-9

Agarwal, Vibhu; Podchiyska, Tanya; Banda, Juan M et al. (2016) Learning statistical models of phenotypes using noisy labeled training data. J Am Med Inform Assoc 23:1166-1173

Hripcsak, George; Ryan, Patrick B; Duke, Jon D et al. (2016) Characterizing treatment pathways at scale using the OHDSI network. Proc Natl Acad Sci U S A 113:7329-36

Rastegar-Mojarad, Majid; Komandur Elayavilli, Ravikumar; Liu, Hongfang (2016) BELTracker: evidence sentence retrieval for BEL statements. Database (Oxford) 2016:

Showing the most recent 10 out of 36 publications
