Visualizing healthcare system dynamics in biomedical Big Data

Weber, Griffin

Abstract

Electronic health records (EHR) and administrative claims databases are transforming medical research by giving investigators access to data on millions of individual patients. Compared to manual paper chart review, these databases reduce the time and cost of clinical studies by orders of magnitude, enabling types of research that were unfeasible in the past. However, investigators often incorrectly treat EHR and claims data as simply big versions of clinical trials data. There are important differences: during clinical trials, patient information is obtained and recorded in a standardized way and checked for accuracy and completeness. In contrast, EHR and claims are observational databases, which reflect not only the health of the patients, but also their interactions with the healthcare system. For example, the date associated with a code for diabetes is when the physician made the diagnosis, not when the patient first developed the disease. These observations are influenced by the dynamics of the healthcare system: when physicians schedule visits with their patients, which tests physicians decide to order, what codes need to be recorded to get reimbursed for procedures, etc. By ignoring this dimension of the data or naively treating it as noise, investigators risk both misinterpreting the true patient pathophysiology and losing valuable information content. In prior work we showed that analysis of the healthcare system dynamics (HSD) dimension of observational databases can actually be more useful than the patient pathophysiology in predicting survival, selecting matched control cohorts, identifying healthy patients, and defining normal ranges of laboratory tests. Yet, conveying the concept of HSD to researchers and helping them use it effectively is difficult.
Therefore, focusing on the topic area of Data Visualization, this proposal addresses the challenge of separating healthcare system dynamics from pathophysiology in observational databases, so that Big Data researchers can use both dimensions to generate new knowledge about patient health. To do this, we bring together informatics and data visualization experts who developed two widely adopted open source software platforms for querying clinical data repositories (Informatics for Integrating Biology and the Bedside, i2b2) and developing modular data analysis and visualization tools (Science of Science, Sci2). We will leverage these systems to perform three Specific Aims: (1) Create an extensible ontology for visualizing the HSD dimensions of biomedical Big Data. (2) Develop a prototype interactive visualization to enable investigators to study HSD in Big Data. The visualization will be simple and familiar to investigators, but innovative in that for the first time HSD will be treated as its own informative component of the data. By literally placing HSD on its own dimension, the visualization will show investigators its value and teach them how to use it for research. (3) Demonstrate and evaluate the visualizations using three sources of biomedical Big Data: EHR data from two hospital systems in Boston with a total of 7 million patients and nationwide claims data from Aetna health insurance with 34 million patients.
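As a rough illustration of what treating HSD as its own dimension could mean for a single laboratory record, the sketch below splits each record into a pathophysiology part (the result value) and an HSD part (when the test was ordered and how many tests the patient has on record). This is not the project's ontology or software: the record layout, the `split_dimensions` function, and all field names are hypothetical, and the data are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical lab records: (patient_id, time the test was ordered, result value).
# These rows are made up for illustration only.
records = [
    ("p1", "2016-03-01 10:15", 7.2),
    ("p1", "2016-03-02 03:40", 14.8),
    ("p2", "2016-03-01 09:05", 6.9),
    ("p2", "2016-03-05 23:30", 7.1),
]


def split_dimensions(rows):
    """Split each record into a pathophysiology part and an HSD part.

    Assumption: time of ordering and number of tests per patient stand in
    for healthcare system dynamics; the result value stands in for
    patient pathophysiology.
    """
    # Count how many tests were ordered for each patient.
    tests_per_patient = defaultdict(int)
    for patient_id, _, _ in rows:
        tests_per_patient[patient_id] += 1

    out = []
    for patient_id, ordered_at, value in rows:
        ts = datetime.strptime(ordered_at, "%Y-%m-%d %H:%M")
        out.append({
            "patient_id": patient_id,
            "pathophysiology": {"value": value},
            "hsd": {
                "hour_ordered": ts.hour,        # 0-23
                "weekend": ts.weekday() >= 5,   # Saturday or Sunday
                "tests_for_patient": tests_per_patient[patient_id],
            },
        })
    return out


features = split_dimensions(records)
for row in features:
    print(row["patient_id"], row["pathophysiology"], row["hsd"])
```

Keeping the two parts in separate fields means a plot or model can use the HSD values as an axis in their own right instead of discarding them as noise.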

Public Health Relevance

Biomedical Big Data, such as electronic health records (EHR) and administrative claims, are records of patients' interactions with the healthcare system; for example, the date of a diagnosis is when a physician entered the code into the EHR, not when the patient developed the disease. Most researchers are either unaware of this distinction or naively treat it as noise. However, the proposed research will show, using a novel Data Visualization, that these subtle effects of the healthcare system on observational clinical data actually contain valuable information that could benefit biomedical research, clinical care, and health care policy.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 3U01CA198934-02S1
Application #: 9236420
Study Section: Special Emphasis Panel (ZRG1 (50)R)
Program Officer: Miller, David J

Project Start: 2015-06-01
Project End: 2018-05-31
Budget Start: 2016-06-01
Budget End: 2017-05-31
Support Year: 2
Fiscal Year: 2016
Total Cost: $414,604
Indirect Cost: $170,000

Institution

Name: Harvard Medical School
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 047006379

City: Boston
State: MA
Country: United States
Zip Code: 02115

Related projects


NIH 2017 U01 CA	Visualizing healthcare system dynamics in biomedical Big Data Weber, Griffin M. / Harvard Medical School	$464,822
NIH 2016 U01 CA	Visualizing healthcare system dynamics in biomedical Big Data Weber, Griffin M. / Harvard Medical School
NIH 2016 U01 CA	Visualizing healthcare system dynamics in biomedical Big Data Weber, Griffin M. / Harvard Medical School	$414,604
NIH 2015 U01 CA	Visualizing healthcare system dynamics in biomedical Big Data Weber, Griffin M. / Harvard Medical School

Publications

Shiffrin, Richard M; Börner, Katy; Stigler, Stephen M (2018) Scientific progress despite irreproducibility: A seeming paradox. Proc Natl Acad Sci U S A 115:2632-2639

Fortunato, Santo; Bergstrom, Carl T; Börner, Katy et al. (2018) Science of science. Science 359:

Agniel, Denis; Kohane, Isaac S; Weber, Griffin M (2018) Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ 361:k1479

Staudt, Joseph; Yu, Huifeng; Light, Robert P et al. (2018) High-impact and transformative science (HITS) metrics: Definition, exemplification, and comparison. PLoS One 13:e0200597

Azoulay, Pierre; Graff-Zivin, Joshua; Uzzi, Brian et al. (2018) Toward a more scientific science. Science 361:1194-1197

Carpenter, Janet S; Laine, Tei; Harrison, Blake et al. (2017) Topical, geospatial, and temporal diffusion of the 2015 North American Menopause Society position statement on nonhormonal management of vasomotor symptoms. Menopause 24:1154-1159

Knepper, Richard; Börner, Katy (2016) Comparing the Consumption of CPU Hours with Scientific Output for the Extreme Science and Engineering Discovery Environment (XSEDE). PLoS One 11:e0157628
