Improving missing data analysis in distributed research networks

Toh, Darren

Abstract

Electronic health record (EHR) databases collect data that reflect routine clinical care. These databases are increasingly used in comparative effectiveness research, patient-centered outcomes research, quality improvement assessment, and public health surveillance to generate actionable evidence that improves patient care. It is often necessary to analyze multiple databases that cover large and diverse populations to improve the statistical power of the study or generalizability of the findings. A common approach to analyzing multiple databases is the use of a distributed research network (DRN) architecture, in which data remains under the physical control of data partners.
Although EHRs are generally thought to contain rich clinical information, the information is not uniformly collected. Certain information is available only for some patients, and only at some time points for a given patient. There are generally two types of missing information in EHRs. The first is the conventionally understood and obvious missing data in which some data fields (e.g., body mass index) are not complete for various reasons, e.g., the clinician does not collect the information or the patient chooses not to provide the information. The second is less obvious because the data field is not empty but the recorded value may be incorrect due to missing data. For example, EHRs generally do not have complete data for care that occurs in a different delivery system. A medical condition (e.g., asthma) may be coded as "no" but the true value would have been "yes" if more complete data had been available, e.g., from claims data as the other delivery system would submit a claim to the patient's health plan for the care provided. In other words, one may incorrectly treat "absence of evidence" as "evidence of absence". EHRs hold great promise but we must address several outstanding methodological challenges inherent in the databases, specifically missing data.
Addressing missing data is more challenging in DRNs due to different missing data mechanisms across databases.
The specific aims of the study are: (1) Apply and assess missing data methods developed in single-database settings to handle obvious and well-recognized missing data in DRNs; (2) Apply and assess machine learning and predictive modeling techniques to address less obvious and under-recognized missing data for select variables in DRNs; and (3) Apply and assess a comprehensive analytic approach that combines conventional missing data methods and machine learning techniques to address missing data in DRNs. The analytic methods developed in this project, including the extension of existing missing data methods to DRNs, the innovative use of machine learning techniques to address missing data, and their integration with privacy-protecting analytic methods, will have direct impact on the design and analysis of future comparative effectiveness and safety studies, and patient-centered outcomes research conducted in DRNs.
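
The two types of missing information described in the abstract can be illustrated with a minimal sketch. It is not the project's method: the patient IDs, field names, values, and the helper functions `obvious_missing` and `reconcile_flag` are all hypothetical, and the sketch assumes linked claims data are available for reconciliation, which is often not the case in a DRN.

```python
# Hypothetical toy EHR records; field names and values are illustrative only.
ehr = {
    "p1": {"bmi": 27.4, "asthma": True},
    "p2": {"bmi": None, "asthma": False},  # obvious missing: BMI field is empty
    "p3": {"bmi": 31.0, "asthma": False},  # hidden missing: care occurred elsewhere
}

# Assumed: patients with an asthma claim submitted by another delivery system.
claims_asthma = {"p3"}


def obvious_missing(records, field):
    """Return sorted IDs of patients whose value for `field` is empty."""
    return sorted(pid for pid, rec in records.items() if rec[field] is None)


def reconcile_flag(records, field, claims_positive):
    """Recode an EHR 'no' as 'yes' when claims data show the condition."""
    return {
        pid: rec[field] or (pid in claims_positive)
        for pid, rec in records.items()
    }


missing_bmi = obvious_missing(ehr, "bmi")
asthma = reconcile_flag(ehr, "asthma", claims_asthma)

print(missing_bmi)  # ['p2']
print(asthma)       # {'p1': True, 'p2': False, 'p3': True}
```

Patient p2 shows the first type (an empty field that is easy to detect), while patient p3 shows the second type: the EHR value is populated but wrong, and only an outside data source reveals it.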

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: Agency for Healthcare Research and Quality (AHRQ)
Type: Research Project (R01)
Project #: 5R01HS026214-03
Application #: 10007887
Study Section: Healthcare Information Technology Research (HITR)
Program Officer: Wyatt, Derrick

Project Start: 2018-09-30
Project End: 2021-09-29
Budget Start: 2020-09-30
Budget End: 2021-09-29
Support Year: 3
Fiscal Year: 2020
Total Cost
Indirect Cost

Institution

Name: Harvard Pilgrim Health Care, Inc.
Department
Type
DUNS #: 071721088

City: Boston
State: MA
Country: United States
Zip Code: 02215

Related projects


NIH 2020 R01 HS	Improving missing data analysis in distributed research networks Toh, Darren / Harvard Pilgrim Health Care, Inc.
NIH 2019 R01 HS	Improving missing data analysis in distributed research networks Toh, Darren / Harvard Pilgrim Health Care, Inc.
NIH 2018 R01 HS	Improving missing data analysis in distributed research networks Toh, Darren / Harvard Pilgrim Health Care, Inc.
