Healthcare information is increasingly distributed across many independent databases and systems, both within and among organizations, as separate islands with different patient identifiers. This is the case for data collected within a single institution, where there may be multiple identifiers, and for data collected about the same patient at different health care institutions, pharmacy systems, payers, public health agencies, and so on. This situation hinders the aggregation of information about individuals across such databases, as needed for clinical decision support, clinical care, public health reporting, clinical research, and outcomes management. Aggregation is important not only to determine a patient's health care status, but also for clinical effectiveness research, drug safety research, and other population-based studies requiring comprehensive data. While health information exchanges (HIEs) are an increasingly common source of comprehensive clinical data, formal recommendations explicitly addressing HIE data aggregation approaches are lacking. Consequently, HIEs currently use a variety of differing data aggregation approaches. Because HIEs represent complex "melting pots" of heterogeneous clinical information sources with varying data quality and characteristics, they present unique data aggregation challenges and opportunities. Therefore, clear documentation and dissemination of concrete, real-world methods for accurate and efficient data aggregation are crucial to developing a robust and reliable National Health Information Network (NHIN). We will formally document and disseminate two distinct, existing classes of linkage methodologies currently used in the context of a long-standing, operational health information exchange. We will implement and evaluate extensions to the probabilistic method that are designed to improve algorithm accuracy.
Extensions will include: stochastic and closed-form solutions for parameter estimation; generalization of the probabilistic method to accommodate statistical dependence between fields; and evaluation of novel nearness comparators and of continuous and discrete modifications that allow formal inclusion of such comparators. We will evaluate and extend methods for creating synthetic linkage data that closely reflect the statistical characteristics of the underlying data. We will evaluate methods that detect the presence or absence of specific data characteristics that inform the selection of extensions to the underlying probabilistic matching model. We will develop and evaluate processes for identifying data element combinations that fail the test for statistical independence. We will evaluate and characterize the technical performance and the clinical and operational value of linking real-world HIE data sources for a variety of scenarios using both deterministic and probabilistic methods.
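
To make the contrast between deterministic and probabilistic linkage concrete, the following is a minimal sketch, not the project's implementation. The abstract does not name its probabilistic method, so the sketch assumes the widely used Fellegi-Sunter formulation, in which each field contributes a log-likelihood-ratio weight and fields are treated as conditionally independent (the assumption the proposed extensions aim to relax). The field names, the m and u values, and the function names are all hypothetical.

```python
import math

# Hypothetical per-field parameters (illustrative values, not from the project):
#   m = P(field agrees | the two records belong to the same patient)
#   u = P(field agrees | the two records belong to different patients)
M_U = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.005),
    "zip_code": (0.90, 0.05),
}


def deterministic_match(a, b, fields=("last_name", "birth_date", "zip_code")):
    """Deterministic rule: declare a link only if every listed field agrees exactly."""
    return all(a[f] == b[f] for f in fields)


def match_weight(a, b, params=M_U):
    """Probabilistic score: sum of per-field log2 likelihood ratios.

    Summing the weights is valid only under the assumption that fields
    are conditionally independent given true match status.
    """
    total = 0.0
    for field, (m, u) in params.items():
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


# Two made-up records that differ only in ZIP code.
rec_a = {"last_name": "SMITH", "birth_date": "1970-01-01", "zip_code": "46202"}
rec_b = {"last_name": "SMITH", "birth_date": "1970-01-01", "zip_code": "46204"}

exact = deterministic_match(rec_a, rec_b)   # False: one field disagrees
score = match_weight(rec_a, rec_b)          # still positive: two strong agreements
```

In this sketch the exact-match rule rejects the pair because of a single differing field, while the probabilistic score stays positive and would be compared against a chosen threshold. In practice the m and u values are estimated from data rather than set by hand, which is where the parameter estimation work described above would apply.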

Public Health Relevance

We will document and disseminate data aggregation processes that are implemented in an operational HIE. We will extend existing matching algorithms using novel modifications. We will evaluate the value of linking data to a variety of clinical sources both within and outside of the Health Information Exchange.

Agency
National Institutes of Health (NIH)
Institute
Agency for Healthcare Research and Quality (AHRQ)
Type
Research Project (R01)
Project #
5R01HS018553-03
Application #
8107612
Study Section
Health Care Technology and Decision Science (HTDS)
Program Officer
Burgess, Denise
Project Start
2009-09-30
Project End
2013-07-31
Budget Start
2011-08-01
Budget End
2013-07-31
Support Year
3
Fiscal Year
2011
Total Cost
Indirect Cost
Name
Indiana University-Purdue University at Indianapolis
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
603007902
City
Indianapolis
State
IN
Country
United States
Zip Code
46202
Daggy, Joanne; Xu, Huiping; Hui, Siu et al. (2014) Evaluating latent class models with conditional dependence in record linkage. Stat Med 33:4250-65
Xu, Huiping; Hui, Siu L; Grannis, Shaun (2014) Optimal two-phase sampling design for comparing accuracies of two binary classification rules. Stat Med 33:500-13
Li, Xiaochun; Shen, Changyu (2013) Linkage of patient records from disparate sources. Stat Methods Med Res 22:31-8