The Food and Drug Administration Amendments Act (FDAAA) requires FDA to obtain access to multiple large, distributed claims and electronic medical record databases covering 100 million lives by 2012. The pooled database will be large enough to study extremely rare events and will provide a platform for extensive comparative safety and effectiveness research using routine care data. It will include patients underrepresented in many trials, such as aging populations with multiple morbidities. Data privacy requirements will limit the sharing of detailed patient information, although such information is critical for multivariate adjustment for confounders and for valid causal inference. Another major unsolved issue is how to combine heterogeneous information content from various health care databases in order to maximize confounding control. This proposal seeks to advance methods for pooling heterogeneous individual-level electronic healthcare data so that pooled analyses can be fully multivariate-adjusted without sharing private patient data.
--- We will develop and test methods and algorithms for pooling both like and heterogeneous data elements, combining claims information from multiple health care providers and then augmenting these claims with electronic medical record and laboratory values. The methods will be neutral with respect to coding standards and robust to heterogeneous database structures.
--- We will use the methods to perform three example studies, each of which requires pooled data due to infrequent exposure, small patient subgroups, rare outcomes, or a combination of these: (1) effectiveness of high- versus low-potency statin use after acute coronary syndrome with respect to myocardial infarction and cardiovascular death; (2) effectiveness of TNF inhibitors in patients with rheumatoid arthritis with respect to reducing pain medication use and improving lab values and X-ray diagnostics; (3) reduced effectiveness of clopidogrel in the presence of proton pump inhibitors in patients with acute coronary syndrome and/or percutaneous coronary intervention, potentially leading to increased re-infarction rates and death versus clopidogrel alone.
--- We will explore, both statistically and operationally, when pooling of individual-level data will outperform aggregate-level meta-analysis, and test the hypothesis that aggregate-level meta-analysis will yield substantially similar point estimates in most scenarios.
--- We will publish and provide SAS code for all methods developed, and provide training sessions to relevant groups of researchers in order to broaden the scope and ensure lasting impact of the work performed.
This 2-year project will significantly advance methodology for pooling individual-level information from diverse health care databases. The work will allow for comparative effectiveness and safety analyses that are of high priority for payers (Medicare) and regulators (FDA) and will provide multivariate-adjusted results with no threat to patient privacy. The focus is on broad and expedited practical applicability.
FDA's Sentinel post-marketing initiative and Medicare's drug coverage have heightened both the need and the potential for new methods that combine multiple healthcare databases to carry out valid comparative effectiveness and drug safety analyses in routine care. We propose to develop and test techniques for pooling data from heterogeneous data sources in a manner that allows for full multivariate adjustment and shares no private patient information. This work will help overcome a major, so far unsolved issue in data pooling, and will facilitate comparative effectiveness and safety analyses that are of high priority for payers (Medicare) and regulators (FDA).