Epidemiologic analyses of health care data can provide critical evidence on the effectiveness and safety of therapeutics. This is particularly vital during the transition from regulatory approval through the early marketing of new drugs, a time when physicians, regulators, and payers are all struggling with incomplete data. Health plans pay for these drugs without knowing how their effectiveness and safety compare with established alternatives, because new compounds are tested against placebos rather than active agents, and only in selected patients. Non-randomized studies in large health care databases can provide evidence on drug effects faster and at lower cost. However, conventional adjustment methods that rely on a small number of investigator-specified confounders often fail and may produce biased results.

We propose, and have preliminary evidence to support, employing modern medical informatics algorithms that structure and search databases to empirically identify thousands of new covariates. These covariates will then enter established propensity score-based models and so make far more effective use of the information contained in health care databases and electronic medical records (EMRs), resulting in more valid causal interpretations of treatment effects.

We will:

- Develop algorithms that make greater use of information contained in longitudinal claims and EMR databases by empirically identifying thousands of potential confounders. The performance of these approaches will be evaluated in 6 example studies encompassing recent drug safety and comparative effectiveness problems, and the approaches will be implemented in multiple large claims databases supplemented in subgroups by such data as lab values and EMR information.
- Develop novel methods for confounding adjustment based on textual information found in EMRs.
- Expand the newly developed mining algorithms into a framework that integrates distributed database networks with uneven information content, similar to the Sentinel Network recently initiated by FDA.

This project is likely to produce groundbreaking results at the interface of medicine, biomedical informatics, and epidemiologic methods. After completion of this project, a library of documented and validated algorithms will be available to significantly improve confounder control in a range of health care databases. The theoretical foundation and the ready-to-use algorithms will likely lead to a fundamental shift in how databases contribute to the fast and accurate assessment of newly marketed medications.
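The idea of empirically identifying candidate confounders can be illustrated with a small sketch. It is not the project's algorithm, which the abstract does not specify. It assumes one simple ranking rule, borrowed from Bross's multiplicative bias formula as used in high-dimensional propensity score work: each code found in patient records is scored by how much confounding it could introduce, given its imbalance between exposure groups and its association with the outcome. The function name, the record layout (`codes`, `exposed`, `outcome`), and the toy data are all hypothetical.

```python
import math


def rank_candidate_covariates(patients, top_k=2):
    """Rank codes by |log(Bross bias)|, largest first (illustrative only).

    patients: list of dicts with hypothetical keys
      'codes' (set of str), 'exposed' (bool), 'outcome' (bool).
    """
    n_exposed = sum(p["exposed"] for p in patients)
    n_unexposed = len(patients) - n_exposed
    all_codes = set().union(*(p["codes"] for p in patients))
    scores = {}
    for code in all_codes:
        with_code = [p for p in patients if code in p["codes"]]
        without_code = [p for p in patients if code not in p["codes"]]
        if not with_code or not without_code:
            continue  # code carries no contrast
        if n_exposed == 0 or n_unexposed == 0:
            continue  # no comparison group
        # Prevalence of the code among exposed and unexposed patients.
        p_c1 = sum(p["exposed"] for p in with_code) / n_exposed
        p_c0 = sum(not p["exposed"] for p in with_code) / n_unexposed
        # Outcome risk with and without the code.
        risk_with = sum(p["outcome"] for p in with_code) / len(with_code)
        risk_without = sum(p["outcome"] for p in without_code) / len(without_code)
        if risk_with == 0 or risk_without == 0:
            continue  # relative risk undefined or zero; skipped in this sketch
        rr_cd = risk_with / risk_without
        bias = (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
        scores[code] = abs(math.log(bias))
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


# Toy data (fabricated for illustration): code "A" is imbalanced across
# exposure groups and tied to the outcome; code "B" is balanced.
toy_patients = [
    {"codes": {"A", "B"}, "exposed": True, "outcome": True},
    {"codes": {"A"}, "exposed": True, "outcome": True},
    {"codes": {"A", "B"}, "exposed": True, "outcome": False},
    {"codes": set(), "exposed": True, "outcome": False},
    {"codes": {"B"}, "exposed": False, "outcome": False},
    {"codes": {"A"}, "exposed": False, "outcome": True},
    {"codes": {"B"}, "exposed": False, "outcome": False},
    {"codes": set(), "exposed": False, "outcome": True},
]

print(rank_candidate_covariates(toy_patients))
```

In a real analysis the top-ranked codes would be entered as covariates in a propensity score model. Here the ranking step is shown alone, and codes with a zero outcome risk in either stratum are simply skipped rather than handled with a continuity correction.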

Public Health Relevance

Large health care databases are used to assess the safety and effectiveness of drugs. However, conventional adjustment methods that rely on a limited number of investigator-specified covariates often fail to produce unbiased results. We will develop algorithms that make greater use of information contained in longitudinal claims data and electronic medical record databases by empirically identifying thousands of potential confounders. This will improve causal inference on the comparative safety and effectiveness of newly marketed medications, making it both less susceptible to investigator omissions and faster than conventional approaches.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
Institution: Brigham and Women's Hospital
Country: United States