Epidemiologic analyses of health care data can provide critical evidence on the effectiveness and safety of therapeutics. This is particularly vital during the transition from regulatory approval through the early marketing of new drugs, a time when physicians, regulators and payers are all struggling with incomplete data. Health plans pay for these drugs without knowing how their effectiveness and safety compare with established alternatives, because new compounds are tested against placebos rather than active agents, and only in selected patients. Non-randomized studies in large healthcare databases can provide faster and less costly evidence on drug effects. However, conventional adjustment methods that rely on a small number of investigator-specified confounders often fail and may produce biased results.

We propose, and have preliminary evidence to support, employing modern medical informatics algorithms that structure and search databases to empirically identify thousands of new covariates. These covariates will then enter established propensity score-based models, making far more effective use of the information contained in health care databases and electronic medical records (EMRs) and resulting in more valid causal interpretations of treatment effects. We will:

- Develop algorithms that make greater use of information contained in longitudinal claims and EMR databases by empirically identifying thousands of potential confounders. These approaches will be evaluated in 6 example studies encompassing recent drug safety and comparative effectiveness problems, and implemented in multiple large claims databases, supplemented in subgroups by data such as lab values and EMR information.
- Develop novel methods for confounding adjustment based on textual information found in EMRs.
- Expand the newly developed mining algorithms into a framework that integrates distributed database networks with uneven information content, similar to the Sentinel Network recently initiated by FDA.

This project is likely to produce groundbreaking results at the interface of medicine, biomedical informatics, and epidemiologic methods. After completion of this project, a library of documented and validated algorithms will be available to significantly improve confounder control in a range of healthcare databases. The theoretical foundation and the ready-to-use algorithms will likely lead to a fundamental shift in how databases contribute to the fast and accurate assessment of newly marketed medications.
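To make the idea of empirically identifying and prioritizing candidate confounders more concrete, the following is a minimal sketch, not the project's actual algorithm. It assumes each patient record carries a binary exposure, a binary outcome, and a set of recorded codes, and it ranks codes by the absolute log of the Bross bias multiplier, a ranking rule borrowed from the published high-dimensional propensity score literature; the abstract itself does not specify a ranking rule. The function name, record layout, and toy data are all hypothetical.

```python
from math import log


def rank_candidate_covariates(patients, top_k=2):
    """Rank codes by potential for confounding (illustrative sketch only).

    patients: list of dicts with keys 'exposed' (bool), 'outcome' (bool),
    and 'codes' (set of str). This layout is an assumption for the example.
    """
    exposed = [p for p in patients if p["exposed"]]
    unexposed = [p for p in patients if not p["exposed"]]
    if not exposed or not unexposed:
        return []
    all_codes = set().union(*(p["codes"] for p in patients))
    scores = {}
    for code in all_codes:
        with_code = [p for p in patients if code in p["codes"]]
        without_code = [p for p in patients if code not in p["codes"]]
        if not with_code or not without_code:
            continue  # code carried by everyone or no one: no contrast
        # Prevalence of the code among exposed and unexposed patients.
        p_c1 = sum(code in p["codes"] for p in exposed) / len(exposed)
        p_c0 = sum(code in p["codes"] for p in unexposed) / len(unexposed)
        # Crude relative risk of the outcome given the code.
        risk_with = sum(p["outcome"] for p in with_code) / len(with_code)
        risk_without = sum(p["outcome"] for p in without_code) / len(without_code)
        if risk_with == 0 or risk_without == 0:
            continue  # relative risk is zero or undefined; skipped in this sketch
        rr_cd = risk_with / risk_without
        # Bross bias multiplier: how far the code could shift a crude estimate.
        bias = (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
        scores[code] = abs(log(bias))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


# Made-up toy cohort: code "A" tracks both exposure and outcome, "B" does not.
TOY_PATIENTS = [
    {"exposed": True, "outcome": True, "codes": {"A"}},
    {"exposed": True, "outcome": True, "codes": {"A"}},
    {"exposed": True, "outcome": False, "codes": {"A", "B"}},
    {"exposed": True, "outcome": False, "codes": {"B"}},
    {"exposed": False, "outcome": True, "codes": {"A"}},
    {"exposed": False, "outcome": False, "codes": {"B"}},
    {"exposed": False, "outcome": False, "codes": set()},
    {"exposed": False, "outcome": True, "codes": {"B"}},
]

top_codes = rank_candidate_covariates(TOY_PATIENTS, top_k=2)
```

In a fuller pipeline, the top-ranked codes would be added as indicator variables to a propensity score model alongside the investigator-specified covariates. Real data would also need handling that this sketch omits, such as code frequency levels and sparse cells.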

Public Health Relevance

Large healthcare databases are used to assess the safety and effectiveness of drugs. However, conventional adjustment methods that rely on a limited number of investigator-specified covariates often fail to produce unbiased results. We will develop algorithms that make greater use of information contained in longitudinal claims data and electronic medical record databases by empirically identifying thousands of potential confounders. This will improve causal inference on the comparative safety and effectiveness of newly marketed medications, making it both less susceptible to investigator omissions and faster than conventional approaches.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Sim, Hua-Chuan
Brigham and Women's Hospital
United States
Gagne, Joshua J; Wang, Shirley V; Rassen, Jeremy A et al. (2014) A modular, prospective, semi-automated drug safety monitoring system for use in a distributed data environment. Pharmacoepidemiol Drug Saf 23:619-27
Gagne, Joshua J; Rassen, Jeremy A; Choudhry, Niteesh K et al. (2014) Near-real-time monitoring of new drugs: an application comparing prasugrel versus clopidogrel. Drug Saf 37:151-61
Rassen, Jeremy A; Shelat, Abhi A; Franklin, Jessica M et al. (2013) Matching by propensity score in cohort studies with three treatment groups. Epidemiology 24:401-9
Garbe, E; Kloss, S; Suling, M et al. (2013) High-dimensional versus conventional propensity scores in a comparative effectiveness study of coxibs and reduced upper gastrointestinal complications. Eur J Clin Pharmacol 69:549-57
Gagne, Joshua J; Walker, Alexander M; Glynn, Robert J et al. (2012) An event-based approach for comparing the performance of methods for prospective medical product monitoring. Pharmacoepidemiol Drug Saf 21:631-9
Gagne, Joshua J; Rassen, Jeremy A; Walker, Alexander M et al. (2012) Active safety monitoring of new medical products using electronic healthcare data: selecting alerting rules. Epidemiology 23:238-46
Rassen, Jeremy A; Schneeweiss, Sebastian (2012) Using high-dimensional propensity scores to automate confounding control in a distributed medical product safety surveillance system. Pharmacoepidemiol Drug Saf 21 Suppl 1:41-9
Gagne, Joshua J; Glynn, Robert J; Avorn, Jerry et al. (2011) A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol 64:749-59
Huybrechts, Krista F; Brookhart, M Alan; Rothman, Kenneth J et al. (2011) Comparison of different approaches to confounding adjustment in a study on the association of antipsychotic medication with mortality in older nursing home patients. Am J Epidemiol 174:1089-99
Schneeweiss, S; Gagne, J J; Glynn, R J et al. (2011) Assessing the comparative effectiveness of newly marketed medications: methodological challenges and implications for drug development. Clin Pharmacol Ther 90:777-90

Showing the most recent 10 out of 16 publications