Adverse drug reactions (ADRs) are a major burden for patients and the health care system: they cause preventable hospitalizations and deaths and incur high costs. The long-term objective of this proposal is to advance patient safety and reduce costs by discovering novel serious ADRs using automated methods that combine information from large and varied patient populations with information from the literature. Pharmacovigilance has advanced considerably, but more work is needed. For example, Vioxx, a commonly used drug, was recently found to have caused at least 88,000 cases of myocardial infarction, which highlights the insufficiency of current methods. To date, methods have depended mainly on single sources of data, primarily the U.S. Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) and electronic health records (EHRs). Each source is important but has its own limitations and advantages; combining data across them should therefore make drug safety surveillance more effective, both by increasing statistical power and by allowing each source to complement the others. We have already developed methods for each of the individual sources, so this is an excellent opportunity to build on our research accomplishments and advance the state of the art in pharmacovigilance.
More specifically, we will a) acquire and combine comprehensive clinical data from the EHRs of two different health care sites serving diverse populations, using natural language processing (NLP) to obtain large quantities of fine-grained data and then developing data mining methodologies on the clinical data to detect novel ADR signals; b) analyze differences in therapy-related risk factors, such as racial and ethnic differences, between the two EHR populations; c) detect ADR signals in the FAERS database using an established methodology; d) develop improved methods to acquire ADR signals from information in the literature; and e) develop methods that combine the results from the above sources to maximize effectiveness. We will focus on eight serious ADRs and collect a high-quality reference standard for them, so that we can evaluate and compare the performance of the individual detection methods as well as the methods that combine the sources. This proposal is well positioned to overcome the problems of existing automated methods, which rely primarily on individual sources of data. We are confident the methods will be effective because a strong infrastructure is already in place for us to build upon. Most importantly, the methodology developed in this proposal presents an excellent chance to leverage heterogeneous data sources to dramatically improve patient safety and reduce costs.

Public Health Relevance

Adverse drug reactions (ADRs) are a major burden for patients and health care, causing preventable hospitalizations and deaths and incurring huge costs; continuous post-marketing surveillance is therefore crucial for patient safety. This proposal aims to improve patient safety and reduce health care costs by developing effective methods to discover new ADRs by combining information from the FDA's Adverse Event Reporting System, the literature, and comprehensive clinical data from the electronic health records of two different sites with diverse populations, thereby overcoming the limitations of methods that rely mainly on a single data source.

National Institutes of Health (NIH)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1)
Program Officer: Sim, Hua-Chuan
Columbia University (N.Y.)
Internal Medicine/Medicine
Schools of Medicine
New York
United States
Vilar, Santiago; Uriarte, Eugenio; Santana, Lourdes et al. (2014) Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat Protoc 9:2147-63
Li, Ying; Salmasian, Hojjat; Vilar, Santiago et al. (2014) A method for controlling complex confounding effects in the detection of adverse drug reactions using electronic health records. J Am Med Inform Assoc 21:308-14
Freedberg, Daniel E; Salmasian, Hojjat; Friedman, Carol et al. (2013) Proton pump inhibitors and risk for recurrent Clostridium difficile infection among inpatients. Am J Gastroenterol 108:1794-801
Vilar, Santiago; Uriarte, Eugenio; Santana, Lourdes et al. (2013) Detection of drug-drug interactions by modeling interaction profile fingerprints. PLoS One 8:e58321
Liu, X Sherry; Wang, Ji; Zhou, Bin et al. (2013) Fast trabecular bone strength predictions of HR-pQCT and individual trabeculae segmentation-based plate and rod finite element model discriminate postmenopausal vertebral fractures. J Bone Miner Res 28:1666-78
Yadav, Kabir; Sarioglu, Efsun; Smith, Meaghan et al. (2013) Automated outcome classification of emergency department computed tomography imaging reports. Acad Emerg Med 20:848-54
Salmasian, Hojjat; Freedberg, Daniel E; Abrams, Julian A et al. (2013) An automated tool for detecting medication overuse based on the electronic health records. Pharmacoepidemiol Drug Saf 22:183-9
Harpaz, Rave; Vilar, Santiago; Dumouchel, William et al. (2013) Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions. J Am Med Inform Assoc 20:413-9
Harpaz, R; DuMouchel, W; Shah, N H et al. (2012) Novel data-mining methodologies for adverse drug event discovery and analysis. Clin Pharmacol Ther 91:1010-21
Harpaz, R; Perez, H; Chase, H S et al. (2011) Biclustering of adverse drug events in the FDA's spontaneous reporting system. Clin Pharmacol Ther 89:243-50

Showing the most recent 10 out of 17 publications