Rapid growth in the clinical implementation of large electronic medical records (EMRs) has led to an unprecedented expansion in the availability of dense longitudinal datasets for observational research. More recently, huge efforts have linked EMR databases with archived biological material, to accelerate research in personalized medicine. EMR- linked DNA biobanks have identified common and rare genetic variants that contribute to risk of disease. An appealing vision, which has not been extensively explored, is to use EMRs-linked biobanks for pharmacogenomic studies, which identify associations between genetic variation and drug efficacy and toxicity. The longitudinal nature of the data contained within EMRs make them ideal for quantifying drug outcome (both efficacy and toxicity). Efforts are already underway to link these EMRs across institutions, and standardize the definition of phenotypes for large-scale studies of treatment outcome, specifically within the context of routine clinical care. Despite its success, EMR-based pharmacogenomic studies are often hampered by its data-intensive nature -- it is time- consuming and costly to extract and integrate data from multiple heterogeneous EMR databases, for large-scale pharmacogenomic studies. The Informatics for Integrating Biology and the Bedside (i2b2) is a National Center for Biomedical Computing based at Partners Healthcare System. I2b2 has developed a scalable informatics framework to enable clinical researchers to repurpose existing EMR data for clinical and genomic discovery. In this study, we will collaborate with i2b2 to extend its informatics framework to the pharmacogenomics domain, by proposing the following specific aims: 1) Develop new methods to extract and model drug exposure and outcome information from EMR and integrate them with the i2b2 NLP components;2) Build ontology tools to normalize and integrate pharmacogenomic data across different sites;3) Conduct known and novel pharmacogenomic studies to evaluate and refine tools developed in Aim 1 and 2;and 4) Disseminate the developed informatics tools among pharmacogenomic researchers.
Longitudinal electronic medical records (EMRs) linked with DNA biobanks have become valuable resources for genomic and pharmacogenomics research, allowing identification of associations between genetic variations and drug efficacy and toxicity. The Informatics for Integrating Biology and the Bedside (i2b2), a National Center for Biomedical Computing based at Partners Healthcare System, has developed a scalable informatics framework to enable clinical researchers to use existing EMR data for genomic knowledge discovery of diseases. In this study, we will collaborate with i2b2 to extend its informatics framework to the pharmacogenomics domain, by developing new natural language processing, ontology components, and user-friendly interfaces, and then apply these tools to real-world pharmacogenomic studies.