The large and growing size of the healthcare system makes it imperative to understand what is happening to us, the recipients of healthcare, so that we can efficiently conduct research to improve healthcare delivery and advance the science of biomedicine. i2b2, "Informatics for Integrating Biology and the Bedside," seeks to provide this instrumentation using the informational by-products of healthcare and the biological materials accumulated through the delivery of healthcare. This complements existing efforts to create prospective cohort studies or trials outside the delivery of routine healthcare. In the first round of i2b2, we demonstrated that we could identify known adverse events and phenotypically select and then genotype patients for genetic association at approximately 1/10th of the price and in less than 1/10th of the time usually entailed to develop such populations for study. The next methodological challenge we have set ourselves in i2b2 is the development of Virtual Cohort Studies (VCS) that encompass the population of a healthcare system as study subjects and ask questions of comparative effectiveness, unforeseen adverse events, and identification of clinically relevant subpopulations, drawing on both clinical and genome-scale measures. We will compare the results of the VCS to those of carefully planned and executed cohort studies such as the Framingham Heart Study. VCS will require multiple methodological advances and tool development in disciplines including natural language processing, temporal reasoning, predictive modeling, biostatistics, and machine learning. VCS methods will be tested by two driving biology projects, the first studying a collection of autoimmune diseases and the second studying type 2 diabetes.
In both projects, VCS methods will be applied to investigate the components of cardiovascular risk, from the genetic to the epigenetic, across the full range of clinical history, including medication exposure. A systems/integrative approach will be taken to identify commonalities in these risk profiles across these disparate disease domains. VCS methods will be shared with the i2b2 user community under open source governance, while i2b2 user community contributions are folded into the i2b2 toolkit.

Public Health Relevance

The i2b2 model of using existing clinical data for high-throughput, cost-effective, and timely research promises to rapidly strengthen our nation's ability to understand the existing disease burden and how best to achieve clinical effectiveness in a cost-effective way, including new drug targets, new uses of existing drugs, and improved management of existing disease.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Specialized Center--Cooperative Agreements (U54)
Project #
Application #
Study Section
Special Emphasis Panel (ZRG1-BST-K (52))
Program Officer
Florance, Valerie
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
Brigham and Women's Hospital
United States
Zip Code
Murphy, Shawn N; Avillach, Paul; Bellazzi, Riccardo et al. (2017) Combining clinical and genomics queries using i2b2 - Three methods. PLoS One 12:e0172187
Hundemer, Gregory L; Baudrand, Rene; Brown, Jenifer M et al. (2017) Renin Phenotypes Characterize Vascular Disease, Autonomous Aldosteronism, and Mineralocorticoid Receptor Activity. J Clin Endocrinol Metab 102:1835-1843
Nanba, Kazutaka; Vaidya, Anand; Williams, Gordon H et al. (2017) Age-Related Autonomous Aldosteronism. Circulation 136:347-355
Luo, Yuan; Uzuner, Özlem; Szolovits, Peter (2017) Bridging semantics and syntax with graph algorithms-state-of-the-art of extracting biomedical relations. Brief Bioinform 18:160-178
Bigdeli, T B; Ripke, S; Peterson, R E et al. (2017) Genetic effects influencing risk for major depressive disorder in China and Europe. Transl Psychiatry 7:e1074
Luo, Yuan; Szolovits, Peter (2016) Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records. Biomed Inform Insights 8:29-38
Castro, V M; Kong, S W; Clements, C C et al. (2016) Absence of evidence for increase in risk for autism or attention-deficit hyperactivity disorder following antidepressant exposure during pregnancy: a replication study. Transl Psychiatry 6:e708
Lin, Chen; Dligach, Dmitriy; Miller, Timothy A et al. (2016) Multilayered temporal modeling for the clinical domain. J Am Med Inform Assoc 23:387-95
Ananthakrishnan, Ashwin N; Cagan, Andrew; Cai, Tianxi et al. (2016) Identification of Nonresponse to Treatment Using Narrative Data in an Electronic Health Record Inflammatory Bowel Disease Cohort. Inflamm Bowel Dis 22:151-8
Corey, Kathleen E; Kartoun, Uri; Zheng, Hui et al. (2016) Using an Electronic Medical Records Database to Identify Non-Traditional Cardiovascular Risk Factors in Nonalcoholic Fatty Liver Disease. Am J Gastroenterol 111:671-6

Showing the most recent 10 out of 304 publications