Patients in hospital intensive care units (ICUs) are physiologically fragile and unstable, generally have life-threatening conditions, and require close monitoring and rapid therapeutic interventions. They are connected to an array of equipment and monitors, and are carefully attended by the clinical staff. Staggering amounts of data are collected daily on each patient in an ICU: multi-channel waveform data sampled hundreds of times each second, vital sign time series updated each second or minute, alarms and alerts, lab results, imaging results, records of medication and fluid administration, staff notes and more. Petabytes of data are captured daily during care delivery in the country's ICUs; however, most of these data are not used to generate evidence or to discover new knowledge. The technology now exists to collect, archive and organize finely detailed ICU data, resulting in research resources of enormous potential. Since 2003, our group has been building the Multi-parameter Intelligent Monitoring in Intensive Care II (MIMIC II) Database, which now holds clinical data from about 40,000 entire stays in the ICUs of the Beth Israel Deaconess Medical Center (BIDMC) in Boston, including waveform data (continuous multi-channel recordings of physiologic signals and vital signs) for a subset of these stays. We have meticulously de-identified the data and freely shared them with the research community via the PhysioNet web site. The database is an unparalleled research resource and its value is widely recognized. More than 725 researchers have no-cost access to the clinical data under data use agreements (DUAs). This worldwide community includes academic, clinical, and industrial investigators from more than 32 countries and is growing by over 50% per year. In addition, thousands of investigators, educators, and students have used the waveform data, which we have made freely available to all without restriction.
MIMIC II's demonstrated and substantial relevance for research can be enhanced by incorporation of new data, reflecting changes in patient populations, public health challenges, available medications, clinical interventions, and care guidelines, and by development of advanced software to facilitate user access to MIMIC II. Its value can be further enhanced by integration of data from multiple centers. This proposal seeks funding: a) to maintain, enhance, and document the open-source software that we have created to build and update MIMIC II, to incorporate established and emerging standards, and to provide the tools needed to create parallel data collections at other centers; b) to establish the first public, multi-center, international, scalable, continuously updatable, high-resolution data archive for critical care research; and c) to create new knowledge and to develop clinical tools, based on the data archive, to inform and support clinical decisions and practice in critical care.

Public Health Relevance

Enormous amounts of data are routinely collected from patients in hospital intensive care units, but most of these data are not used to generate evidence or to discover new knowledge. This project will collect, organize, and publicly distribute detailed, de-identified, clinical and physiologic data from massive numbers of intensive care patients, and provide software tools to support the user community in exploring and mining the data. The database will catalyze and support a wide variety of biomedical engineering and clinical studies that will result in new understanding and patient-specific prognostic and therapeutic guidance for critical care.

National Institutes of Health (NIH)
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Luo, James
Massachusetts Institute of Technology
Engineering (All Types)
Biomed Engr/Col Engr/Engr Sta
United States