The Coronavirus Disease 2019 (COVID-19) pandemic has caught the world off guard, reshaping ways of life, the economy, and healthcare delivery. Data in electronic health records (EHRs) should be widely available to study COVID-19 but have not yet been effectively shared across clinical sites, with public health agencies, or with policy makers. There are several large national and international projects to build informatics infrastructure to analyze the EHR data of patients with COVID-19. However, aggregating data from multiple EHRs only works if the final results can be trusted. This means being able to go back to each site and talk to the people who know the data best, to understand the local clinical guidelines, coding practices, data quality problems, and other factors that affect the data. In March 2020, we launched an international effort called the Consortium for Clinical Characterization of COVID-19 by EHR (4CE). It brings together more than 100 informatics experts, statisticians, and ICU doctors from around the world. The novel aspect of 4CE is that we recognize the complexities of EHR data and the need to directly involve the local data experts, not only in the data collection, but also in the development of research questions and the data analyses. We try to move fast, believing that early intelligence is worth more than complete intelligence later. To do this, we avoid roadblocks that typically slow down informatics projects, such as building or installing new software, or the regulatory hurdles involved in sharing patient-level data. Instead, we ask participating sites to run analyses locally, using simple existing tools, like SQL, R, and Python scripts, and only share aggregate counts and statistics centrally with the rest of the 4CE consortium. We review and validate the data as a group, identify and fix data quality problems, and ask sites to repeat the analyses until everything is right.
Through multiple cycles of data verification, we iteratively clean up the data and gain confidence that the findings we are seeing are real. Because we can do this quickly, we go from research question to results in just a few weeks. This proposed project will "productize" the 4CE approach, through three Specific Aims: (1) Transition 4CE to "Phase 2", where sites will begin more complex local analyses. We will develop Phase 2 analysis scripts; update our data upload, validation, and visualization websites; and test the Phase 2 scripts at three sites before expanding to the rest of the consortium. (2) Demonstrate and evaluate 4CE through two use cases. We will refine and validate an algorithm for identifying COVID-19 patients with "severe" disease and use 4CE to characterize central nervous system complications in COVID-19. (3) Develop a plan for integrating with complementary efforts and long-term sustainability. As part of this, we will create a guide that shows sites how to use 4CE data extracts and quality checks to support other COVID-19 informatics projects, including the generation of OMOP files.
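The local-analysis pattern described above, in which each site reduces its patient-level data to aggregate counts and only those counts are shared, can be illustrated with a minimal Python sketch. This is not the 4CE analysis code. The record layout, the function names `site_aggregate` and `pool`, and the small-cell masking threshold `MIN_CELL_COUNT` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical threshold (an assumption, not taken from the 4CE protocol):
# counts below this value are masked before they leave the site.
MIN_CELL_COUNT = 10


def site_aggregate(patients, min_count=MIN_CELL_COUNT):
    """Run locally at a site: reduce patient-level rows to counts per group.

    Groups with fewer than min_count patients are reported as None.
    """
    counts = Counter(p["severity"] for p in patients)
    return {group: (n if n >= min_count else None) for group, n in counts.items()}


def pool(site_summaries):
    """Run centrally: sum unmasked counts and record how many sites masked each group."""
    totals, masked = Counter(), Counter()
    for summary in site_summaries:
        for group, n in summary.items():
            if n is None:
                masked[group] += 1
            else:
                totals[group] += n
    return dict(totals), dict(masked)


# Toy patient-level data; in practice this never leaves the site.
site_a = [{"severity": "severe"}] * 12 + [{"severity": "non-severe"}] * 30
site_b = [{"severity": "severe"}] * 3 + [{"severity": "non-severe"}] * 25

summary_a = site_aggregate(site_a)
summary_b = site_aggregate(site_b)
totals, masked = pool([summary_a, summary_b])
print(totals, masked)
```

Only `summary_a` and `summary_b` would be uploaded in this sketch. Tracking masked cells separately lets the central reviewers see where a pooled total is incomplete and go back to the site's data experts.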
Data in electronic health records (EHRs) should be widely available to study COVID-19 but have not yet been effectively shared across clinical sites, with public health agencies, or with policy makers. The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) addresses this problem by running analyses locally at more than 100 hospitals worldwide and sharing the aggregate results with the public through interactive data visualizations.