Imagine…(A Common Language for ICU Data Inquiry and Analysis)

Kaplan, Lewis J. MD, FACS, FCCM; Cecconi, Maurizio MD; Bailey, Heatherlee MD, FCCM; Kesecioglu, Jozef MD, PhD

doi: 10.1097/CCM.0000000000004166

Patient data forms the core of clinical inquiry and is the foundation on which trial outcomes, evidence-based medicine, benchmarking, quality improvement, and guidelines rest. Sources of patient data span in- and outpatient domains and are filtered across a variety of health records, not all of which are electronic, and few of which are interoperable (1). Worse, databases that house administrative or clinical data often employ disparate structures and definitions. Entering new data after updating aged definitions may paralyze a database’s ability to compare new entries against a prior baseline. It is in these spaces that one may imagine a future that leverages common data definitions and the integration of data scientists into critical care teams and workflow. This work represents the shared vision of the leaderships of the European Society of Intensive Care Medicine and the Society of Critical Care Medicine in exploring how common data definitions, data science, and data sharing may impact clinical care, quality improvement, and scientific inquiry in critical care.

Patient data continuously informs clinicians to direct care. Increasingly, databases are mined for patient level data to run in silico instead of in vivo analyses (2). However, these “trials” are run in curated datasets whose architecture and definitions may be so different that data cannot be combined to create larger datasets. For a given patient, care in the prehospital setting (especially after injury) may use an established data dictionary and database that is different from the one used during in-hospital, rehabilitation, or after-care (3). Although diagnoses may be differently represented, the patient remains the same. The totality of that data may therefore be lost from clinical trials, guidelines, and practice program development. Accordingly, patient data cannot be readily used outside of specific and location-based segments of care.

Acute kidney injury (AKI) provides an apt example of the impact of different definitions for the same condition. Older databases defined AKI as acute renal failure (ARF) based on a serum creatinine greater than or equal to 2.0 mg% (176.8 micromol/L) (4). Serum creatinine increases above baseline that failed to reach 2.0 mg% were instead termed acute renal insufficiency. The old definition failed to identify patients whose kidneys suffered injury with important consequences. Both definitions have been replaced by AKI, which is further refined into three stages (5). Therefore, in a single database, one may find patients with the same code for ARF but who have different creatinine triggers, hospital and ICU lengths of stay, and recovery trajectories. Without careful database reconfiguration, disparate definitions may support erroneous guideline generation as well as potentially avoidable medical error. These disparities underscore the need to improve data fidelity and usability where multiple definitions coexist or have changed over time.
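
The gap between the two definitions can be sketched in a few lines of Python. This is only an illustration, not a clinical tool: the function names are hypothetical, the legacy rule follows the thresholds quoted above, and the staging function is a deliberately simplified reading of the creatinine criteria in the KDIGO 2012 guideline (5) that ignores time windows, urine output, and renal replacement therapy.

```python
MG_DL_TO_UMOL_L = 88.4  # serum creatinine: 1 mg/dL (mg%) is about 88.4 micromol/L


def legacy_label(baseline_mg_dl, current_mg_dl):
    """Older scheme described in the text (hypothetical helper)."""
    if current_mg_dl >= 2.0:
        return "ARF"
    if current_mg_dl > baseline_mg_dl:
        return "acute renal insufficiency"
    return None


def kdigo_stage_simplified(baseline_mg_dl, current_mg_dl):
    """Simplified creatinine-only staging; an assumption-laden sketch.

    Omitted on purpose: the 48-hour and 7-day windows, urine output
    criteria, and renal replacement therapy.
    """
    ratio = current_mg_dl / baseline_mg_dl
    rise = current_mg_dl - baseline_mg_dl
    if not (ratio >= 1.5 or rise >= 0.3):
        return 0  # creatinine criteria for AKI not met
    if ratio >= 3.0 or current_mg_dl >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    return 1


if __name__ == "__main__":
    # (baseline, current) serum creatinine in mg/dL; made-up values
    for baseline, current in [(0.8, 1.6), (1.0, 2.1)]:
        print(baseline, current,
              legacy_label(baseline, current),
              kdigo_stage_simplified(baseline, current))
```

In the first made-up pair the creatinine doubles yet stays below 2.0 mg%, so the legacy rule labels it acute renal insufficiency while the simplified staging returns stage 2. Two records for similar patients could therefore carry different codes depending on when they were entered.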

We propose three strategies to harmonize data across different sites and databases. The first is to establish a single data dictionary so that regardless of care location, only one definition is used throughout the health record. Uniformity would benefit clinical inquiry and clinical trials. Second, a minimum data set should be articulated for each segment of care, reflecting a longitudinal approach to analysis (6). Third, the data dictionary definitions should map to the unique codes of the International Classification of Diseases, the grouping of medical diagnoses that ties to billing and reimbursement (7). Imagine a common ICU data dictionary and a minimum ICU data set that ICUs around the world could use to share deidentified patient data. The same data, entered just once, could feed different databases for internal quality improvement, benchmarking, network monitoring, or to populate observational studies. Machine learning algorithms could use these data structures for real-time analysis and inform decision support systems, enhancing patient and family care (8,9). This process may be augmented by the rapid inclusion of data science and data scientists in clinical medicine.
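
As a rough sketch of how the three strategies fit together, the Python below pairs a one-entry data dictionary with a minimum data set check and a mapping to a diagnosis code. Every name in it is hypothetical, including the field names, the `DATA_DICTIONARY` structure, and the choice of required fields. The ICD-10 code N17.9 (acute kidney failure, unspecified) is used only as an illustration, and no existing registry or standard is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    concept: str      # single shared name for the condition
    definition: str   # the one definition used at every care location
    icd10_code: str   # diagnosis code the definition maps to


# Strategy 1: one dictionary, one definition per concept (illustrative entry).
DATA_DICTIONARY = {
    "acute_kidney_injury": DictionaryEntry(
        concept="acute_kidney_injury",
        definition="KDIGO 2012 AKI criteria",
        icd10_code="N17.9",
    ),
}

# Strategy 2: a minimum data set for the ICU segment of care (assumed fields).
MINIMUM_ICU_DATA_SET = {"patient_key", "admission_time", "diagnosis_concept"}


def missing_fields(record):
    """Return the minimum-data-set fields absent from a record, sorted."""
    return sorted(MINIMUM_ICU_DATA_SET - set(record))


def to_shared_record(record):
    """Build a shareable record, adding the mapped code (strategy 3)."""
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"record lacks required fields: {missing}")
    entry = DATA_DICTIONARY[record["diagnosis_concept"]]
    return {
        "patient_key": record["patient_key"],  # assumed already deidentified
        "admission_time": record["admission_time"],
        "diagnosis_concept": entry.concept,
        "icd10_code": entry.icd10_code,
    }


if __name__ == "__main__":
    example = {
        "patient_key": "p-001",
        "admission_time": "2019-05-19T08:00:00Z",
        "diagnosis_concept": "acute_kidney_injury",
    }
    print(to_shared_record(example))
```

Because the record is validated and coded once at entry, the same output could in principle be handed to a quality improvement database, a benchmarking registry, or an observational study without re-keying.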

Data scientists perform much more than data analysis (Fig. 1). They craft data architecture, generate code, and help glean answers to clinically relevant questions from vast data arrays (10). They direct database construction to facilitate inquiry in a way that supports interoperability while creatively solving existing problems. Establishing dual codes for a diagnosis that informs different data warehouses is not a solution apparent to most clinicians, but it might address the issues with AKI and ARF and provide an alternative to working with raw data. The volume of raw data precludes its use in a minimum data set but would be ideal for machine learning-based analysis (11). The ability to conceive of data in a different format, and with different connectivity, underscores the value of a data scientist as an essential member of the clinical team. Accordingly, we could leverage their skills to build an ICU data warehouse using a globally shared vision (12).

Data sharing will support rapid validation of new algorithms. Algorithms developed in one hospital could be tested in another using an identical data architecture. This would massively increase the rate at which we discover and assess new treatment bundles and guidelines such as those of the Surviving Sepsis Campaign (13). Shared analyses of patient flow, care, and outcomes after disaster management could be combined across sites to generate data for development of best practices and improve future disaster planning.

Accordingly, we could envision an ICU database that incorporates patient level information from millions of patients instead of the much smaller databases that are commonly explored by clinicians partnered with data scientists to answer clinical questions in novel fashions (14). The power of that level of “big data” to run adaptive trials in silico is unprecedented in clinical medicine. Power issues for post hoc subgroup analyses could be conquered using a global database. Enhanced patient matching could be a reality, as well as more precisely focused trial design, enhanced enrollment rates, and study completion.

Big data, data scientists, and a shared community that includes patient digital data will shape the future of critical care. Such an approach removes barriers to clinical inquiry, research, and patient care. The success of this digital revolution relies on choosing to imagine a different way to learn about, share, and deliver care. In this way, evoking John Lennon’s timeless words, there would truly be “no countries” separating critical care practice and discovery. There may be no more audacious goal in advancing critical care medicine than speaking a common language across a global collaborative.

Figure 1. This graphic represents a process map of how data scientists interface with patient-level data and clinicians to facilitate database-anchored data curation and inquiry.


1. Sittig DF, Wright A. What makes an EHR “open” or interoperable? J Am Med Inform Assoc 2015; 22:1099–1101
2. Bajard A, Chabaud S, Cornu C, et al.; CRESim & Epi-CRESim study groups: An in silico approach helped to identify the best experimental design, population, and outcome for future randomized clinical trials. J Clin Epidemiol 2016; 69:125–136
3. Hornor MA, Hoeft C, Nathens AB. Quality benchmarking in trauma: From the NTDB to TQIP. Curr Trauma Rep 2018; 4:160
4. Mehta R, Bihorac A, Selby NM, et al.; Acute Dialysis Quality Initiative (ADQI) Consensus Group: Establishing a continuum of acute kidney injury - tracing AKI using data source linkage and long-term follow-up: Workgroup Statements from the 15th ADQI Consensus Conference. Can J Kidney Health Dis 2016; 3:13
5. Kidney Disease: Improving Global Outcomes: KDIGO 2012 AKI Guideline. 2012. Available at: Accessed May 19, 2019
6. Tirkkonen J, Ylä-Mattila J, Olkkola KT, et al. Factors associated with delayed activation of medical emergency team and excess mortality: An Utstein-style analysis. Resuscitation 2013; 84:173–178
7. World Health Organization: International Classification of Diseases. Available at: Accessed May 19, 2019
8. Obermeyer Z, Emanuel EJ. Predicting the future - big data, machine learning, and clinical medicine. N Engl J Med 2016; 375:1216–1219
9. Johnson AE, Ghassemi MM, Nemati S, et al. Machine learning and decision support in critical care. Proc IEEE Inst Electr Electron Eng 2016; 104:444–466
10. Darcy AM, Louie AK, Roberts LW. Machine learning and the profession of medicine. JAMA 2016; 315:551–552
11. Li Y, Wu FX, Ngom A. A review on machine learning principles for multi-view biological data integration. Brief Bioinform 2018; 19:325–340
12. Lin YL, Guerguerian AM, Tomasi J, et al. Usability of data integration and visualization software for multidisciplinary pediatric intensive care: A human factors approach to assessing technology. BMC Med Inform Decis Mak 2017; 17:122
13. Levy MM, Evans LE, Rhodes A. The surviving sepsis campaign bundle: 2018 Update. Crit Care Med 2018; 46:997–1000
14. Nunez Reiz A, Martinez Sagasti F, Álvarez González M, et al.; Organizing Committee Of The 2017 Madrid Critical Care Datathon: Big data and machine learning in critical care: Opportunities for collaborative research. Med Intensiva 2019; 43:52–57

clinical trial; critical care; data science; database; electronic health record; quality

Copyright © 2020 by the Society of Critical Care Medicine and the European Society of Intensive Care Medicine. All Rights Reserved.