“For the man who practices surgery, there are two kinds of mortality—chance and intentional. Chance mortality is the kind which occurs unexpectedly… Intentional mortality is incurred by the chief surgeon when he attempts cases in which the condition is acknowledged to be grave”
—E. A. Codman1
Since their introduction by Dr. Ernest A. Codman in the early 20th century, morbidity and mortality (M&M) conferences have become standard practice in modern medicine.2–4 Many institutions now uphold the M&M tradition5 by presenting all complications and deaths, whereas others have applied new formats, for example, by mapping learning lessons to improvement aims or core competencies.6
Advances in “big data” and outcomes research offer new quality improvement (QI) tools to facilitate comparison of healthcare outcomes. For example, big data analyses have identified patient-specific risk factors for the development of disease; provider-specific performance metrics; and population-level risk-adjustment algorithms to compare complication rates across institutions and regions.
However, the application of big data and outcomes research to QI has not been fully realized. One area that may be particularly conducive to big data approaches is the M&M conference.
Consider the following 4 patients admitted to a hospital's surgical service:
- A healthy 21-year-old female presents with acute appendicitis, undergoes uncomplicated laparoscopic appendectomy, and is discharged the same day.
- A healthy 45-year-old male presents for elective inguinal hernia repair. Intraoperatively, an enterotomy is made and 3 days later, he becomes septic and requires urgent reoperation with bowel resection. Postoperatively he develops a pulmonary embolism and is placed on therapeutic anticoagulation. He is discharged from hospital on day 14.
- A 60-year-old female with atrial fibrillation presents to the emergency department with 10/10 abdominal pain and hemodynamic instability caused by acute mesenteric ischemia. She is taken emergently to the operating room where, upon induction of anesthesia, she suffers a cardiac arrest. Cardiopulmonary resuscitation is performed with return of spontaneous circulation. Arterial bypass and bowel resection are performed. After a challenging hospital course, she recovers to her baseline health.
- A 30-year-old male presents to the trauma bay after high-speed rollover motor vehicle accident, with profound shock caused by internal hemorrhage. Despite rapid initiation of massive transfusion protocol, the patient loses vital signs. Emergent thoracotomy is unsuccessful, and he is pronounced dead in the trauma bay.
In the examples above, 2 patients experienced poor outcomes (the second and fourth). Experts would consider the second patient's outcome “unexpected,” given the low complication rates of routine inguinal hernia repair; by contrast, they would describe the fourth patient's outcome as “expected,” given the severity of his presentation and the low likelihood of survival after ED thoracotomy. Similarly, the other 2 patients (the first and third) experienced successful outcomes; the third patient's return to baseline health could be considered “unexpected” in light of her significant risk of a poor outcome at the time of presentation, whereas the first patient's outcome was quite “expected,” given the routine nature of her presentation and procedure.
Practicing clinicians intuitively understand expected and unexpected outcomes and have likely experienced examples of each. These distinctions are referenced in passing through casual statements such as: “patient number 1 was routine,” “number 2 was very unlucky,” “it is a miracle that number 3 survived,” or “there was nothing we could have done for number 4.” For patients 1 and 3—successful outcomes—the discussion often stops there. However, patients 2 and 4—unsuccessful outcomes—are presented at M&M conferences as examples of failure.
A NEW PRINCIPLE: “OUTCOMES EXPECTEDNESS”
These clinical scenarios represent 4 archetypes of patient outcomes in healthcare: expected successes, unexpected failures, unexpected successes, and expected failures, respectively (Fig. 1). As described above, many patients’ outcomes can be categorized into these groups using clinical intuition. However, advances in risk-adjustment methods, including those derived from big data analyses, now make it possible to objectively (ie, statistically) and automatically (ie, electronically) categorize patient outcomes into these groups. By applying such methods, we can bring new levels of rigor to QI efforts, including M&M conferences.
A New Approach
In our theoretical model, each patient cared for within a surgical department would be categorized into 1 of these 4 groups before the M&M conference.
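As an illustration, the categorization step could be sketched as follows. This is a minimal sketch, not an implementation of any validated system: the risk scores attached to the 4 vignettes are invented for illustration, and the 1.4% cutoff borrows the Youden-derived mortality threshold discussed later in the text.

```python
# Hypothetical sketch: map each patient to one of the four
# outcomes-expectedness archetypes, given a patient-level predicted
# risk of an adverse outcome and a chosen expected/unexpected cutoff.

def categorize(predicted_risk, adverse_outcome, cutoff):
    """Assign an archetype from a (risk, outcome) pair.

    predicted_risk  -- predicted probability of the adverse outcome (0..1)
    adverse_outcome -- True if the adverse outcome actually occurred
    cutoff          -- threshold separating expected from unexpected
    """
    high_risk = predicted_risk >= cutoff
    if adverse_outcome:
        return "expected failure" if high_risk else "unexpected failure"
    return "unexpected success" if high_risk else "expected success"

# The four vignettes, with invented illustrative risk scores:
patients = [
    ("laparoscopic appendectomy", 0.001, False),
    ("inguinal hernia repair",    0.005, True),
    ("mesenteric ischemia",       0.60,  False),
    ("ED thoracotomy",            0.95,  True),
]
for name, risk, adverse in patients:
    print(name, "->", categorize(risk, adverse, cutoff=0.014))
```

Run on the vignettes, this reproduces the intuitive labels: expected success, unexpected failure, unexpected success, and expected failure, respectively.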
There are many benefits to this approach. First, informed decisions can be made about whom to present during M&M. Instead of aiming to review all failures, a program may choose to prioritize unexpected failures (Box 2), which are arguably richer in improvement lessons than expected failures (Box 4). This approach could shift the focus of M&M discussions away from assigning preventability, and instead towards identifying opportunities for improvement. We believe that this would improve the educational value of M&M, while deepening clinicians’ understanding of the clinical and systems issues that contribute to quality deficits.
Second, unexpected successes (Box 3) could be introduced into M&M conferences and studied more systematically. These cases are typically rich in learning lessons; however, they are deemphasized or altogether excluded from current M&M discussions. By studying what went right for these patients, we may find new patterns—because of patient factors, clinician/team factors, or systems factors—that consistently promote unexpectedly positive outcomes. Some of these lessons will in turn promote new QI efforts.
Additional benefits of this approach include: (1) improved efficiency of M&M through the application of objective data to the case-selection process, and anticipation of the educational and/or systems-improvement value of specific cases; (2) enhanced atmosphere of M&M by celebrating unexpected successes; and (3) creation of an evidence base to counter potential financial and reputational consequences for expected failures (Box 4).
How Do We Get There?
Two key tools will be needed to facilitate this process: (1) accurate, patient-level risk-adjustment models (ie, risk scores), and (2) an objective definition of the optimal cutoff point between expected and unexpected results.
Current patient-level risk-adjustment models are not perfect; however, they are improving rapidly with the help of outcomes research. Two particularly informative examples are the National Surgical Quality Improvement Program (NSQIP) variables “mortprob” and “morbprob,” which are patient-specific, procedure-adjusted, risk-adjusted predictions of 30-day mortality and morbidity, respectively. These risk scores are assigned to each patient captured by NSQIP.
Using these or similar risk scores, we can begin to define an objective threshold, or “cutoff point,” to differentiate between expected and unexpected outcomes. We can then use statistical approaches to assess the performance of this threshold in distinguishing true signals (ie, unexpected outcomes) from noise (ie, expected outcomes). Unfortunately, there is no standard approach for defining such a cutoff. One simple approach would be to set the cutoff at a 50% probability of an adverse outcome; the interpretation would then be “more or less likely than not.” Another approach would be to set the cutoff at the average rate of the adverse outcome in the population of interest; the interpretation would then be “above or below average risk.” Alternatively, we can draw from relevant theoretical models and statistical process control methodologies outside of healthcare. For example, signal detection theory offers validated methods to mathematically establish an optimal threshold to distinguish signal from noise; crossing the threshold triggers the signal, which can then be investigated to determine its cause.7
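The two simple cutoff definitions above can be computed directly. The sketch below uses a small list of invented risk scores; any real application would use validated, patient-level predictions such as those described earlier.

```python
# Invented predicted risks of an adverse outcome for a toy cohort.
risks = [0.001, 0.005, 0.014, 0.02, 0.60, 0.95]

majority_cutoff = 0.5                      # "more or less likely than not"
average_cutoff = sum(risks) / len(risks)   # "above or below average risk"

# Under either definition, an adverse outcome in a patient whose risk
# fell below the cutoff would be flagged as unexpected.
below_average = [r < average_cutoff for r in risks]
print(round(average_cutoff, 3))
```

Note that the two definitions can disagree substantially: in this toy cohort the average-risk cutoff (about 0.265) sits far below the 50% majority cutoff, so the choice of definition changes which adverse outcomes count as unexpected.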
One tangible application of signal detection theory is Youden's index8 (ie, the J-statistic), a statistical tool in use since the 1950s that defines the optimal cutoff point (c*) for the differentiating ability of a test using the sensitivity and specificity of that test. When applied to a database of surgical patients containing patient-specific mortality risks and postoperative outcomes, Youden's index can be used to calculate the optimal threshold for expected versus unexpected deaths. For example, using the 2011–2012 national American College of Surgeons-NSQIP (ACS-NSQIP) database of general and vascular surgery patients (n = 601,609), the Youden index-defined mortality risk threshold would be 1.4%. Patients below this threshold (ie, 30-day mortality risk <1.4%) who died within 30 days of surgery would trigger the signal and be categorized as unexpected deaths. Raising the mortality risk threshold (ie, setting the cutoff point >1.4%) would increase the sensitivity of the system, triggering more signals (ie, classifying more cases as unexpected deaths), but at the risk of decreasing specificity and creating too many “false positives” (ie, deaths that turn out to be expected and unavoidable upon further investigation). Conversely, lowering the mortality risk threshold would increase the specificity of the system, thus minimizing false positives, but at the expense of potentially missing true instances of unexpected deaths.
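The Youden cutoff search above can be sketched in a few lines. This is an illustrative toy, not the ACS-NSQIP analysis: risks are drawn uniformly and each death is simulated as a chance event with that risk, whereas a real application would pair a score such as NSQIP “mortprob” with observed 30-day mortality.

```python
# Hedged sketch of a Youden's J grid search on synthetic data.
import random

random.seed(0)
cohort = []
for _ in range(5000):
    risk = random.random()         # predicted 30-day mortality risk
    died = random.random() < risk  # simulated 30-day outcome
    cohort.append((risk, died))

def youden_optimal_cutoff(data, grid_steps=100):
    """Return (c*, J*) maximizing J = sensitivity + specificity - 1,
    where the 'test' labels a case an expected death when the
    predicted risk is >= c."""
    deaths    = [r for r, died in data if died]
    survivors = [r for r, died in data if not died]
    best_c, best_j = 0.0, -1.0
    for i in range(grid_steps + 1):
        c = i / grid_steps
        sens = sum(r >= c for r in deaths) / len(deaths)
        spec = sum(r < c for r in survivors) / len(survivors)
        j = sens + spec - 1.0
        if j > best_j:
            best_c, best_j = c, j
    return best_c, best_j

c_star, j_star = youden_optimal_cutoff(cohort)
print(c_star, j_star)
```

With uniformly distributed risks and outcomes simulated from those risks, the optimal cutoff lands near 0.5; with real surgical data, where low-risk patients dominate, the same procedure yields a far lower threshold, such as the 1.4% figure cited above.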
Other decision-support strategies to facilitate accurate identification of expected and unexpected outcomes will be valuable as well. Sophisticated approaches have been employed in fields as diverse as operations research, finance, and engineering to answer similar questions, and may serve as valuable adjuncts to inform our understanding of outcomes expectedness. The specific methods and metrics used to apply outcomes expectedness in clinical practice must be further investigated and debated.
Outcomes expectedness is an intuitive theoretical framework with which patient outcomes can be studied systematically and objectively. By linking outcomes research and big data analyses to QI, we may together reinvent the modern-day M&M conference and other QI efforts as well. QI expert Dr. Donald Berwick recently noted that “end-results information (ie, outcomes data), although necessary for improvement, is not sufficient.”9 What we do with the data is what ultimately determines its utility.
1. Codman EA. A study in hospital efficiency: as demonstrated by the case report of the first five years of a private hospital. Boston: Th Todd Co; 1918.
2. Rosenfeld JC. Using the morbidity and mortality conference to teach and assess the ACGME general competencies. Curr Surg
3. Harbison SP, Regehr G. Faculty and resident opinions regarding the role of morbidity and mortality conference. Am J Surg
4. Accreditation Council for Graduate Medical Education. Program Requirements for Graduate Medical Education in General Surgery. ACGME-approved: October 1, 2011; effective: July 1, 2012. ACGME-approved focused revision with categorization: September 29, 2013; effective: July 1, 2014.
5. Hutter MM, Rowell KS, Devaney LA, et al. Identification of surgical complications and deaths: an assessment of the traditional surgical morbidity and mortality conference compared with the American College of Surgeons-National Surgical Quality Improvement Program. J Am Coll Surg
6. Bingham JW, Quinn DC, Richardson MG, et al. Using a healthcare matrix to assess patient care in terms of aims for improvement and core competencies. Jt Comm J Qual Patient Saf
7. Bottle A, Aylin P. Predicting the false alarm rate in multi-institution mortality monitoring. J Oper Res Soc
8. Youden WJ. Index for rating diagnostic tests. Cancer
9. Berwick DM. Measuring surgical outcomes for improvement: was Codman wrong? JAMA