The search for zero error rates is doomed from the start. —DONALD M. BERWICK1
Stimulated by the Institute of Medicine report To Err is Human, published in 1999, the health care industry is rapidly mobilizing to address the problem of preventable errors in medicine.2 Although the direction in which we need to move is clear, the ultimate objective needs better definition. If we wish to reduce errors in medical care, what is the appropriate goal?
At the 1998 Annenberg Conference on Patient Safety, Nancy W. Dickey, MD, past president of the American Medical Association, asserted that “the only acceptable error rate is zero.”3 Gordon M. Sprenger, CEO of the Allina Health System and chair of the American Hospital Association Board of Trustees, reiterated this theme at the 2001 Annenberg safety colloquium: “Let's be absolutely clear on this: The goal of the patient safety movement must be to eliminate all errors. This is like climbing Mount Everest, but it must be our goal and it can be done.”4
Diagnostic errors make up a substantial and costly fraction of all medical errors. In the Harvard Medical Practice Study of hospitals in New York State, diagnostic errors represented the second largest cause of adverse events.5 Similarly, diagnostic errors are the second leading cause of malpractice suits against hospitals.6 A recent study of autopsy findings identified diagnostic discrepancies in 20% of cases, and the authors estimated that in almost half of these cases knowledge of the correct diagnosis would have changed the treatment plan.7
Given their enormous human and economic impact, the complete elimination of diagnostic errors would seem to be an appropriate and worthwhile goal. The purpose of this review is to consider the available evidence regarding the feasibility of this endeavor. We approach this question by first identifying three major types of diagnostic errors. For each type, we then consider possible approaches to decreasing the incidence of diagnostic errors. Finally, we examine whether there are any practical or theoretical limits that would preclude us from eliminating diagnostic errors altogether.
TYPES OF DIAGNOSTIC ERRORS
The nature of clinical decision making has been clarified over the past several decades, and a variety of systems have been proposed to classify diagnostic errors.8,9,10,11,12,13,14,15 For the purposes of this discussion, we postulate that every diagnostic error can be assigned to one of three broad etiologic categories (Table 1):
* “No-fault errors,” following Kassirer and Kopelman, include cases where the illness is silent, or masked, or presents in such an atypical fashion that divining the correct diagnosis, with the current state of medical knowledge, would not be expected.13 Other examples would include the rare condition misdiagnosed as something more common, and the diagnosis missed because the patient does not present his or her symptoms clearly. A diagnosis missed or delayed because of patient noncompliance might also be viewed as a no-fault error.
* System errors reflect latent flaws in the health care system. Included in this category are weak policies, poor coordination of care, inadequate training or supervision, defective communication, and the many system factors that detract from optimal working conditions, such as stress, fatigue, distractions, and excessive workload. These problems can affect all the diagnosticians in the involved health care system.
* Cognitive errors are those in which the problem is inadequate knowledge or faulty data gathering, inaccurate clinical reasoning, or faulty verification.9,13 Examples include flawed perception, faulty logic, falling prey to biased heuristics, and settling on a final diagnosis too early. These are all errors on the part of an individual diagnostician.
Each of these three categories of diagnostic errors carries its own prognosis for error reduction, and we consider them each in turn.
Lucky is the patient (and his or her physician) whose disease presents in classical textbook fashion. For many conditions the classical presentation is the exception and the spectrum of possible disease presentations is broad. An example is the classical “thunderclap” headache of subarachnoid hemorrhage. This prototypical finding is only present in 20–50% of patients, and failure to appreciate the various other ways that such patients may present accounts for a substantial fraction of the patients with subarachnoid hemorrhage in whom the diagnosis is missed.16 Analogously, patients over 80 years old are more likely to have atypical presentations of myocardial infarction than the typical chest pain that is the hallmark of a heart attack in younger patients.17
Every week, the New England Journal of Medicine entertains and instructs clinicians with the “Case Records of the Massachusetts General Hospital.” In this venerable exercise, a case is presented to an expert clinician, who is challenged to identify the “correct” diagnosis. Many of the factors that contribute to diagnostic errors in everyday life have been carefully eliminated in these vignettes: There are no lab errors, no chance to fail to perceive the crucial finding, the clinician has the luxury of time to research all the relevant possibilities, and the diagnosis is made in the absence of everyday stress, fatigue, and distractions. Even in this idealized setting, however, the correct diagnosis is often missed. In the cases presented from 1989 to 1996, the error rate among all case discussants was 25% (excluding cases analyzed by the diagnostically gifted physicians from the presenting hospital, who had an error rate of only 5%!).18 In most cases these errors reflect the unusual ways in which even common conditions can manifest, and in some cases the illness being discussed may even represent a new disease, or a new variant. These no-fault errors are probably over-represented in cases chosen to be diagnostic challenges, but the ability of diseases to remain silent or present in atypical fashion is encountered in every clinical setting.
Can we reduce no-fault diagnostic errors? To the extent that no-fault diagnostic errors represent shortcomings of medical knowledge or testing, it is a virtual certainty that these errors will decrease over the long term as knowledge advances. Before the appreciation of Lyme disease as a specific entity and our ability to specifically test for the condition, all of these cases were misdiagnosed as, for example, atypical rheumatoid arthritis. The evolving ability to test for disease at a point when the clinical manifestations are minimal or absent is another way in which no-fault errors will be reduced. Consider the ability to detect pre-symptomatic cancer with appropriate screening tests, or the ability to detect silent hyperparathyroidism from the routine measurement of serum calcium. These examples illustrate the concept that advances in medical knowledge and disease detection will inevitably reduce the number of no-fault errors. This process may even accelerate as we become increasingly able to use genetic markers to detect disease predispositions before there are any clinical manifestations.
Can we eliminate no-fault diagnostic errors? As a practical matter, it seems unlikely that we will ever have a complete enough armamentarium to test for every possible disease. There will always be patients whose disease exists in a silent, preclinical stage, eluding detection. In other patients, the disease may be clinically manifest, but present in such an atypical fashion that the true diagnosis is missed. A final likelihood is that new diseases or new pathogens, or new side effects of yet-to-be invented medications, will emerge over time. The first patients to develop these novel entities will be misdiagnosed until the new syndromes are defined and characterized.
Other immortal no-fault errors are those arising from the use of normative approaches to choose the most likely diagnosis. For example, when faced with uncertainty, the normative approach suggests we should pick as a working diagnosis the entity with the highest likelihood. Averaged over all diagnoses this will produce the highest number of correct ones.19 As emphasized by Arkes, however, using the normative approach guarantees that, with some regularity, the clinician will choose the most likely diagnosis instead of the correct one.3 It is inevitable that a rarer condition will sometimes exist in cases where we suspect something more common. Although technically this should be considered a cognitive error, it is inappropriate to fault the clinician whose diagnosis is wrong as a result. As Arkes concludes, “There is no solution to the problem, which is an unavoidable consequence of the probabilistic nature of the relationship between disease and symptoms.”3
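The arithmetic behind this point can be sketched briefly. The following is a minimal illustration, not drawn from the cited sources: the diagnosis labels and probabilities are hypothetical, chosen only to show that always selecting the most likely entity still leaves an expected error rate equal to one minus that entity's probability.

```python
def most_likely_diagnosis(probabilities):
    """Return (diagnosis, probability) for the highest-probability candidate."""
    return max(probabilities.items(), key=lambda item: item[1])


# Hypothetical probabilities for one clinical presentation (illustrative only).
differential = {
    "common condition": 0.80,
    "less common condition": 0.15,
    "rare condition": 0.05,
}

choice, p_correct = most_likely_diagnosis(differential)

# Even the best available policy is wrong whenever one of the other
# candidates is the true diagnosis.
expected_error_rate = 1.0 - p_correct

print(f"Working diagnosis: {choice}")
print(f"Expected error rate with the normative choice: {expected_error_rate:.0%}")
```

Under these assumed numbers, any other choice would be wrong more often, yet the normative choice is still expected to miss the true diagnosis in about one presentation out of five.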
Several patient-specific factors contribute to the impossibility of eliminating no-fault diagnostic errors. One of these is patient noncompliance. Although many interventions are under way to improve compliance with medical care and participation in screening programs, patients' compliance is never guaranteed. Their ultimate participation, or lack of participation, in medical care may be influenced by busy personal schedules, religious beliefs, attraction to alternative medicine, or distrust.20
A second patient-related factor contributing to no-fault errors is the inherent variability in how patients perceive and describe their states of health or their active symptoms. The information a patient gives may be confusing, contradictory, or inaccurate.21 Without understanding each patient's personal background, context, and belief systems, it may be impossible for the physician to accurately comprehend the true state of affairs.
This problem illustrates the philosophical argument of “necessary fallibility” presented by Gorovitz and MacIntyre.22 This concept applies to all fields of cognitive endeavor, including clinical reasoning in medicine, and suggests that ultimately the state of the world is too complex to be fully knowable: “It is inherent in the nature of medical practice that error is unavoidable, not merely because of the limitations of human knowledge or even the limits of human intellect, but rather because of the fundamental epistemological feature of a science of particulars.” Science can never predict the exact course of a hurricane because of the infinitely many interacting environmental and topographical attributes.23 Similarly, medical diagnosticians will be forever challenged by the subtle and unknowable interplay of variables (in the disease agent, in the host response, in the environment, in how the patient describes his or her symptoms, in testing, and even in the physician's powers of observation) that determine how a disease will present itself and be perceived by the clinician. Kennedy has argued that the state of uncertainty surrounding clinical decision making is so profound and pervasive that it may be inappropriate to judge the quality of medical diagnosis using the usual standards of rational decision making.24
The prevailing paradigm acknowledges that error in medical care has two distinct roots: At the “sharp end” is the individual provider who interacts with the patient and makes the mistake. At the “blunt end” are the latent flaws in the health care system that provide the setting, the framework, and the predisposition for the error to occur.25,26 Blunt-end factors include the system's organizational structure, culture, policies and procedures, the resources provided, the ground rules for communication and interaction, and performance detractors such as excess provider workload.
Can we reduce diagnostic errors related to system issues? Reflecting work by Reason,25,27 Leape,28,29 and others,30,31 the dominant role of system factors has assumed center stage in both understanding and correcting errors in medicine. As summarized by Bogner, “a systems approach is necessary to effectively address human error in medicine,”32 and this theme was repeatedly endorsed in the Institute of Medicine report, To Err is Human.2 Compared with error-improvement strategies that focus on individual providers, system-level changes have the advantage of potentially decreasing error rates for all involved providers, and over extended periods of time.33
Laboratory errors provide an example of how powerful system interventions can be in reducing diagnostic errors. The scope and accuracy of medical diagnosis increased dramatically in the 20th century in parallel with the emergence of laboratory testing and other diagnostic tests, such as X-rays and electrocardiograms. In the beginning, however, the accuracy and reliability of medical testing varied widely. A survey in 1949 of 18 leading clinical laboratories in Connecticut found that over one third of the clinical lab results were unacceptable. Similarly, a mid-century national survey by the Centers for Disease Control and Prevention estimated that more than a fourth of all laboratory testing results nationwide were unacceptable.34 This problem eventually resulted in standardization and regulation, inspection, and insistence on quality control. Although lab errors are still with us, they are now rare as a result of such changes.
Diagnostic errors related to delays are common and represent a large area where system interventions could be effective. Delayed diagnosis often reflects inefficiency in diagnostic evaluation, suboptimal coordination of care, or lack of effective communication. These are all system factors that can be optimized through attention to system design and performance.
Can we eliminate diagnostic errors related to system issues? Despite great hope for reducing diagnostic errors related to system-dependent factors, it will not be possible to totally eliminate system-related errors. At least four factors support this conclusion:
* System improvements degrade over time. Permanent improvement remains the ideal, but new policies are eventually forgotten, organizational changes vary with the assigned staff, and the enthusiasm for improvement wanes.
* The “fix,” though correcting the old problem, may introduce entirely new opportunities for error.
* Systems must necessarily evolve in step with the evolution of health care technology and management. Systems will develop and mature along a learning curve, and system repairs will always, to some extent, lag behind and fall short of achieving perfection.
* Whereas the ideal fix would eliminate the chance for errors, in practice we often encounter tradeoffs: the opportunity for errors is reduced in one system, but increased in another. A current example is the movement to limit the number of hours resident trainees can work without rest or sleep. Although limits on work hours may decrease errors related to fatigue, new problems may arise from the inevitable hand-offs created by new coverage systems and new problems of coordinating care when the physician who knows the patient best is now present only eight to 12 hours of the 24-hour work day.35,36 A related weakness has been pointed out by Perrow:
Fixes, including safety devices, sometimes create new accidents, and quite often merely allow those in charge to run the system faster, or in worse weather, or with bigger explosives.37
Signal detection theory illustrates the inevitability of tradeoffs. Originally introduced to describe the perceptual performance of radar operators, signal detection theory can describe any situation in which a yes-no decision is made. For example, the radiologist must decide whether a chest X-ray shows a tumor or is normal. Problems arise, however, for tumors that are difficult to appreciate, and when chance confluences of normal shadows can simulate a tumor when none is there. The radiologist must choose a threshold beyond which the report will indicate a tumor is present. With a lower threshold, the radiologist will have a higher sensitivity in detecting tumors, but at the expense of more false positives. With a more stringent threshold, false alarms will be minimized, but at the expense of missing some tumors.
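The threshold tradeoff can be made concrete with a small sketch. It assumes the common equal-variance Gaussian form of signal detection theory; the "suspicion score," the two distributions, and the thresholds are hypothetical values chosen for illustration, not estimates from any radiology study.

```python
from statistics import NormalDist

# Assumed model: each film yields a "suspicion score". Scores for normal
# films and tumor films overlap, so no threshold separates them perfectly.
NORMAL_FILMS = NormalDist(mu=0.0, sigma=1.0)
TUMOR_FILMS = NormalDist(mu=1.5, sigma=1.0)


def operating_point(threshold):
    """Return (sensitivity, false_positive_rate) when scores above the
    threshold are reported as 'tumor present'."""
    sensitivity = 1.0 - TUMOR_FILMS.cdf(threshold)
    false_positive_rate = 1.0 - NORMAL_FILMS.cdf(threshold)
    return sensitivity, false_positive_rate


for threshold in (0.0, 0.75, 1.5):
    sens, fpr = operating_point(threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  "
          f"false-positive rate={fpr:.2f}")
```

In this sketch, raising the threshold lowers the false-positive rate and the sensitivity together; no setting drives both kinds of error to zero as long as the two distributions overlap.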
Health system planners need to seek the right balance point, one that reduces errors optimally across the whole system, taking into account the tradeoffs that inevitably arise. Consider the need for communication between the emergency room and the admitting ward: Too little communication risks poor coordination of care, and too much would bog down both areas, leading to other errors. Physician “alerts” of abnormal labs are another example: If there are too few “alerts” some lab abnormalities may be missed, but if there are too many, the “alert” loses its functional significance. The difficulty of optimally allocating medical resources is a final example: Fixing a system problem in one area should not consume so many resources that other areas become vulnerable. The existence of errors may sometimes indicate a need for change, or may just suggest the need to re-evaluate the balance point of tradeoffs that were used to set the initial policy.
Perception. Diagnosis begins with perception. The physician must identify the physical findings, the pathologist must recognize the abnormalities of histology, and the radiologist must perceive the differences between normal and abnormal densities. The available evidence suggests that even in this first stage of the diagnostic process, seemingly the simplest and most straightforward, errors occur at non-trivial rates. For example, Berlin has summarized the studies of diagnostic errors in radiology, finding rates that range from 4% in clinical practice series (large numbers of normal films) to 30% in prospective studies incorporating larger numbers of abnormal films.38 In 80% of these cases, abnormal details were not perceived, and in 20%, abnormal details were identified but misinterpreted.39 Expertise in the type of visual diagnosis used by radiologists seems to involve an interaction of two distinct components: visual perception and domain-specific knowledge of both the normal expected findings and all possible abnormal findings.40 Variability in the physical examination is a related problem that detracts from the accuracy of diagnosis.41
Hypothesis generation. The initial information base often cues strongly matching candidate diagnoses, or a likely clinical framework for analysis. If this process occurs without much deliberate effort, other reasonable possibilities may not be similarly generated. An example is the patient who presents to the emergency department with chest pain from a dissecting aneurysm. The clinician may mistakenly conclude that the patient has pain related to myocardial ischemia and miss the diagnosis of dissecting aneurysm because myocardial ischemia is much more common, and therefore more “available” in memory.
Medical diagnosis is a specialized example of decision making under uncertainty. In familiar contexts, clinicians make decisions without much conscious deliberation, and medical experts routinely practice in this fashion. Clinicians typically use a variety of heuristics, or rules of thumb, for efficiently arriving at decisions in the face of limited time or data.42,43,44 For example, diagnoses are established using heuristics based on representativeness, availability, or extrapolation. The power of heuristics is enormous, allowing clinicians to navigate the diagnostic challenges of everyday life and make effective decisions, usually accurate, in real time when arduous working out of probabilities is not possible. Heuristic solutions free up cognitive resources so that they can be applied toward other demands. The price for using these powerful tools, however, is predictable error reflecting the inherent biases associated with each of these heuristics.42
Data interpretation. The probability of the initial hypothesis is adjusted upwards or downwards using test results to calculate a new probability using Bayes' theorem.19 Unfortunately, few clinicians are skilled in using Bayes' theorem, and in practice it is probably more common for tests to be interpreted without taking into account the characteristics (sensitivity and specificity) of the test itself.
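As an illustration of the calculation described here, the following sketch applies Bayes' theorem to a single test result. The function name and the example values (a 10% pretest probability and a test with 90% sensitivity and 90% specificity) are assumptions made for illustration, not figures from the cited literature.

```python
def post_test_probability(pretest, sensitivity, specificity, positive_result=True):
    """Update the probability of disease after one test result (Bayes' theorem)."""
    if positive_result:
        true_positives = sensitivity * pretest
        false_positives = (1.0 - specificity) * (1.0 - pretest)
        return true_positives / (true_positives + false_positives)
    false_negatives = (1.0 - sensitivity) * pretest
    true_negatives = specificity * (1.0 - pretest)
    return false_negatives / (false_negatives + true_negatives)


# Hypothetical example values.
after_positive = post_test_probability(0.10, 0.90, 0.90, positive_result=True)
after_negative = post_test_probability(0.10, 0.90, 0.90, positive_result=False)

print(f"After a positive result: {after_positive:.2f}")
print(f"After a negative result: {after_negative:.3f}")
```

With these assumed values, a positive result raises the probability of disease only to about one half, which is the kind of outcome that is easy to misjudge when a test is interpreted without reference to its sensitivity, its specificity, and the pretest probability.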
Verification. Clinicians, like all decision makers, strongly favor their initial hypotheses and often stop searching for additional possibilities. This tendency leads to a number of cognitive errors collectively referred to as premature closure.45 This includes factors such as overconfidence and confirmation bias, which foster the tendency to favor confirming evidence over counter-evidence that might exclude the diagnosis.19 Another problem is posed by the patient with two or more medical disorders: Clinicians from their first days of training are exhorted to search for unifying explanations of a patient's multiple symptoms. Occam's razor specifically instructs the clinician that it is more likely for a single disease to explain multiple symptoms than it is for multiple diseases to do so. The test of time has confirmed the wisdom of this chestnut, but it will obviously lead to diagnostic errors in those patients who truly have more than one active process.
Can we reduce cognitive errors? In selecting the title for their seminal work on medical errors, the Institute of Medicine identified the essential conundrum we face. To Err is Human captures a feeling of inevitability. The thought suggests that we can never eliminate errors as long as we are human. Is it possible to improve perception, memory, or decision making? In this section we consider this question from two perspectives. First we address efforts to directly improve cognition. We then consider an alternative, indirect approach, attempting to improve diagnostic accuracy using a systems-related approach.
A. Improving cognition directly
Can we train better thinkers? Bordage argues that we can begin to train better diagnosticians by improving the quality of training in physical diagnosis.9 By teaching discriminative skills and by providing more examples and repetition, students can improve their clinical decision-making skills.
Can we learn to avoid biased judgment? Biased judgment is common in clinical reasoning.35 There are some isolated reports of success in reducing the likelihood of bias. Larson et al., for example, found that the diagnoses made by medical trainees instructed about the pitfalls of premature closure were more accurate than those made by peers who were not similarly trained.4 Short-term success was also identified after students completed a course in statistics, showing improved ability of the students to analyze statistical problems. In contrast, similar transference could not be demonstrated for courses in logic,46 and Regehr and Norman have concluded that, in general, effective ways to teach students how to avoid the pitfalls of using heuristics have yet to be identified.10
Even with highly specialized training, physicians are limited by a cognitive system that has evolved to be good at certain kinds of tasks and that faces predictable pitfalls. In the domain of hypothesis generation and problem solving, human cognition has evolved mechanisms that allow for efficiency at the expense of various forms of creativity. For example, people sometimes fail to notice analogies to prior experience. In a study by Gick and Holyoak, for example, people were first shown a solution to a military problem that involved marching several platoons simultaneously down multiple roads that led to an attack site so that they all arrived at once. When presented, minutes later, with a problem in which they needed to find a way to irradiate a tumor without damaging surrounding tissue, they failed to notice the analogy.47
Problem-based learning. Medical educators have tried to improve diagnostic reasoning through innovative curriculum changes that emphasize skills in clinical reasoning. The best-studied example has been the change to problem-based learning at many leading institutions. For the major part of the past century, medical knowledge was taught to students one subject at a time, proceeding from basic subjects such as anatomy and biochemistry in the first year to more advanced subjects such as pathophysiology in the second year. In this framework, medical decision making was not taught as a separate course, but evolved as the student's knowledge base expanded and the opportunity for clinical contact with patients increased in the third and fourth years. In contrast, the problem-based learning approach exposes students to clinical problems from the very start. A basic hypothesis of this approach is that clinical decision making involves specific skills and that these can be learned and applied effectively to novel clinical situations, independent of learning the subject matter (anatomy, biochemistry, etc.) first. The impact of problem-based learning was critically reviewed recently by Colliver.48 Only three studies of problem-based learning were identified that used random assignment of medical students to training condition, and none revealed any clear advantage of problem-based learning in terms of the scores on standardized exams or overall performances. 
In studies involving self-selection into a problem-based curriculum, however, there was evidence for a positive impact: At Rush Medical College at the end of a seven-month exposure to problem-based learning, students gave more accurate diagnoses and provided reasoning chains that were more comprehensive and elaborated compared with students in the standard curriculum.49 Similarly, students in the problem-based learning track at Bowman Gray School of Medicine of Wake Forest University had board scores comparable to those of students in the standard track, but scored better on scales assessing factual knowledge, ability to perform the history and physical exam, deriving a differential diagnosis, and organizing and expressing information. Elstein,15 Norman,50,51 and others52 have advanced the concept of “content specificity” to explain the general failure of problem-based learning to produce superior diagnosticians: Experts have knowledge that is more extensive and better integrated.
Metacognitive training. Ultimately, accurate diagnosis requires not only extensive domain-specific knowledge and sophisticated skills in clinical reasoning, but also the ability to be actively aware of how effectively one is thinking.51,53 Baron describes one application of this process, the ability to preserve active open-mindedness:
Good decision making involves sufficient search for possibilities, evidence, and goals, and fairness in the search for evidence and in the use of evidence.… People typically depart from this model by favoring possibilities that are already strong. We must make an effort to counteract this bias by looking actively for other possibilities and evidence… actively open-minded thinking.19
Baron goes on to present evidence that individuals who practice open-minded thinking show reduced tendencies to have cognitive biases and produce “better” decisions.
Clinical educators adopt this approach when they emphasize to their students the importance of deriving a complete differential diagnosis. This ensures that multiple possibilities are considered, at least at the outset of the search for the diagnosis. A simple strategy that may also work to promote actively open-minded thinking is the “crystal ball experience,” a device used by military trainers to promote open-minded thinking in military planning.54,55 In this exercise, the participants are asked to devise a plan to achieve a particular objective. After presenting the plan they are told that, according to the crystal ball which can foresee the future, this plan won't work. What plan would they suggest instead?54 This promotes the active examination of flaws in the original plan and a search for alternatives. All good bridge players know this strategy, but too few clinicians do.
B. Adopting systems solutions to cognitive errors
Independent of our ability to improve cognition directly, it should be possible to improve diagnostic accuracy indirectly by a systems-level approach. Changes at the systems level can compensate for the predictable patterns of thought that lead to error.
Improving perception. The way data are presented can have a profound influence on how easily abnormalities are detected. Highlighting abnormal laboratory test results makes them easier to detect when they are presented in a list with many normal test values. Similarly, perception is enhanced by graphic presentations and presentations that facilitate recognition of trends. The future holds great promise for similar advances in enhancing perception using technologic advances that incorporate such principles of human-factors engineering. An alternative approach is to supplement human perception using computer-aided diagnosis. Jiang et al. recently demonstrated the potential to improve the radiologic detection of cancer in reading mammograms: Computer-aided diagnosis improved both the sensitivity and the specificity of this test more than did independent readings by two radiologists.56
Availability of expertise. Emergency department physicians misinterpret 1–16% of plain radiographs and up to 35% of cranial tomographic studies.57 A direct approach to this problem would be to better train these non-radiologists in radiographic interpretation. The indirect, systems-level approach would be to ensure that trained radiologists are available to help interpret these studies, the approach endorsed by the American College of Radiology. Supporting the systems-level approach, Espinosa and Nolan found that the rate of missed findings on X-rays taken in an emergency department was reduced from 1.2% to 0.3% by requiring second reviews of each study by radiologists.58 Unfortunately, only 20% of U.S. hospitals have radiology staff present 24 hours a day. Alternative approaches include using tele-radiology to assist front-line clinicians, or on-site radiology trainees. The relative impacts of these three interventions are still being evaluated.57
Second opinions. Second opinions have proven to be a valuable strategy for reducing medication errors. The Institute for Safe Medication Practices endorses the use of second checks for complex or risk-prone medication requests such as parenteral nutrition, chemotherapy, or neonatal therapeutics. This same approach might similarly be used to reduce diagnostic errors. Kronz and coworkers studied the potential benefit of this idea by requiring a second opinion on every surgical pathology diagnosis referred to the Johns Hopkins Hospital over a 21-month period.59 The second opinions led to clinically relevant changes of the diagnoses in 1.4% of 6,171 cases. In selected types of cases, the corrected error rates were even higher: 5.1% in tissue from the female reproductive tract, and 9.5% in serosal samples.
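To put the overall percentage in absolute terms, the short calculation below converts the reported 1.4% of 6,171 cases into an approximate case count and an approximate number of second reviews per changed diagnosis. Both derived figures are back-of-the-envelope estimates from the rounded percentage given above, not numbers reported by the study authors.

```python
cases_reviewed = 6171   # cases receiving a second opinion (from the text)
change_rate = 0.014     # clinically relevant changes, as a rounded fraction

# Approximate number of diagnoses changed by the second opinion.
approx_changed = round(cases_reviewed * change_rate)

# Approximate number of second reviews needed per changed diagnosis.
reviews_per_change = 1 / change_rate

print(f"Approximate diagnoses changed: {approx_changed}")
print(f"Approximate reviews per changed diagnosis: {reviews_per_change:.0f}")
```

Seen this way, the tradeoff is roughly seventy second reviews for each clinically relevant correction, a cost that would need to be weighed against the consequences of the errors avoided.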
Clinical guidelines and clinical decision-support systems. Given the documented variability in the extent to which clinicians act coherently and in accord with the laws of probability, it is reasonable to hope that clinical practice guidelines could reduce the rate of diagnostic errors. Guidelines standardize the approaches to clinical problems and minimize the variability in response patterns. To the extent that they incorporate appropriate base rates of disease, apply correct probability estimates, and minimize the errors induced by the use of heuristics, guidelines should in theory improve the accuracy of diagnosis and management.60 Guidelines, however, are themselves heuristics, and so they cannot provide a full solution. In addition, clinicians in practice are sometimes unaware of guidelines relevant to their patients, and even when they are aware, often do not follow the guidelines appropriately. Ioannidis and Lau recently provided an evidence-based review of the efficacy of guidelines and other interventions designed to reduce medical errors.61 Although guidelines reduced errors in treatment and prevention settings, this was not the case when guidelines were used to reduce diagnostic errors. In the only methodologically acceptable study they identified regarding diagnostic errors, the use of a clinical guideline actually increased the rate of missed radiologic fractures in an emergency department.62
Probably the most promising approach to improving diagnostic accuracy is to incorporate decision aids directly into the active practice of medicine using computer-assisted “expert systems.” Examples where clinical decision-support systems excel are numerous, including improved compliance with guidelines, improved antibiotic utilization, and improved use of preventive health measures.63 Hunt et al. recently provided an evidence-based review of clinical decision-support systems, identifying 68 published evaluations that reported patient outcomes or changes in provider behaviors. Studies were grouped into four categories: drug dosing, preventive care, clinical diagnosis, and general process of care. Overall, two thirds of the studies showed positive impacts on patient outcomes or provider behaviors, but, interestingly, none of the four studies of diagnosis (three on abdominal or chest pain, one on pediatric primary care problems) were positive.64 The reasons diagnostic processes were not improved in these studies are not clear, but there are both theoretical and practical limitations on the input data set that might limit the functionality of expert systems. The quality of the input data set is critical in determining the quality of the hypotheses the expert system will generate. In practice, clinicians may not have the time to input all of the data needed. A greater concern, identified by Regehr and Norman, is that the input data set may be biased by any initial hypotheses that were being entertained.10 As emphasized by Bordage, we tend to see what we are looking for.9 Thus, cognitive shortcomings can undermine potential improvements from system changes.
Can we eliminate cognitive errors? The first insurmountable barrier in eliminating cognitive errors is the impossibility of knowing all of the relevant facts all of the time. We discussed this issue earlier, and classified it as a no-fault error in the sense that much of this uncertainty lies outside the diagnostician's control.
Even if we could know all the relevant facts, we would be unable to process them quickly. The limitations of human information processing have been described in Simon's “principle of bounded rationality.”65 According to Simon,
Because of the limits on their computing speeds and power, intelligent systems must use approximate methods. Optimality is beyond their capabilities; their rationality is bounded…. Human short-term memory can hold only a half dozen chunks, an act of recognition takes nearly a second, and the simplest human reactions are measured in tens and hundreds of milliseconds, rather than microseconds, nanoseconds, or picoseconds.
A direct consequence of bounded rationality is the inevitability of diagnostic errors: “When intelligence explores unfamiliar domains, it falls back on ‘weak methods,’ which are independent of domain knowledge. People ‘satisfice’—look for good-enough solutions—instead of hopelessly searching for the best.”65
This applies to the use of heuristics, which are enormously powerful, yet inherently flawed. James Reason, the British psychologist who has studied human error extensively, summarizes the problem perfectly: “Our propensity for certain types of error is the price we pay for the brain's remarkable ability to think and act intuitively.”25 Heuristics play the odds: Sometimes, particularly under unusual circumstances, these rules of thumb lead to wrong decisions. The theme of tradeoffs, which comes up so often in the context of improving decision making, is especially appropriate to the use of heuristics, which inherently involves the tradeoff between efficiency and accuracy. Finally, the heuristics themselves, and for that matter any cognitive process contributing to decision making, can be misapplied, particularly if we are fatigued, distracted, stressed, or unfamiliar with the condition in question.
Studying some of the major industrial catastrophes of our time, Perrow identified an additional argument for the inevitability of diagnostic errors. In his “normal accident theory,” he concludes that systems that are tightly coupled, with complex and hidden interactions, will inevitably produce accidents.37 Although Perrow did not study medical mishaps, we propose that medical diagnosis involves analogous processes that are hidden and complex. The diagnostic process is very much a black box, in which the physician applies imperfect knowledge to somehow make sense out of a case presentation and lab data that are typically nonspecific and incomplete. Moreover, it is probably rare that the physician applies an appropriate amount of strategic mental monitoring to make sure all these cognitive processes are accurate and appropriate. In this framework, diagnostic errors may be as inevitable as those that befall nuclear power plants, aircraft, refineries, and mines.
As the second leading cause of medical error, diagnostic error is a major health care concern and worthy of much more attention than it has received. The sentinel-event registry established by the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) does not even track diagnostic error as a category.66 Likewise, of the 79 practices recently evaluated by the Agency for Healthcare Research and Quality that might decrease medical errors, only two directly or indirectly deal with diagnostic errors.67 Diagnostic error has probably remained in the background of the current dialogue on medical error because the causes are more subtle and the solutions are less obvious than they are for problems such as medication error and wrong-site surgery.
Another reason to focus on reducing diagnostic error is that, in contrast to other types of medical errors, there is little opportunity to minimize the impact on the patient once the error is made. The emphasis must be on preventing the error in the first place.
The potential for reducing or eliminating diagnostic errors in each of the three main categories (no-fault, system-related, cognitive) is summarized in Table 2. With regard to reducing diagnostic errors, there is clearly reason to be optimistic: In each of the three categories the potential to reduce diagnostic errors is both real and achievable. Even no-fault errors are likely to diminish over time as we are increasingly able to detect diseases in preclinical stages and illnesses with atypical presentations.
Experts uniformly advocate a focus on identifying and repairing latent system flaws as the most productive approach to improving the safety of medical care,2 and our analysis suggests this approach would also reduce diagnostic errors. Latent flaws in the health care system that contribute to diagnostic errors should be studied and addressed. Areas that should be targeted for intervention include supervision of trainees, availability of expertise, coordination of care, communication procedures, training and orientation, quality and availability of tests, suboptimal thinking environments (those producing undue stress, fatigue, distractions, excessive workload), and inefficient processes that lead to delays in diagnosis.
In contrast, the possibility that we could reduce diagnostic errors by focusing on cognitive elements has remained largely unexplored. This may reflect the general sentiment of the current performance improvement literature, which presents system-level approaches as the preferred route to achieve organizational excellence, as opposed to approaches to change how people think. This skepticism regarding cognitive interventions was recently captured by Nolan: “Although we cannot change the aspects of human cognition that cause us to err, we can design systems that reduce errors and make them safer for patients.”31 We would like to make the case that there may be substantial potential for improving the cognitive component of medical diagnosis, and propose that this should be a major research focus to improve patient safety. There are many potential avenues of exploration: training to improve metacognition, courses on diagnostic reasoning and pitfalls, and second-generation problem-based learning approaches that build on the lessons learned from the successful first-generation programs.52
Even if we are not yet able to improve cognitive performance per se, it may be possible to apply systems solutions to problem-prone cognitive tasks such as perception or differential diagnosis. Examples include mandated “second readings,” enhanced availability of subject-matter experts, improved supervision of trainees, and developing more effective guidelines and clinical decision-support systems. We acknowledge, however, that virtually none of the approaches outlined have been validated in practice, and several have little more than anecdotal support from fields outside medicine. The field is just being defined, and the opportunity for improvement in this direction is large.
In thinking about how to reduce diagnostic errors, a recurring theme is the problem of tradeoffs. We can be more certain, but at a price, be it dollars, effort, or time (Figure 1). Where should we set the bar? We can keep an open mind, we can perform a more extended search for possibilities, we can be more careful in interpreting and assessing data. At some point, however, these strategies become too expensive, or even self-defeating. For example, we can attempt to rule out every possibility in a given case, but this can lead to excessive testing, with its own risks and costs to the patient, and also the likelihood that on occasion a false-positive test result will lead us astray. Increased monitoring of cognition would also systematically increase delays in reaching a diagnosis, as extra tests and treatments are ordered to increase certainty. Decision making slows if we superimpose cognitive or systemic checks and balances—can we afford this?
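The cost of "ruling out every possibility" can be illustrated with a short calculation. The sketch below estimates the chance that at least one test comes back falsely positive when several tests are ordered for a patient who has none of the conditions tested for. It assumes, for simplicity, that the tests are independent and share the same specificity; the 95% figure is a hypothetical value, and real tests are neither identical nor fully independent.

```python
def prob_any_false_positive(n_tests: int, specificity: float) -> float:
    """Chance of at least one false-positive result among n independent tests
    in a patient who has none of the conditions being tested for.

    Simplifying assumptions (not from the source): tests are independent
    and all have the same specificity.
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity must be between 0 and 1")
    # All n tests are correctly negative with probability specificity ** n.
    return 1.0 - specificity ** n_tests


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        p = prob_any_false_positive(n, specificity=0.95)
        print(f"{n:2d} tests: {p:.0%} chance of at least one false positive")
```

Under these assumptions, a single test misleads 5% of the time, but a panel of 20 such tests yields at least one false-positive result in roughly 64% of disease-free patients. The extended search for certainty therefore carries a rapidly growing risk of being led astray.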
At least in some cases the answer is, surprisingly, “Yes.” For example, mandatory second opinions before certain types of elective surgery reduce the number of unnecessary operations, saving two to four dollars for every dollar spent.68 Is it possible that programs that mandate second readings in pathology or radiology might realize similar savings? The possibility that optimizing diagnostic accuracy might actually be the most efficient approach is not a new concept: Consider how easy it is for the expert to solve a problem, compared with the labor-intensive approach taken by the novice25 (Figure 1). Although we shouldn't forget the years of training needed to develop expertise, once it exists, we should design our systems to take advantage of experts' skills, to achieve highly accurate diagnoses with the least effort and cost.
Whereas the potential to reduce diagnostic errors is real and substantial, we conclude that eliminating such errors is not a realistic goal. There are fundamental and insurmountable factors that preclude this possibility. We should focus instead on increasing the visibility of diagnostic errors in the current patient safety dialogue, prioritizing opportunities to reduce diagnostic errors wherever we can, and setting research priorities to study potential avenues for improving diagnostic accuracy in the future, such as interventions aimed at improved cognition.
1. Berwick DM. Taking action to improve safety: how to increase the odds of success. In: Enhancing Patient Safety and Reducing Errors in Health Care. Chicago, IL: National Patient Safety Foundation, 1999:1–11.
2. Institute of Medicine. To Err is Human; Building a Safer Health System. Washington DC: National Academy Press, 1999.
3. Arkes H. Why medical errors can't be eliminated: uncertainties and the hindsight bias. Chron Higher Educ. May 19, 2000.
4. Sprenger GM. Dare to tell it all. 2001. Let's Talk. Communicating Risk and Safety in Health Care. Plenary Lecture, delivered at the third Annenberg Conference on Enhancing Patient Safety, May 17, 2001, St. Paul, MN.
5. Leape L, Brennan TA, Laird N, et al. The nature of adverse events in hospitalized patients. Results of the Harvard Medical Practice Study II. N Engl J Med. 1991;324:377–84.
6. Bartlett EE. Physicians' cognitive errors and their liability consequences. J Healthcare Risk Manage. Fall 1998:62–9.
7. Tai DYH, El-Bilbeisi H, Tewari S, Mascha EJ, Wiedermann HP, Arroliga AC. A study of consecutive autopsies in a medical ICU: a comparison of clinical cause of death and autopsy diagnosis. Chest. 2001;119:530–6.
8. Kassirer JP. Diagnostic reasoning. Ann Intern Med. 1989;110:893–900.
9. Bordage G. Why did I miss the diagnosis? Some cognitive explanations and educational implications. Acad Med. 1999;74(10 suppl):S138–S143.
10. Regehr G, Norman GR. Issues in cognitive psychology: implications for professional education. Acad Med. 1996;71:988–1001.
11. Patel VL, Arocha JF, Kaufman DR. A primer on aspects of cognition for medical informatics. J Am Med Informat Assoc. 2001;8:324–43.
12. Norman GR. The epistemology of clinical reasoning: perspectives from philosophy, psychology, and neuroscience. Acad Med. 2000;75(10 suppl):S127–S136.
13. Kassirer JP, Kopelman RI. Cognitive errors in diagnosis: instantiation, classification, and consequences. Am J Med. 1989;86:433–41.
14. Schmidt HG, Norman GR, Boshuizen HPA. A cognitive perspective on medical expertise: theory and implications. Acad Med. 1990;65:611–21.
15. Elstein AS. Clinical reasoning in medicine. In: Higgs J, Jones M (eds). Clinical Reasoning in the Health Professions. Oxford, England: Butterworth-Heinemann, 1995:49–59.
16. Edlow JA, Caplan LR. Avoiding pitfalls in the diagnosis of subarachnoid hemorrhage. N Engl J Med. 2000;342:29–35.
17. Amendo MT, Brown BA, Kossow LB, Weinberg GM. Headache as the sole presentation of acute myocardial infarction in two elderly patients. Am J Geriatr Cardiol. 2001;10:100–1.
18. Saint S, Go AS, Frances C, Tierney LM Jr. Case records of the Massachusetts General Hospital—a home court advantage? N Engl J Med. 1995;333:883–4.
19. Baron J. Thinking and Deciding. 3rd ed. Cambridge, U.K.: Cambridge University Press, 2000.
20. Gross PR, Levitt N, Lewis MW. The flight from science and reason. Ann NY Acad Sci. 1996.
21. Kassirer JP, Kopelman RI. Learning Clinical Reasoning. Baltimore, MD: Williams and Wilkins, 1991.
22. Gorovitz S, Macintyre A. Toward a theory of medical fallibility. Hastings Center Rep. 1975;5:13–23.
23. Gawande A. Final cut. Medical arrogance and the decline of the autopsy. The New Yorker. March 19, 2001:94–9.
24. Kennedy M. Inexact sciences: professional education and the development of expertise. Review of Research in Education. 1987;14:133–68.
25. Reason J. Human Error. Cambridge, U.K.: Cambridge University Press, 1990.
26. Cook RI, Woods DD. Operating at the sharp end: the complexity of human error. In: Bogner MS (ed). Human Error in Medicine. Hillsdale, NJ: Lawrence Erlbaum Associates, 1994:255–310.
27. Reason J. Managing the Risks of Organizational Accidents. Brookfield, VT: Ashgate, 1997.
28. Leape L, Lawthers AG, Brennan TA, et al. Preventing medical injury. Qual Rev Bull. 1993;19:144–9.
29. Leape LL. The preventability of medical injury. In: Bogner MS (ed). Human Error in Medicine. Hillsdale, NJ: Lawrence Erlbaum Associates, 1994:13–26.
30. Moray N. Error reduction as a system problem. In: Bogner MS (ed). Human Error in Medicine. Hillsdale, NJ: Lawrence Erlbaum Associates, 1994:67–92.
31. Nolan TW. System changes to improve patient safety. BMJ. 2000;320:771–3.
32. Bogner MS. Human error in medicine: a frontier for change. In: Bogner MS (ed). Human Error in Medicine. Hillsdale, NJ: Lawrence Erlbaum Associates, 1994:373–83.
33. Spath PL. Reducing errors through work system improvements. In: Spath P (ed). Error Reduction in Health Care: A Systems Approach to Improving Patient Safety. San Francisco, CA: Jossey-Bass, 2000:199–234.
34. Reiser SJ. Medicine and the Reign of Technology. Cambridge, U.K.: Cambridge University Press, 1978.
35. Jauhar S. When rules for better care exact their own cost. The New York Times. Jan 5, 1999.
36. Peterson LA, Brennan TA, O'Neil AC, Cook EF, Lee TH. Does housestaff discontinuity of care increase the risk for preventable adverse events? Ann Intern Med. 1994;121:866–72.
37. Perrow C. Normal Accidents: Living with High-Risk Technologies. Princeton, NJ: Princeton University Press, 1999.
38. Berlin L. Defending the “missed” radiographic diagnosis. Am J Radiol. 2001;176:317–22.
39. Berlin L, Hendrix RW. Perceptual errors and negligence. Am J Radiol. 1998;170:863–7.
40. Norman GR, Coblentz CL, Brooks LR, Babcook CJ. Expertise in visual diagnosis: a review of the literature. Acad Med. 1992;67(10 suppl):S78–S83.
41. Sackett DL. A primer on the precision and accuracy of the clinical examination. JAMA. 1992;267:2638–44.
42. Kahneman D, Slovic P, Tversky A. Judgement Under Uncertainty: Heuristics and Biases. Cambridge, U.K.: Cambridge University Press, 1982.
43. Elstein AS. Heuristics and biases: selected errors in clinical reasoning. Acad Med. 1999;74:791–4.
44. Dawson NV, Arkes HR. Systematic errors in medical decision making: judgment limitations. J Gen Intern Med. 1987;2:183–7.
45. Voytovich AE, Rippey RM, Suffredini A. Premature closure in diagnostic reasoning. J Med Educ. 1985;60:302–7.
46. Cheng PW, Holyoak KJ, Nisbett RE, Oliver LM. Pragmatic versus syntactic approaches to training deductive reasoning. Cogn Psychol. 1986;18:293–328.
47. Gick ML, Holyoak K. Schema induction and analogical transfer. Cogn Psychol. 1983;15:1–38.
48. Tversky A, Kahneman D. Availability: a heuristic for judging frequency and probability. Cogn Psychol. 1973;5:207–32.
49. Hmelo CE. Cognitive consequences of problem-based learning for the early development of medical expertise. Teach Learn Med. 1998;10:92–100.
50. Eva KW, Neville AJ, Norman GR. Exploring the etiology of content specificity: factors influencing analogic transfer and problem solving. Acad Med. 1998;73(10 suppl):S1–S5.
51. Norman GR. Problem-solving skills, solving problems, and problem-based learning. Med Educ. 1988;22:279–86.
52. Perkins DL, Salomon G. Are cognitive skills context-bound? Education Research. 1989;18:16–25.
53. Higgs J, Jones M. Clinical reasoning. In: Higgs J, Jones M (eds). Clinical Reasoning in the Health Professions. Oxford, U.K.: Butterworth-Heinemann, 1995:3–23.
54. Mitchell DJ, Russo JE, Pennington N. Back to the future: temporal perspective in the explanation of events. J Behav Decis Making. 1989;2:25–38.
55. Klein G. Sources of Power: How People Make Decisions. Cambridge, MA: The MIT Press, 1998.
56. Jiang Y, Nishikawa RM, Schmidt RA, Metz CE, Doi K. Relative gains in diagnostic accuracy between computer-aided diagnosis and independent double reading. In: Krupinski EA. Medical Imaging 2000: Image Perception and Performance (Proceedings of SPIE, vol. 3981, 2000). Progress in Biomedical Optics and Imaging. 2000;1:10–5.
57. Kripalani S, Williams MV, Rask K. Reducing errors in the interpretation of plain radiographs and computed tomography scans. In: Shojania KG, Duncan BW, McDonald KM, Wachter RM (eds). Making Health Care Safer. A Critical Analysis of Patient Safety Practices. Rockville, MD: Agency for Healthcare Research and Quality, 2001.
58. Espinosa JA, Nolan TW. Reducing errors made by emergency physicians in interpreting radiographs; longitudinal study. BMJ. 2000;320:737–40.
59. Kronz JD, Westra WH, Epstein JI. Mandatory second opinion surgical pathology at a large referral hospital. Cancer. 1999;86:2426–35.
60. Garfield FB, Garfield JM. Clinical judgement and clinical practice guidelines. Int J Technol Assess in Health Care. 2000;16:1050–60.
61. Ioannidis JPA, Lau J. Evidence on interventions to reduce medical errors. An overview and recommendations for future research. J Gen Intern Med. 2001;16:325–34.
62. Klassen TP, Ropp LJ, Sutcliffe T, et al. A randomized controlled trial of radiograph ordering for extremity trauma in a pediatric emergency department. Ann Emerg Med. 1993;22:1524–9.
63. Trowbridge R, Weingarten S. Clinical decision support systems. In: Shojania KG, Duncan BW, McDonald KM, Wachter RM (eds). Making Health Care Safer. A Critical Analysis of Patient Safety Practices. Rockville, MD: Agency for Healthcare Research and Quality, 2001.
64. Hunt DL, Haynes RB, Hanna SE, Smith K. Effects of computer-based clinical decision support systems on physician performance and patient outcomes: a systematic review. JAMA. 1998;283:2816–21.
65. Simon HA. Invariants of human behavior. Annu Rev Psychol. 1990;41:1–19.
66. Sentinel event statistics. Joint Commission on Accreditation of Healthcare Organizations. 〈http://www.jcaho.org〉. Accessed 5/3/02.
67. Shojania KG, Duncan BW, McDonald KM, Wachter RM. Making Health Care Safer: A Critical Analysis of Patient Safety Practices. Evidence Report/Technology Assessment #43; AHRQ Publication No 01-E058. Rockville, MD: Agency for Healthcare Research and Quality, 2001.
68. Ruchlin HS, Finkel ML, McCarthy EG. The efficiency of second-opinion consultation programs: a cost-benefit perspective. Med Care. 1982;20:3–19.