Dr. Norman is assistant dean, Program for Educational Research and Development, McMaster University, Hamilton, Ontario, Canada.
Correspondence should be addressed to Dr. Norman, MDCL 3519, McMaster University, 200 Main St. W., Hamilton ON L8N 3Z5, Canada; telephone: (905) 525-9140, ext. 22119; e-mail: firstname.lastname@example.org.
Editor's Note: This is a commentary on Falzer PR, Garman DM. Contextual decision making and the implementation of clinical guidelines: An example from mental health. Acad Med. 2010;85:548-555; and Warner JL, Najarian RM, Tierney LM Jr. Perspective: Uses and misuses of thresholds in diagnostic decision making. Acad Med. 2010;85:556-563.
Two articles in this issue, “Contextual decision making and the implementation of clinical guidelines” by Falzer and Garman,1 and “Perspective: Uses and misuses of thresholds in diagnostic decision making” by Warner et al,2 demonstrate two very different approaches to related problems.
Falzer and Garman1 examine the issue of uptake of practice guidelines. They offer sound and insightful commentary on the current movement toward knowledge translation, practice guidelines, and evidence-based medicine:
There is a naïve belief, commonly expressed in the services and informatics literatures, that clinicians should function as transducers who take knowledge and implement it without friction or discontinuity. . . . If transduction is the ideal, then decision making becomes problematic whenever it introduces discretion and variety, or disrupts the smooth functioning of the continuum. In implementation science, a norm of correctness or consistency is established by evidence-based practices or clinical guidelines. The soundness of a decision can be gauged by how closely it adheres to the norm.
Although there is no doubt that, by one metric or another, current management practices are often suboptimal, it seems to me that advocates of knowledge translation, practice guidelines, and evidence-based medicine ignore some harsh realities in their single-minded quest for conformity. First, the recent history of practice guidelines, in which practitioners seem consistently to reject rigid adherence, can only mean one of two things: Either practitioners are negligent—or at least suboptimal—in their rejection of what should be considered the norm, or they have rational reasons not to adopt the guideline indiscriminately.
There is a case to be made for the latter. Guidelines arise from randomized controlled trials, and trials are typically designed around highly homogeneous and therefore atypical populations, so they are unlikely to apply to every patient who walks through the door. As Falzer and Garman correctly deduce, the challenge is to determine “when departing from the norm is clinically indicated and when it signifies suboptimal practice.” We should not decry deviations from guidelines as suboptimal care; instead we should, at least in some instances, celebrate such deviations as evidence that practitioners are exercising sound judgment rather than acting as pharmaceutical automatons. Indeed, although there is evidence that older physicians are less likely than younger doctors to adhere to guidelines,3 one study showed that this was a consequence of the fact that “greater emphasis [was placed] by [older] consultants on holistic patient care, and what might be seen as their separate schemas for appropriate prescribing stemmed from that premise.” In contrast, junior doctors seemed to have had a single schema that encompassed both prescribing generally and appropriate prescribing.4
Falzer and Garman then go on to offer an alternative to the mechanistic “adherence” metric by examining the relationship between adherence and the degree of mismatch with the conditions of the guideline. They show convincingly that departures from the normative standard are not random: when the case is a good fit with the standard, adherence is 90%, but this drops to a low of about 30% as the mismatch increases.
There are some limitations, as acknowledged by the authors. It is a small and skewed sample of Yale psychiatry residents. However, if anything, this reinforces the findings, because the residents were asked to make decisions on paper cases with no competing demands or time constraints. More worrisome, there were large individual differences in endorsement frequency, ranging from 17% to 77%, suggesting either that some residents really are practicing suboptimal care (according to the guidelines) or, alternatively, that some residents are not exercising sufficient judgment and are demonstrating blind adherence. Pick your poison.
The second paper, by Warner and colleagues,2 addresses a related but distinct phenomenon, the decision-making process from diagnosis to test ordering to therapeutic choice, through a more theoretical lens. The fundamental idea is that physicians act when the estimated probability of disease crosses some internal threshold (or, more accurately, thresholds), an idea advanced by Pauker and Kassirer.5 These two thresholds, in turn, define three decision zones: no treatment or further action, further testing, and treatment.
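As a concrete illustration of the two-threshold idea (a sketch of my own, not code from either paper), the three decision zones can be expressed as a simple function of the estimated probability of disease; the numeric threshold values below are arbitrary assumptions for illustration, not clinically derived figures:

```python
def decision_zone(p, testing_threshold=0.10, treatment_threshold=0.60):
    """Classify a disease probability p into one of the three zones
    defined by the two Pauker-Kassirer thresholds.

    The default threshold values are illustrative assumptions only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p < testing_threshold:
        return "no further action"    # below the testing threshold
    if p < treatment_threshold:
        return "order further tests"  # between the two thresholds
    return "treat"                    # at or above the treatment threshold


# For example, a 5% probability falls in the no-action zone,
# 30% in the testing zone, and 90% in the treatment zone.
print(decision_zone(0.05), decision_zone(0.30), decision_zone(0.90))
```

The point of the sketch is simply that, in this model, every decision reduces to where a single number falls relative to two cut points, which is precisely the one-dimensionality at issue below.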
Although this conceptualization is useful, one immediate concern is that it collapses all treatment decisions onto a single dimension of diagnostic uncertainty. There is no clear way to map the many factors included in the Falzer study onto the concept of threshold, except as error variation. Although these authors acknowledge that few situations exist in which such an unambiguous, unidimensional model applies, they nevertheless find themselves clinging to the one-dimensional belief. For example, they acknowledge differences between individual clinicians, but describe them as follows:
These implicit thresholds “belong” to the individual clinician. Unlike explicit thresholds, those that are implicit are not amenable to numeracy and are generally referred to as high or low.2
Whereas these thresholds may be dependent on the specific clinical situation—acknowledging issues such as seriousness and treatability—any departure from normative values constitutes error: “If one of the clinician's implicit thresholds is well away from the average, or consensus threshold, it should be considered to be miscalibrated.”2
The authors go on to explore the impact of variation in threshold. For example, if a threshold “is decreased so much that the diagnosis is presumptively accepted, the clinician has succumbed to the well-known phenomenon of premature closure, also known as representativeness.”2 However, this assessment does not take into account a number of complicating factors. First, as Wears and Nemeth6 have noted, the judgment of premature closure is not at all straightforward. Second, “representativeness” is only one hypothesized mechanism to explain premature closure. Moreover, in any individual situation, it is virtually impossible to determine why a physician might have decided to terminate data gathering. The nature of cognitive biases is that, by definition, the decision maker is unaware of the bias.
Warner and colleagues go on to explore a specific hypothetical case and how it might evolve under different circumstances. The more I read of the evolving case and the various ways in which physicians may distort their thresholds, however, the more I find it difficult to accept that clinical reasoning can actually be represented by this idea of implicit thresholds. Can all physician actions in diagnosis be neatly summarized by conscious or unconscious manipulation of thresholds? In fact, there is no evidence that implicit thresholds have any psychological reality. Nor did Pauker and Kassirer make this claim; their intent was simply to offer this conceptualization as an explicit decision aid.
Perhaps Warner and colleagues intended to do the same. Perhaps the idea of implicit thresholds does provide a useful framework for teaching clinical decision making (and, indeed, Tierney is an internationally renowned clinical educator). But the history of decision aids in general and decision analysis in particular, after three decades of effort, is that they have had little impact on clinical decision making at the bedside. This disconnect may, to some degree, reflect that we can only improve decision making if we begin by understanding decision making, as noted by Falzer and Garman. If we do not understand how clinicians are thinking, we will continue to talk past them and will show little benefit for the large investment of intellectual capital made in these enterprises.
Finally, it is useful to revisit Falzer and Garman's reflections on the knowledge translation paradigm, which I quoted earlier. In a larger arena, both articles are addressing the issue of the departure of individual human behavior from some abstract normative standard. However, in one case, the fundamental perspective is optimistic, viewing the discrepancies as adaptive and seeking to understand better the reasons for the discrepancy. In the other, the prevalent view is pessimistic, viewing human decision makers as suboptimal, and seeking ways to move them closer to the theoretical ideal. Recent history tells us that such efforts are likely to fail unless more effort is made to understand why the departures exist, instead of simply presuming that these are one more manifestation of human frailty.
The author is funded in part by Canadian Institutes of Health Research and in part by Natural Sciences and Engineering Research Council of Canada.