Imagine that a diagnostician is asked to comment on a diagnosis proposed by a colleague. Clearly, to decide whether or not the diagnosis is probably correct, other diagnostic possibilities for that case must be considered. However, the prevalence of confirmation biases recorded in the psychology literature suggests that the proposed diagnosis has some priority over self-generated diagnoses.1 Considering the proposed diagnosis first might lead to noticing and taking seriously the features consistent with it and evaluating other diagnoses in that light. The current study was designed to address this issue by examining differences in probability ratings and patient management decisions as a function of whether diagnostic alternatives are presented explicitly or are generated by the diagnosticians themselves. Normatively, there should be no difference in the probability assigned to, or the patient management decisions made on the basis of, a diagnostic alternative regardless of whether it was suggested by someone else or was self-generated. In fact, for at least some levels of expertise, the source or explicitness of the diagnosis might be important in how thoroughly it is considered.
It is well documented that the probability rating assigned to a particular diagnosis tends to be greater when that diagnosis is presented in isolation relative to when it is presented within a list of alternative diagnoses (the unpacking effect).2 For example, the rated probability that a person will die of cancer tends to be greater when cancer is considered by itself than when presented within a list of differential diagnoses. Our previous work has shown, somewhat counter-intuitively, that the alternative diagnoses that have the greatest influence on the probability assigned to a focal diagnosis are those that are most likely to have been considered even in the absence of their explicit presentation.3 That is, the magnitude of the unpacking effect (i.e., the decrease in the probability assigned to the focal diagnosis) was greater when the unpacked alternatives were believed to be highly plausible by independent experts relative to when the unpacked alternatives were believed to be less likely. While this result suggests that participants under-appreciated diagnostic alternatives that they themselves generated relative to when the same alternatives were explicitly presented, the experimental design did not allow us to be certain that participants had actually considered the diagnoses that were rated as most likely. Five diagnoses were explicitly presented in the unpacked condition, thereby allowing the possibility that participants had not generated all of the plausible diagnoses while reading the case history.
The current study was designed to demonstrate the same result for alternatives that participants claimed to have actually considered. Furthermore, we attempted to maximize the probability that a specific alternative diagnosis would come to mind even when not presented explicitly by using clinical cases previously shown to have two highly likely and roughly equiprobable diagnoses.4 Both manipulations should eliminate the unpacking effect if diagnosticians evaluate diagnostic possibilities that they themselves generate in the same way as diagnoses that are explicitly provided.
While subjective estimates of probability are believed to provide a valid measure of participants' clinical decision-making processes, it is possible that the act of assigning probabilities is a formal exercise that is not closely related to actual practice. So, the current study also served as an attempt to demonstrate that the unpacking effect is not restricted to numerical estimates of probability by examining whether or not patient management strategies, such as the requesting of diagnostic tests, are influenced by the explicit presentation of diagnostic alternatives. That is, if diagnosticians request more tests upon being presented two highly diagnostic alternatives relative to being presented just the focal diagnosis, we would have converging, and perhaps more ecologically valid, evidence that there is a tendency to under-weight alternatives that are not explicitly provided. Redelmeier et al.5 have previously shown that the likelihood that fourth-year medical students will order a CT scan upon the presentation of a potential case of sinusitis was influenced by the number of alternative diagnoses that were explicitly mentioned. The current study attempts to further ensure the robustness and generalizability of their findings by using multiple cases and a more extreme manipulation.
We tested this design initially on medical students primarily because of the ease of obtaining them as participants, but we believe that this initial step provides data of interest. Numerous biasing studies in medicine have confirmed that both experts and novices tend to be susceptible to the same heuristic-induced errors.3,8,11 Understanding the mechanism underlying such processes might allow insight into the source of any errors that are made even as the number of errors a diagnostician makes undoubtedly decreases with the development of expertise.
Participants. The participant pool for this study consisted of second-year medical students from McMaster University's graduating class of 2001. A sample of tutorial leaders asked their students whether they would participate. Those who agreed were run through the experiment in their tutorial groups during two sessions separated in time by an average of eight days (range 4–14 days). Twenty students participated in four groups, but follow-up data could not be collected for one of the students, leaving 19 with complete sets of data. Upon completion of the second group session, participants were paid $20 and given feedback on both the clinical cases used and the purpose of the study.
Materials. Participants were presented ten case histories, each of which was followed by one or two diagnostic hypotheses and a series of five questions. (1) “Given the case history that you have just read, please assign a number between 0 and 100 indicating how likely you think it is that the case history is representative of the given diagnosis(es).” In all conditions participants were told that the diagnoses were mutually exclusive and that the inclusion of an “all other diagnoses” alternative meant that each list was exhaustive, thereby indicating that the sum of the ratings assigned should be 100%. (2) “Are there any diagnostic tests that you would like to see performed to aid you in your decision? If yes, please list them.” (3) “While reading the case history, did you consider any diagnosis apart from those listed above? If yes, please state the diagnosis that you consider to be the most likely differential.” (4) “Please rate your confidence (on a scale of 1 to 100) that you know the correct diagnosis.” (5) “Please rate the typicality of this case on a scale of 1 to 100.” The latter two questions were intended to serve as dummy variables that would increase the likelihood that participants would not remember the exact probability assigned to any particular question.
Procedure. Each of the ten cases has been shown to be suggestive of two diagnoses that are both highly likely and roughly equiprobable.5 One of each pair of diagnoses was randomly selected to be the focal diagnosis, that is, the diagnostic alternative that would be presented with its associated case history across all conditions. In working through all ten cases, each participant was shown five cases within each condition (i.e., focal diagnosis alone versus focal + alternative diagnosis), randomly mixed together. Approximately one week later each participant was shown the same ten cases and asked to rate the original alternative(s) together with the alternative they had generated in response to question three. If no alternative had been generated, participants were simply shown the original alternatives a second time. Apart from adding the alternatives participants had generated during the first pass, the questionnaires used during the two sessions were identical.
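The within-participant assignment described in the procedure can be sketched in code. This is only an illustrative sketch: the text does not report how the randomization was carried out, so the function name `assign_conditions`, the condition labels, and the use of a seeded random number generator are all assumptions introduced here.

```python
import random


def assign_conditions(case_ids, n_focal_only=5, seed=None):
    """Split cases between the two presentation conditions and mix their order.

    Hypothetical helper: the condition labels and the seeding scheme are
    assumptions, not details reported in the study.
    """
    rng = random.Random(seed)
    cases = list(case_ids)
    # Choose which cases show the focal diagnosis alone.
    focal_only = set(rng.sample(cases, n_focal_only))
    # Present all cases in a random order so the conditions are intermixed.
    order = cases[:]
    rng.shuffle(order)
    return [
        (case, "focal_only" if case in focal_only else "focal_plus_alternative")
        for case in order
    ]


# One participant's session-1 booklet: ten cases, five per condition.
schedule = assign_conditions(range(1, 11), seed=0)
```

Under this sketch, each participant would receive a different seed, so that both the split of cases across conditions and the presentation order vary between participants.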
Across the ten cases, the 19 participants with complete data generated 190 observations that could be analyzed for the unpacking effect (a decrease in the probability assigned to a focal diagnosis upon the explicit presentation of additional diagnoses). Table 1 presents the average probability assigned to the focal diagnosis as a function of condition. First, a 2 (session) × 2 (number of alternatives presented during pass 1) × 2 (diagnostic alternative: generated versus not generated) × 10 (case) ANOVA was performed. A significant effect of “number of alternatives” (F(1,156) = 12.365, p < .01) revealed that the probability assigned to the focal diagnosis was higher when presented in isolation than it was when presented in conjunction with a second diagnosis even though the alternative diagnosis was the most likely differential. An effect of “diagnostic alternative” was also found (F(1,156) = 6.009, p < .02), thereby indicating that participants rated the focal diagnosis as more likely when they did not generate a plausible alternative diagnosis, as indicated by their responses to question 3. Case was the only other effect that reached significance (F(9,156) = 4.746, p < .01).
To further demonstrate that the unpacking effect occurs even when the unpacked alternatives are diagnoses that the participants had already considered, we performed a 2 (session) × 2 (number of alternatives presented during pass 1) × 10 (case) ANOVA on only those observations in which a diagnostic alternative had been generated, that is, using only the data presented in the Alternative Generated column of Table 1. The main effect of “number of alternatives” persisted (F(1,139) = 8.861, p < .01). In addition, a main effect of session was found (F(1,139) = 16.375, p < .01), which indicates that the probability assigned to the focal diagnosis was lower in session 2 than in session 1 even though the only difference between the two sessions was the explicit presentation during session 2 of the diagnoses that the participants claimed to have considered implicitly during session 1. Case was, once again, the only other effect that achieved significance (F(9,139) = 4.323, p < .01). The effect of session was not observed when the same analysis was repeated for trials in which the participants did not generate a diagnostic alternative (i.e., using only the data presented in the Alternative Not Generated column of Table 1). This indicates that the effect was not simply a result of the passage of time. The numbers of observations in these cells were small, but examination of the means suggests that, if anything, the probability assigned to the focal diagnosis increased in session 2 relative to session 1 if no diagnostic alternative had been generated during session 1 (F(1,17) = 0.017, p > .85).
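As a rough illustration of the quantity being tested, the unpacking effect can be expressed as the difference between the mean focal-diagnosis rating in the two presentation conditions. The sketch below is not the analysis reported here (which was an ANOVA). The function names, condition labels, and ratings are hypothetical, and the ratings are invented solely to show the arithmetic.

```python
from collections import defaultdict
from statistics import mean


def mean_by_condition(observations):
    """Average the focal-diagnosis rating within each presentation condition."""
    grouped = defaultdict(list)
    for condition, rating in observations:
        grouped[condition].append(rating)
    return {condition: mean(ratings) for condition, ratings in grouped.items()}


def unpacking_effect(observations):
    """Mean focal rating when presented alone minus mean when an alternative is listed."""
    means = mean_by_condition(observations)
    return means["focal_only"] - means["focal_plus_alternative"]


# Hypothetical ratings, invented for illustration only (not data from the study).
example = [
    ("focal_only", 60),
    ("focal_only", 50),
    ("focal_plus_alternative", 45),
    ("focal_plus_alternative", 35),
]
effect = unpacking_effect(example)
```

A positive value corresponds to the direction of the effect described in the text: the focal diagnosis is rated as more probable when it is presented in isolation.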
We also examined whether or not the phenomenon being illustrated by the probability ratings might influence management strategies by asking our participants to list the diagnostic tests that they would be interested in seeing performed. Participants requested more tests when two diagnoses were presented (mean = 3.464) than when the focal diagnosis was presented in isolation (mean = 2.989; F(1,187) = 4.938, p <.05). This result suggests that the explicit presentation of diagnoses can influence the management strategies of diagnosticians in addition to altering their rating of another diagnosis's likelihood.
Finally, the effects of the number of alternatives presented on confidence ratings and typicality ratings were analyzed. No effect of session or “number of alternatives” was found for either of these two variables.
These results support the notion that individuals tend to under-appreciate self-generated diagnoses relative to diagnoses that are explicitly presented. The participants rated the originally presented diagnosis as less probable when the alternative they had claimed to be considering implicitly was provided in a more explicit manner. That is, the unpacking effect was found even when the diagnostic alternative that was unpacked was one that our participants claimed to have considered while originally viewing the case.
While differences in the probability ratings assigned to the focal diagnosis across conditions might appear small relative to the 100-point scale used, it is important to note that the functional range of potential responses was actually substantially smaller than 100. As mentioned earlier, the cases were originally designed to be indicative of two diagnoses, which are both highly likely and roughly equiprobable. Consistent with that manipulation, our participants were hesitant to assign a very high or a very low likelihood rating to any individual diagnosis. The effect size across packed versus unpacked versions of the questionnaire was 0.46, a medium-sized effect,6 even though substantial effort was invested to ensure that the cards were stacked in favor of the null hypothesis.
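For readers unfamiliar with effect sizes, the following sketch shows one common way such a value is obtained and labeled. It assumes that the reported figure is a standardized mean difference (Cohen's d with a pooled standard deviation), which the text does not state explicitly. The function names and the sample ratings are hypothetical. The benchmarks of 0.2, 0.5, and 0.8 are Cohen's conventional small, medium, and large values.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def nearest_benchmark(d):
    """Return the conventional label (small/medium/large) closest to |d|."""
    benchmarks = {"small": 0.2, "medium": 0.5, "large": 0.8}
    return min(benchmarks, key=lambda label: abs(abs(d) - benchmarks[label]))


# Hypothetical ratings, invented for illustration only (not data from the study).
d_example = cohens_d([60, 70, 80], [50, 60, 70])
label_for_reported_value = nearest_benchmark(0.46)
```

Because the benchmarks are reference points rather than strict cut-offs, the sketch labels a value by the benchmark it lies closest to, which is how 0.46 comes to be described as medium-sized.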
That being said, the mechanism that causes individuals to under-weight alternatives that are not explicitly presented remains in question. As alluded to in the introduction, the effects observed might arise as a result of confirmation bias, as the explicit presentation of a diagnostic alternative might cause diagnosticians to differentially process the evidence relevant to the diagnostic possibilities. This could arise in at least two ways that are not necessarily exclusive of one another; the explicit presentation of a diagnostic hypothesis might influence both the search for and the construal of evidence. Support for the plausibility of these hypotheses is widespread.
For example, it has been found that, when given the opportunity to select additional information (i.e., prevalence data), medical students,9 residents,10 and physicians11 tend to seek data that are relevant to a single disease while ignoring information that is related to equally plausible differential diagnoses. This biased search for information need not be proactive in that it does not necessarily take place while the diagnostician gathers novel information. On the contrary, Anderson and Pichert have shown that retrieval of information from memory is also influenced by the context within which the search takes place.12 When asked to recall information about a house, the type of information participants were able to remember was dependent on whether they had been asked to read the story from the perspective of a burglar or a home buyer. When subjects were later asked to adopt the opposite perspective, they were able to recall more information that simply had not been available during the first memory task. A plausible extension of this result is that the explicit presentation of a diagnosis might bias the memorial retrieval of features present in the case history.
Furthermore, maintaining an initial focus on the diagnosis that is explicitly presented might make it difficult to realize that nondiscriminating symptoms provide evidence for more than one diagnostic alternative. For example, considering the nausea and vomiting with which an 18-year-old woman with right-lower-quadrant discomfort presents as indicative of appendicitis might blind an individual to the possibility that these symptoms can also be construed as clinical manifestations of pelvic inflammatory disease. Norman, LeBlanc, and Brooks have provided evidence that supports this notion by reporting that the mere presentation of a diagnostic alternative can influence the interpretation of classic clinical features.7 Reinterpreting these features in light of self-generated diagnoses could prove to be difficult.
Regardless of their cause, the data presented here indicate that the meaning of the verb “to consider” should not necessarily be taken at face value. Having considered the plausibility of a diagnostic alternative can mean anything from having had the term come to mind to having performed a comprehensive analysis of the evidence for and against that particular diagnosis. Asking our participants to assign a probability rating to the likelihood of diagnoses that they claim to have considered was sufficient to decrease the probability that they were willing to assign to the focal diagnosis. This strongly suggests that the evidence in favor of the self-generated alternative was underappreciated relative to when attention was focused on that alternative explicitly. Further research is required to determine whether or not particular strategies can be adopted to prevent such under-weighting.
1. Reisberg D. Cognition: Exploring the Science of the Mind. New York: W. W. Norton & Company, 1997.
2. Tversky A, Koehler DJ. Support theory: a nonextensional representation of subjective probability. Psychol Rev. 1994;101:547–67.
3. Eva KW, Brooks LR, Cunnington JPW, Norman GR. The strength of alternatives: its role in diagnostic decision making and probability judgments. Unpublished paper.
4. Cunnington JPW, Turnbull JM, Regehr G, Marriott M, Norman GR. The effect of presentation order in clinical decision making. Acad Med. 1997;72(10 suppl 1):S40–S42.
5. Redelmeier DA, Koehler DJ, Liberman V, Tversky A. Probability judgment in medicine: discounting unspecified possibilities. Med Decis Making. 1995;15:227–31.
6. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. New York: Academic Press, 1977.
7. Norman GR, LeBlanc VR, Brooks LR. On the difficulty of noticing the obvious. Psychol Sci. In press.
8. Hatala R, Norman GR, Brooks LR. The impact of a clinical scenario upon accuracy of electrocardiogram interpretation. Unpublished paper.
9. Kern L, Doherty ME. ‘Pseudodiagnosticity’ in an idealized medical problem-solving environment. J Med Educ. 1982;57:100–4.
10. Wolf FM, Gruppen LD, Billi JE. Differential diagnosis and the competing-hypotheses heuristic: a practical approach to judgment under uncertainty and Bayesian probability. JAMA. 1985;253:2858–62.
11. Green LA, Yates JF. Influence of pseudodiagnostic information on the evaluation of ischemic heart disease. Ann Emerg Med. 1995;25:451–7.
12. Anderson RC, Pichert J. Recall of previously unrecallable information following a shift in perspective. J Verbal Learn Verbal Behav. 1978;17:1–12.
Research in Medical Education: Proceedings of the Thirty-ninth Annual Conference. October 30 - November 1, 2000. Chair: Beth Dawson. Editor: M. Brownell Anderson. Foreword by Beth Dawson, PhD.