Does “Shortness of Breath” = “Dyspnea”?: The Biasing Effect of Feature Instantiation in Medical Diagnosis

EVA, KEVIN W.; BROOKS, LEE R.; NORMAN, GEOFFREY R.

Section Editor(s): Regehr, Glenn PhD

PAPERS: DX

Correspondence: Kevin Eva, PhD, McMaster University, HSC/3N51B, 1200 Main Street West, Hamilton, Ontario, FGL8S 375, Canada.

The authors thank Glenn Regehr, Gordon Page, and an anonymous reviewer for their insightful comments regarding previous versions of this manuscript.

Discussion of clinical cases is an important component of medical education. Clinical instructors commonly present case histories to direct students' learning and use discussion of case histories to gauge students' understanding. The current study attempted to test the hypothesis that the language used to describe clinical cases will influence a diagnostician's construal of the information presented. Specifically, when clinical features are presented to a diagnostician using medicalese (i.e., technical descriptions that use medical terminology, such as dyspnea), the informational content they provide might be weighted more heavily in the decision-making process than when the same features are presented using lay terminology. The use of medicalese might increase the medical familiarity of a feature, thereby resulting in its being given more weight when a diagnostic decision is required.

The encoding specificity principle, identified by Tulving 1 while studying human memory, provides one explanation for why the use of medical terminology might produce such an effect. This principle states that items in memory will be better recalled when the context within which memory retrieval takes place matches the context within which learning occurred. Transfer-appropriate processing is an important related concept. 2 In discussing cases with colleagues or students, physicians regularly translate a patient's complaints from lay terminology into medicalese. Furthermore, textbook descriptions of diseases routinely use standard nomenclature. As a result, a new case will better match the context established during learning when medical language is used than when lay terminology is presented. In turn, features that are presented in medical terminology might act as more effective cues to trigger particular diagnoses and might be more memorable after a case has been evaluated.

Such a result would be important because it has been shown that differential weighting of a subset of the available features is instrumental in allowing diagnostic suggestions to bias diagnostic decision making. 3 The explicit presentation of a diagnosis appears to cause individuals to focus their attention differentially on features that are consistent with the suggested diagnosis, thereby making those features more salient. This, in turn, causes such features to be over-weighted during the decision-making process relative to features that are consistent with alternative hypotheses. The current project serves as a further inquiry into potential causes of attentional biases while concurrently examining the power of medical terminology to direct understanding of a clinical case.

In addition, this work can be seen as complementary to Bordage and Lemieux's proposal that semantic analyses performed on discourses about medical cases can reveal a clinician's mental representation of a patient's case history. 4 They found that students who used semantic qualifiers (e.g., acute, distal, sudden) when maintaining a clinical representation of a problem were more likely to accurately diagnose clinical cases than were those who represented clinical problems with more generic descriptions. Although substantial data have been presented in support of this correlation, it is not yet clear whether accuracy improves as a result of the use of these qualifiers or whether the better students are simply more able to describe the cases using medical nomenclature. If it is true that the use of semantic qualifiers can improve diagnostic accuracy, then manipulating the way in which the case history is presented should influence the diagnostic conclusions that are reached. That is, the more medical nomenclature is used to describe the features consistent with a particular diagnosis, the more likely it should be that a diagnostician will conclude in favor of that diagnosis.

Our hypothesis is that the use of medical terminology improves the match between the context at learning and the context at diagnosis. Medicalese is, therefore, expected to raise the mnemonic effectiveness of clinical features, thereby cueing diagnosticians to particular hypotheses and, in turn, increasing their confidence in those diagnoses.

Method

Participants. Eleven first- and second-year family medicine residents from the University of Toronto and three second-year residents from McMaster University participated in this study during two group sessions. Informed consent was obtained, and the groups were given $30 for each resident who participated, to be distributed or used at the group's discretion. At the end of the experimental session, all participants were given feedback regarding both the purpose of the study and the clinical cases used.

Materials. Six clinical cases were selected that maintained the following properties. First, these cases were developed so that two diagnoses could be viewed as probable. 5 Second, for each of the two diagnoses, at least three features were present within the case that could be manipulated for the present purpose. That is, at least six features could be presented using either medical or lay terminology. This was manipulated in one of four ways. (1) A lay description could be converted to medicalese (e.g., shortness of breath vs. dyspnea). (2) A semantic qualifier could be absent or present (e.g., chest pains vs. retrosternal chest pain). (3) An interpretation of clinical data could be provided (e.g., WBC = 12,000 vs. WBC elevated [12,000]). (4) The absence of particular symptoms could be left implicit or mentioned explicitly (e.g., breathing was normal vs. there has been no wheezing or hoarseness). Notice that Bordage and Lemieux argued only that the second of these four manipulations provides an indication of the problem representation maintained by clinicians. 4 Four manipulations were used, however, to increase the flexibility with which the features could be described in a more or less technical manner. In principle it would be interesting to know if any one of these transformations is more influential than the others, but we opted against manipulating all forms of medicalese independently for this initial investigation due to limitations in the number of cases and participants that were available.

Two versions of each case were created. In one version the features consistent with Diagnosis A were presented using lay terminology while the features consistent with Diagnosis B were presented using medicalese. In the other version the reverse was true: the features consistent with Diagnosis A were presented using medicalese while the features consistent with Diagnosis B were presented using lay terminology. A complete set of cases, as presented to participants, is available from the authors. 3 Each version of each case was presented to half of the participants. The features that were manipulated did not represent a comprehensive list of either the entire case history or all of the features that were intended to be indicative of Diagnosis A or Diagnosis B. On average, each case history consisted of 14 lines of text. There were an average of nine features in each case history that were indicative of either Diagnosis A or Diagnosis B. Of these nine, three features consistent with Diagnosis A were manipulated, three features consistent with Diagnosis B were manipulated, and the remaining features were held constant across both versions of the case history.
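The counterbalancing described above can be sketched in a few lines of Python. This is only an illustration, not the authors' materials: the feature wordings are the examples given in the Materials section, but their assignment to Diagnosis A or Diagnosis B, the names `Feature`, `build_version`, and `assign_versions`, and the alternating assignment rule are all assumptions made for the sketch. The study manipulated three features per diagnosis; two are shown here for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    diagnosis: str   # "A" or "B" (hypothetical assignment, for illustration only)
    lay: str         # lay wording of the feature
    medicalese: str  # technical wording of the same feature


def build_version(features, medicalese_for):
    """Word the features of one diagnosis in medicalese and all others in lay terms."""
    return [f.medicalese if f.diagnosis == medicalese_for else f.lay for f in features]


def assign_versions(participant_ids):
    """Alternate versions 1 and 2 so that each is seen by half of the participants."""
    return {pid: 1 if i % 2 == 0 else 2 for i, pid in enumerate(participant_ids)}


features = [
    Feature("A", "shortness of breath", "dyspnea"),
    Feature("A", "chest pains", "retrosternal chest pain"),
    Feature("B", "WBC = 12,000", "WBC elevated [12,000]"),
    Feature("B", "breathing was normal", "there has been no wheezing or hoarseness"),
]

# Version 1: Diagnosis A in lay terms, Diagnosis B in medicalese.
version_1 = build_version(features, medicalese_for="B")
# Version 2: the reverse.
version_2 = build_version(features, medicalese_for="A")

# Fourteen hypothetical participant IDs, seven per version.
assignment = assign_versions(range(1, 15))
```

Because every manipulated feature appears in both wordings across the two versions, any difference in ratings between versions can be attributed to terminology rather than to the clinical content of the case.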

Procedure. Participants were free to work through the test booklet at their own pace. They were told, however, that they should not turn each page until they were committed to proceeding because they were not allowed to turn back to earlier pages. This instruction was given to ensure that (1) the participants read the case histories carefully before moving on to the questions specific to each case and (2) the numbers of tasks undertaken between reading the case and the memory test for that case were consistent across conditions. After reading the first case history, participants turned the page and were asked, “Based on the information that you have just read, please list the two diagnoses that you feel to be most likely.” On the next page they were shown Diagnosis A and Diagnosis B and asked to rate the probability that the case they had just read was representative of these diagnoses. Participants were also given an “all other diagnoses” option and were told that because the patient had only one disorder their responses to these three options should sum to 100. They were then asked to rate their confidence in their probability ratings using a 100-point scale. This procedure was repeated for the remaining five cases.

After the sixth case, participants were given a memory test. They were told, for example, that “the first case that you were shown focused on a 48-year-old woman with epigastric abdominal discomfort.” They were asked to recall everything that they could about the case. On the next page a cued-recall test was performed in which the participants were reminded of the two diagnoses for which they had assigned probabilities and were asked whether these diagnoses brought to mind any additional information. These two memory tests were then repeated for the remaining five cases.

Results

Hypothesis generation. The participants who received the medicalese versions were as likely as were those who received the lay versions to generate each of the diagnoses associated with each case. The diagnosis whose features were presented using medical terminology was named 45 out of a possible 84 times (14 participants × 6 cases), whereas the diagnosis whose features were presented using lay terminology was named 42 out of 84 times. One could argue that the first diagnosis named is the hypothesis that is predominant in the participants' minds, but there was also no difference across conditions of which diagnosis was named first (25 vs. 26 cases for the medicalese versus lay descriptions, respectively). These results are not terribly surprising as the cases were developed to be indicative of both diagnoses. These results also show that any differences observed in the probability judgments did not arise because of differences in the information presented in the two versions that simply make one of the diagnoses more likely to come to mind.

Probability judgments. The participants' belief in a diagnosis, as expressed in probability ratings, was greater when the features indicative of that diagnosis were presented using medical terminology than when the same features were presented using lay terminology. Diagnoses whose features were presented using medical terminology were assigned an average probability rating of 46.5. When the same information was presented using lay terminology, the average probability rating assigned to the same diagnoses dropped to 35.5 (t[83] = 2.31, p < .05).
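The reported 83 degrees of freedom are consistent with a paired comparison over 84 participant-by-case observations (14 participants × 6 cases), although the exact test is not spelled out in the text. Under that assumption, the statistic could be computed along the following lines. The function name `paired_t` and the four rating pairs are hypothetical and are not the study's data.

```python
import math
from statistics import mean, stdev


def paired_t(medicalese, lay):
    """Return (t, df) for a paired comparison of two equal-length rating lists."""
    if len(medicalese) != len(lay):
        raise ValueError("rating lists must be the same length")
    diffs = [m - l for m, l in zip(medicalese, lay)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical probability ratings for four participant-by-case observations.
medicalese_ratings = [50, 40, 60, 45]
lay_ratings = [40, 35, 30, 45]

t, df = paired_t(medicalese_ratings, lay_ratings)
print(f"t[{df}] = {t:.2f}")
```

With 84 pairs the same function would return 83 degrees of freedom, matching the form of the statistic reported above.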

Six separate comparisons can be made to ensure the robustness of this result by examining the probability ratings assigned to the diagnoses for each case independently. Table 1 illustrates, for each case, the mean probabilities assigned to both Diagnosis A and Diagnosis B as a function of whether each diagnosis was presented using a technical or lay description of their features. Within each case, the middle column indicates the probability assigned to Diagnosis A. Table 1 reveals that the participants believed the probabilities of these six diagnoses to be greater when the features consistent with Diagnosis A were presented using the technical description than when they were presented using the lay description. This was true for all except Case 3: gastritis versus myocardial infarction. The same can be said for Diagnosis B. The last column of each section of the table indicates for each case that Diagnosis B received a greater probability rating when the features consistent with Diagnosis B were presented using technical description than when the same features were presented using a lay description. Again, Case 3 was the lone exception.

TABLE 1

Memory test. Finally, a pair of 2 (Terminology: medicalese versus lay) × 6 (Case) ANOVAs were performed on the data generated by participants during the memory test. Examining the features that were manipulated across versions of the case histories, ANOVA revealed that significantly more features were recalled when they were presented using medicalese than when they were presented using lay terminology (F(1,152) = 6.88, MSE = 5.716, p < .02). The participants were more likely to remember features when they were presented using medicalese (mean = 1.123 [out of 3 = 37%]) than they were when the same features were presented using lay terminology (mean = 0.749 [out of 3 = 25%]). The same analysis revealed a main effect of case (F(5,152) = 2.73, MSE = 2.267, p < .05) but no case-by-terminology interaction (F(5,152) = 0.67, MSE = 0.557, p > .65).

A second ANOVA was performed on recall of the number of features consistent with either Diagnosis A or Diagnosis B, but not manipulated across version. This analysis revealed only a main effect of case (F(5,152) = 8.25, MSE = 4.912, p < .001). That is, the memorability of non-manipulated features was not affected by whether additional features consistent with the same diagnosis were presented using medicalese (mean = 0.772 [out of 1.5 = 51%]) or lay terminology (mean = 0.623 [out of 1.5 = 42%]).
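The percentages reported for the memory test follow directly from the mean number of features recalled and the number of features available for recall. The short check below reproduces that arithmetic from the reported means; the helper name `recall_percent` is ours. The denominator of 1.5 for non-manipulated features presumably reflects an average of three such features per case divided between the two diagnoses, which is our reading of the Materials section rather than something stated explicitly.

```python
def recall_percent(mean_recalled, max_features):
    """Convert a mean recall count to a whole-number percentage of the maximum."""
    return round(100 * mean_recalled / max_features)


# Manipulated features (three per diagnosis in each case).
manipulated_medicalese = recall_percent(1.123, 3)
manipulated_lay = recall_percent(0.749, 3)

# Non-manipulated features (1.5 per diagnosis on average).
constant_medicalese = recall_percent(0.772, 1.5)
constant_lay = recall_percent(0.623, 1.5)

print(manipulated_medicalese, manipulated_lay, constant_medicalese, constant_lay)
```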

Discussion

Lingard and Haber have argued that “the essence of physician-to-physician communication is deciding what is worth saying—and what is not.” 6(p. S124) The current study extends this statement in that whether or not something is perceived as worth having been said might be influenced by the way in which it was said. A resident who is told of a particular symptom using medical terminology appears to be more likely to view that feature as being clinically “relevant” than when the same symptom is presented using lay terminology. This was evidenced by the finding that presenting clinical features in medicalese rather than lay terminology increased the probability ratings that residents assigned to the relevant diagnosis. That is, participants were more confident in a diagnosis when the features consistent with that diagnosis were presented using medicalese than when the same information was presented using lay terminology. Furthermore, the use of medicalese increased the likelihood that residents remembered features consistent with that diagnosis. We think the effect on memorability acts as a precursor to the influence on probability ratings. That is, when evaluating the probability of potential diagnoses, features that are presented using medicalese will be better remembered, and therefore more influential, than will features presented using lay terminology. In contrast, it is possible that memorability was increased by disease classification, but this seems unlikely given the findings that (1) the participants were not more likely to generate a diagnosis when the implicating features were presented using medicalese and (2) the memorability of features that supported the diagnosis, but were not manipulated across version, did not change as a function of the terminology used.

In addition, we view the increased memorability of features presented in medicalese as complementary to Bordage and Lemieux's argument that individuals who mentally represent case histories using detailed semantics will have greater diagnostic ability than those who represent cases in more generic terms. 4 Our contention is that a partial explanation of their finding is that semantic qualifiers (and other forms of medicalese) may serve as cues to particular diagnoses. Medical personnel who detect and take advantage of such cues should, therefore, be more likely to arrive at the correct diagnosis. In further exploring the relationship between mental representations and diagnostic ability, Chang and colleagues 7 illustrated that diagnosticians need not represent all possible semantic attributes in order to use them to diagnostic advantage. Participants in their study who generated the correct diagnoses were more likely to use semantic attributes than were those who generated incorrect diagnoses, but still the number of semantic attributes stated by the “correct” group was 2.1 out of a possible 6. The small proportion of attributes generated with semantic qualifiers supports our contention that medicalese (semantic qualifiers in their case) creates a more mnemonically effective cue for the generation of particular diagnoses and that the presence of such a cue will bias diagnosticians' construal of the case as a whole.

References

1. Tulving E. Episodic and semantic memory. In Tulving E, Donaldson W (eds). Organization of Memory. New York: Academic Press.
2. Morris CD, Bransford JD, Franks JJ. Levels of processing versus transfer appropriate processing. J Verb Learn Verb Behav. 1977;16:519–33.
3. Eva KW. The influence of differentially processing evidence on diagnostic decision-making [doctoral dissertation]. Hamilton, ON, Canada: McMaster University, 2001.
4. Bordage G, Lemieux M. Semantic structures and diagnostic thinking of experts and novices. Acad Med. 1991;66(10 suppl):S70–S72.
5. Cunnington JPW, Turnbull JM, Regehr G, Marriott M, Norman GR. The effect of presentation order in clinical decision making. Acad Med. 1997;72(10 suppl):S40–S42.
6. Lingard LA, Haber RJ. What do we mean by “relevance?”: a clinical and rhetorical definition with implications for teaching and learning the case-presentation format. Acad Med. 1999;74(10 suppl):S124–S127.
7. Chang RW, Bordage G, Connell KJ. The importance of early problem representation during case presentation. Acad Med. 1998;73(10 suppl):S109–S111.

Section Description

Research in Medical Education: Proceedings of the Fortieth Annual Conference. November 4–7, 2001.

© 2001 by the Association of American Medical Colleges