How do physicians perform under conditions of time pressure? The answer to this question has potentially significant consequences for the quality of health care. Presently, no unequivocal answer is available, mainly because direct experimental evidence is lacking. However, the broader literature on the working conditions of physicians suggests that time pressure may have negative effects on performance.
Physicians report that they often work under time constraints and experience stressful working conditions. In a study of working conditions in primary care, Linzer et al1 found that 53% of physicians complained about time pressure during office visits. The pressure experienced was in turn associated with low job satisfaction, stress, burnout, and intent to leave the practice. A study of 115 internal medicine residents found that 75% of them showed signs of burnout, and those residents were more likely to self-report suboptimal patient care compared with residents without signs of burnout.2 DiMatteo et al3 found that dissatisfied physicians negatively influenced patient medication adherence. Moreover, physicians in another study attributed a large number of incidents affecting patient care (e.g., sloppy care, angry communication with patients, serious medical errors, and even death) to stress symptoms such as tiredness, high workload, and anxiety.4 In a qualitative study, Manwell et al5 identified time pressure as a major factor affecting the quality of patient care, particularly communication with patients.
Taken together, studies such as these suggest that stressful working conditions have negative effects on patient care and may lead to medical errors. Yet, although time pressure seems to have negative effects on working conditions, the extent to which it directly impairs diagnostic accuracy is not known. Indeed, severe time constraints may make a physician’s working life stressful without affecting his or her ability to arrive at an accurate diagnosis. The present study attempts to clarify this issue.
When investigating the relationship between time pressure and diagnostic error, one must consider the cognitive processes involved in making a diagnosis. It is generally assumed that early in an encounter with a patient, and based on limited data, the physician forms a few diagnostic hypotheses that are tested against information gathered subsequently.6 The emergence of these early diagnostic hypotheses is thought to be a spontaneous and automatic process without the conscious intervention of the physician, whereas searching for evidence in support of these hypotheses is supposed to be a more conscious and analytic process. Deciding which information is needed or relevant is not always a straightforward task, especially when the case is atypical and presents ambiguous and sometimes misleading signs and symptoms.
This description of the process of clinical reasoning fits with current dual-process theories of decision making, which suggest that two distinct psychological processes are at work when a clinician is diagnosing a case: System 1 nonanalytical reasoning and System 2 analytical reasoning. Nonanalytical reasoning, also called heuristic reasoning, depends on rapid, unconscious pattern recognition during which prior examples or illness scripts stored in long-term memory are retrieved.7 This type of reasoning is quick, intuitive, implicit, contextualized, and typically efficient in diagnosing routine cases.8 Despite its efficiency, however, System 1 reasoning is thought to be vulnerable to errors.9 On the other hand, System 2 reasoning is slow, reflective, sequential, effortful, and particularly used by physicians to diagnose complex cases.7 This is because under System 2, the available information is processed in a more deliberate and systematic manner. System 2 reasoning may eventually fail, but because of the systematic processing involved, it has been suggested that this type of reasoning minimizes errors generated through System 1 reasoning. As Evans and Curtis-Holmes10(p383) put it:
Biases can arise because the heuristic system fails to represent logically relevant features of the problem or represents features that are logically irrelevant to the problem. The evidence suggests that such heuristically generated biases can be inhibited, at least to some extent, by analytic system intervention.
Although diagnostic errors have typically been attributed to cognitive biases when using heuristics in rapid System 1 reasoning,9,11 little is known about how time pressure contributes to such errors. It seems reasonable to assume that having limited time to think and reason about a medical case will have adverse effects on diagnostic accuracy, as compared with a situation in which there is sufficient time to consider all the available information and to systematically evaluate possible hypotheses to eventually reach the most accurate diagnosis.
There is some evidence in the literature (although not the medical literature) suggesting that this is indeed the case.12 Svenson and Maule13 found that individuals under time pressure reported feeling impaired in their decision-making ability. In addition, the results of a study by Evans et al14 suggest that time pressure restricts the generation of inferences based on initial diagnostic hypotheses, as compared with a situation without time pressure. As a consequence, if there is limited time to evaluate diagnostic hypotheses in an analytical manner, the physician would tend to rely more on nonanalytical processing to compensate for the lack of time available. Nonanalytical reasoning is more vulnerable and prone to bias.12 One of the most common cognitive biases affecting nonanalytical reasoning is premature closure—that is, the failure to consider relevant alternatives after the initial diagnosis,7 such as when the clinician formulates a diagnosis based on pattern recognition or existing illness scripts without considering other possibilities. Although premature closure is suggested to be the most common bias, it is only 1 of more than 40 types of bias that have been identified as affecting clinical reasoning.9 For instance, belief bias appears to be a likely candidate when a physician is experiencing time pressure.14 Belief bias is the tendency to evaluate a case based on one’s initial belief despite being presented with new information that contradicts that belief. In one study, belief bias was found to increase with rapid responding, which ultimately reduced decision accuracy.10 Results of similar studies suggest that belief bias can be counteracted by using an analytic-reasoning approach.15,16 Recently, new neuroscientific evidence has emerged which suggests that the right inferior frontal cortex (IFC) plays a role in minimizing belief bias.17 However, activation of the right IFC takes time. The researchers found that when activation of the right IFC was restricted by time pressure, the heuristic system could not be inhibited, which resulted in mistakes.
Other studies support the notion that people adapt their thinking strategies to deal with the limited time available. Mandler18 found that when individuals were put under stress during decision making, the number of alternative solutions they produced was limited compared with those produced by a control group. Likewise, Evans et al14 found that when individuals were put under time pressure, there was a decrease in the number of generated inferences.
Although the studies reviewed here suggest that time pressure has potentially detrimental effects on diagnostic accuracy, it should be noted that many of these studies were carried out with rather artificial types of problems that people rarely encounter in everyday life. In addition, the number of experimental studies examining the effects of time pressure on the actual diagnostic performance of physicians is limited. Only recently have two studies examined the issue with residents and physicians diagnosing medical cases.19,20 These studies, though, seem to suggest that time pressure has no particular effect on diagnostic accuracy. For instance, Norman et al19 conducted a prospective controlled study involving residents, in which one cohort was requested to diagnose 20 medical cases as quickly as possible without making errors, and their performance was compared with the performance of another cohort that was instructed to be careful and slow. The results suggest that there was no significant difference in diagnostic accuracy between the quick and slow conditions, although the residents in the slow condition spent more time on diagnosing. In a follow-up study by Monteiro et al20 using emergency physicians and residents, the results were replicated, suggesting that there was no significant difference between the fast and slow conditions in diagnosing medical cases accurately.
The objective of the present study was to investigate the effect of time pressure on diagnostic accuracy. To that end, we presented to two groups of residents eight clinical cases to diagnose. The experimental group received instructions to be fast and, after diagnosing each case, received feedback on how much time was left and how far they were behind schedule. This feedback was intended to encourage them to work faster. The control group diagnosed the same cases without any reference to time pressure. We hypothesized that the group under time pressure would spend less time diagnosing the cases than the control group. In addition, we hypothesized that the diagnostic accuracy score of the group under time pressure would be significantly lower than that of the control group. Possible discrepancies between our findings and those of the studies described above are considered in the Discussion section.
The study was a randomized controlled experiment conducted in a six-week period during April–May 2014. The independent variable was time pressure (time pressure versus no time pressure), and the dependent variables were mean response time and mean diagnostic accuracy score. Ethical approval to conduct the study was granted by the institutional review board of the National Guard Health Affairs, Riyadh, Saudi Arabia.
Setting and participants
The internal medicine residency program in Saudi Arabia lasts four years and covers general internal medicine as well as the subspecialties. It consists of two stages, with junior (first- and second-year) and senior (third- and fourth-year) residents.
Forty-four senior internal medicine residents were included in this study. They were recruited from three main hospitals in the Riyadh region (King Abdulaziz Medical City, King Khalid University Hospital, King Saud Medical City). Subsequently, they were randomly assigned to either the experimental (time pressure) or control (no time pressure) condition. After randomization, no significant differences emerged in terms of years of training or age.
Participation was voluntary, and informed consent was obtained from those who participated. The specific purpose of the study was not disclosed to the participants beforehand, because knowing the hypotheses could have influenced their responses and thus the validity of the data. Participants were debriefed about the true aim of the study after the experiment.
Eight written clinical cases were used for this study, with the following diagnoses: hyperthyroidism, liver cirrhosis, inflammatory bowel disease, Addison disease, aortic dissection, acute viral hepatitis, pseudomembranous colitis, and acute appendicitis. The cases were written by internal medicine experts and had been used in previous studies.21,22 Each case was presented in English and was composed of a brief description of a patient’s medical history, complaints, signs, and symptoms; physical examination findings; and laboratory test results. See Box 1 for an example case.
Box 1. Example Case From the Eight Written Cases Used in the Study^a
A 38-year-old man with a 12-year history of ulcerative colitis was brought to the hospital with diarrhea containing blood, and abdominal pains. He felt fine until 2 weeks ago. Then he had a respiratory infection, influenza-like, and he was prescribed with erythromycin for 10 days. After 6 days of erythromycin he began having severe diarrhea with blood and mucus. The patient had traveled for holidays three months ago, and several fellow travelers had complaints of fever, nausea, and watery diarrhea.
BP: 115/75 mm Hg; pulse: 100/min; temperature: 38°C
Abdomen: distended, diffusely painful on palpation, no audible peristaltic sounds, no signs of peritoneal irritation.
White cell count: 7,200/mm³, with 60% segmented cells and 19% bands
Abdominal X-ray: dilated ascending colon; the transverse colon shows an aneurysm-like dilatation of the splenic flexure, and there are visible fluid levels.
Rectosigmoidoscopy: diffusely erythematous and friable mucous membrane.
^a The diagnosis for this case is pseudomembranous colitis. Study participants were senior (third- and fourth-year) internal medicine residents recruited from main hospitals in the Riyadh region of Saudi Arabia (King Abdulaziz Medical City, King Khalid University Hospital, King Saud Medical City). Residents were randomly assigned to the experimental group (time pressure; n = 23) or the control condition (without time pressure; n = 19).
The experiment was conducted in computer labs at the participating hospitals. The cases were presented, and data were collected using E-Prime 2.0 (Psychology Software Tools, Inc., Pittsburgh, Pennsylvania). On arrival at the computer lab, participants were randomly allocated to either condition and seated separately, so that each participant could only see his or her own computer screen. Participants were asked to work silently without interruptions (no phone calls, talking, etc.). They were informed about the broad purpose of the study—to understand the nature of clinical problem solving—and then were asked to log into the computer program. The computer program provided all further instructions regarding the task.
For the time-pressure (experimental) group, the on-screen instructions stated that during daily practice all physicians experience lack of time because there are usually more patients to be seen than there is time available. In addition, the instructions stated:
With the present study we are interested in exploring whether providing feedback about the pace of your work (relative to what remains to be done) would help you deal with time constraints. Therefore, you will receive after each case you have diagnosed, information about how much work still needs to be done and how much time is left for doing so. If time is running short, you can adapt by working your way faster through the next cases. It helps if you actively imagine yourself in a busy emergency room. There is a large number of patients to be seen during the rest of your shift and only very limited time is left. You will probably not be able to see all the cases, but try to work quickly, without compromising accuracy. Do your best to diagnose as many cases as possible.
To manipulate the perception of time pressure in the experimental condition, a visual cue using two bars was provided on the screen after each case. The number of cases still to be seen was represented by a green bar, whereas the time remaining was shown by means of a red bar (see Figure 1). This information was independent of the actual performance of the participant; it was designed to create the impression that the participant was permanently lagging behind schedule. In addition, text was generated between the two bars to provide feedback about progress. The textual feedback, which was also independent of the participant’s actual performance, was intended to be stressful, suggesting that the participant was increasingly falling behind schedule and had to hurry to catch up. In reality, there was no time restriction for responding to individual cases or for the overall experiment. The following are examples of sentences used to induce stress-related time pressure:
You are on track, but please try to work a bit faster!
Fast, but still spent more time than what was available for the first two cases
Necessary to speed up, much behind schedule
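The key property of this manipulation, that the feedback depended only on how many cases had been completed and never on the participant's actual speed, can be illustrated with a minimal sketch. This is not the E-Prime script used in the study. The function name `scripted_feedback`, the `lag_per_case` value, and the schedule that decides which sentence appears after which case are all hypothetical and chosen only for illustration; the three sentences are the examples quoted above.

```python
# Illustrative sketch only: the bar lengths and message schedule are assumptions,
# not the values used in the study.
FEEDBACK_MESSAGES = [
    "You are on track, but please try to work a bit faster!",
    "Fast, but still spent more time than what was available for the first two cases",
    "Necessary to speed up, much behind schedule",
]


def scripted_feedback(cases_done, total_cases=8, lag_per_case=0.03):
    """Return (cases_left, time_left, message) shown after a case.

    Both bar values are fractions between 0 and 1. The function takes no
    response-time argument, so the feedback cannot depend on actual performance.
    """
    if not 1 <= cases_done <= total_cases:
        raise ValueError("cases_done must be between 1 and total_cases")
    # Green bar: share of the work that still has to be done.
    cases_left = (total_cases - cases_done) / total_cases
    # Red bar: always shorter than the green bar, and the gap widens with each
    # case, which creates the impression of falling further behind schedule.
    time_left = max(0.0, cases_left - lag_per_case * cases_done)
    # Hypothetical schedule: the messages become more urgent as cases accumulate.
    message = FEEDBACK_MESSAGES[min(cases_done - 1, len(FEEDBACK_MESSAGES) - 1)]
    return cases_left, time_left, message


for case_number in range(1, 9):
    green, red, text = scripted_feedback(case_number)
    print(f"after case {case_number}: work left {green:.2f}, time left {red:.2f} | {text}")
```

Because the schedule is fixed in advance, every participant in the time-pressure condition would see the same sequence of bars and sentences regardless of how quickly he or she worked.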
For the control group, the on-screen instructions did not contain any reference to time. Participants were merely informed that they had a set of cases to diagnose:
A clinical case will appear in each screen. Please read the case and type the most likely diagnosis. Type only one complete and precise diagnosis which you find to be the most accurate for the case presented.
Participants in the control group did not receive visual cues or textual feedback on their progress as they worked through the cases.
In both conditions, after receiving the initial instructions, participants were given two example cases to practice on before diagnosing the eight actual cases. Cases were presented in random order, and participants typed in their diagnosis for each. Response time was recorded in seconds for each case.
Two general practitioners (F.T., M.M.), who were blinded to the experimental condition, independently scored diagnostic accuracy using the following scale: 0 = incorrect, 0.5 = partially correct, and 1 = correct. A diagnosis was considered correct when it included the main component of the diagnosis or the core diagnosis (e.g., “acute hepatitis A infection” in the case of acute viral hepatitis). A diagnosis was considered partially correct when one of the constituent elements of the diagnosis appeared, but the main diagnosis was not cited (e.g., “hepatitis” in the case of liver cirrhosis). A diagnosis was considered incorrect when it did not correspond to the main diagnosis and none of the constituent elements of the diagnosis appeared (e.g., “acute myocardial infarction” in the case of aortic dissection). The interrater agreement was 90.3%, and disagreements were resolved through discussion.
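The agreement figure can be read as simple percent agreement, that is, the share of responses to which both raters assigned the same score. The text does not state whether a chance-corrected index was also computed, so the sketch below assumes plain percent agreement. The function name `percent_agreement` and the example scores are hypothetical and are not study data.

```python
# Illustrative sketch: simple percent agreement on the 0 / 0.5 / 1 scale.
ALLOWED_SCORES = {0.0, 0.5, 1.0}  # incorrect, partially correct, correct


def percent_agreement(rater_a, rater_b):
    """Percentage of responses on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    if not set(rater_a) | set(rater_b) <= ALLOWED_SCORES:
        raise ValueError("scores must be 0, 0.5, or 1")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical scores for ten responses (not study data); the raters differ on one.
scores_a = [1, 0.5, 0, 1, 1, 0, 0.5, 1, 0, 1]
scores_b = [1, 0.5, 0, 1, 0.5, 0, 0.5, 1, 0, 1]
print(percent_agreement(scores_a, scores_b))  # 90.0
```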
For each participant, a mean diagnostic accuracy score and a mean response time were computed across all eight cases. A one-way analysis of variance (ANOVA) was performed to determine differences in diagnostic accuracy and response time between the two conditions. The significance level was set at α = .05 for all tests. Data were analyzed using SPSS version 21 (IBM Corp., Armonk, New York).
All participants completed all eight cases. Two of the 44 participants were excluded from the analysis because the descriptive statistics revealed that both individuals constituted significant outliers in terms of response time. One outlier was from the experimental group (mean response time = 327.78 seconds), and one was from the control group (mean response time = 348.56 seconds). These values are more than four standard deviations (SDs) above the mean response time for their respective groups, and there are thus reasons to believe that these two participants responded in an atypical manner. The remaining 42 participants included 37 men and 5 women. Their mean (SD) age was 29.1 (4.44) years, and their mean (SD) clinical experience was 3.79 (2.33) years. There were 23 participants in the experimental condition and 19 in the control condition.
The mean (SD) response time for the time-pressure group was 96.00 (28.69) seconds (95% CI, 83.60–108.41) and for the control group was 151.97 (54.29) seconds (95% CI, 125.80–178.13). The results of the ANOVA indicated that the response time for participants in the time-pressure condition was significantly lower than the response time for the control condition participants: F(1, 40) = 18.32, P < .001, η² = 0.31. This difference in response time between the two groups suggests that the experimental treatment did indeed work; participants in the time-pressure condition diagnosed the cases significantly faster than participants in the control condition.
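As a rough consistency check, the confidence intervals, the F value, and the effect size for response time can be approximately recovered from the reported means, SDs, and group sizes alone. The sketch below is not the SPSS analysis itself. It assumes the usual formulas for a two-group one-way ANOVA, within-group degrees of freedom of N − 2 = 40, and two-sided 95% critical values of the t distribution taken from standard tables. The function names are hypothetical, and because the inputs are rounded the results agree with the reported figures only to within rounding.

```python
import math


def ci95(mean, sd, n, t_crit):
    """95% confidence interval for a group mean: mean +/- t * SD / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def anova_two_groups(m1, sd1, n1, m2, sd2, n2):
    """F, eta squared, and within-group df for two groups, from summary statistics."""
    ss_between = n1 * n2 / (n1 + n2) * (m1 - m2) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_within = n1 + n2 - 2
    f_value = ss_between / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared, df_within


# Reported response-time summaries: (mean, SD, n).
time_pressure = (96.00, 28.69, 23)
control = (151.97, 54.29, 19)

# Two-sided 95% t critical values from standard tables, keyed by df = n - 1.
T_CRIT = {22: 2.0739, 18: 2.1009}

# Both intervals come out close to the reported 83.60-108.41 and 125.80-178.13.
print([round(x, 2) for x in ci95(*time_pressure, T_CRIT[22])])
print([round(x, 2) for x in ci95(*control, T_CRIT[18])])

f_value, eta_squared, df_within = anova_two_groups(*time_pressure, *control)
print(round(f_value, 2), round(eta_squared, 2), df_within)  # 18.32 0.31 40
```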
The mean (SD) diagnostic accuracy score for the time-pressure condition was 0.33 (0.23) (95% CI, 0.23–0.43) and for the control condition was 0.51 (0.19) (95% CI, 0.41–0.60). The results of a second one-way ANOVA revealed that the time-pressure group had a significantly lower diagnostic accuracy score as compared with the control group: F(1, 40) = 6.90, P = .012, η² = 0.15. This outcome suggests that participants in the time-pressure condition made on average 37% more errors than participants in the control condition.
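The 37% figure appears to follow from comparing error rates rather than accuracy scores. The short sketch below reproduces the arithmetic under the assumption that the error rate in each condition is one minus the mean accuracy score, which treats a partially correct diagnosis (0.5) as half an error. The function name is hypothetical.

```python
def relative_error_increase(accuracy_treatment, accuracy_control):
    """Percentage increase in error rate of the treatment group over the control group."""
    error_treatment = 1.0 - accuracy_treatment  # 1 - 0.33 = 0.67
    error_control = 1.0 - accuracy_control      # 1 - 0.51 = 0.49
    return 100.0 * (error_treatment - error_control) / error_control


# Reported mean accuracy scores: 0.33 (time pressure) and 0.51 (control).
print(round(relative_error_increase(0.33, 0.51)))  # 37
```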
Discussion and Conclusions
This study investigated the effects of time pressure on diagnostic performance. We hypothesized that (a) the participants in the time-pressure condition would spend less time diagnosing the eight medical cases than the participants in the control condition, and (b) the diagnostic accuracy score of the participants in the time-pressure condition would be significantly lower than that of the participants in the control condition. We reasoned that participants in the time-pressure condition would spend less time processing the available information analytically and therefore would have to rely more on initial hypotheses produced through nonanalytical reasoning. Under these circumstances, analytical error-correction mechanisms cannot do their work properly.12 To test our hypotheses, we conducted an experiment in which internal medicine residents diagnosed eight written clinical cases either under time pressure or without time pressure.
The results of our study demonstrate that the experimental treatment was successful in manipulating the perception of time pressure: As we hypothesized, participants in the time-pressure condition spent less time per case (on average 56 seconds less) than did control group participants. Moreover, the participants in the time-pressure condition made significantly more errors (37% more on average) than the control group did. In line with dual-process theory,12 we interpret these findings as follows: If there is insufficient time to fully process a medical case, clinicians rely more on System 1 than System 2 reasoning, because System 1 reasoning is intuitive, effortless, and fast. However, a tradeoff of this intuitive and faster approach is that it may be prone to errors because not all information will be analytically considered and processed. For instance, the results of several studies suggest that time pressure reduces the quality of decision making,23 causes switching to simpler strategies,13 and results in a preference for low-risk judgments.24 Moreover, decisions made under time pressure tend to be less accurate and more prone to cognitive biases, such as premature closure7 and belief bias.10,14
Throughout this article, we have assumed that time pressure mainly affects System 2 reasoning: The physician simply does not have enough time to systematically and analytically process the evidence supporting or falsifying his or her initial hypotheses. However, an alternative explanation, not suggested by the present formulation of dual-process theory, is also possible. Time constraints may affect nonanalytical System 1 reasoning as well—or even exclusively. Perhaps time pressure constrains the number and the quality of initial hypotheses generated. If these initial hypotheses are smaller in number and less relevant to the patient problem at hand, then the diagnostic process as a whole suffers. We were, however, not in a position to test this idea as it would have involved asking participants to indicate all diagnostic hypotheses that came to mind as part of the diagnostic process. In the present study, we asked only for the most likely diagnosis.
Our findings regarding the influence of time pressure on diagnostic accuracy are at variance with two recent studies on the effect of time constraints on diagnostic reasoning.19,20 Norman et al19 asked groups of residents to process a case either quickly or slowly. The instructions for the Speed cohort emphasized that participants should be as quick but as accurate as possible, whereas the instructions for the Reflect cohort emphasized thoroughness and care. For the Speed cohort, a red timer button was displayed on screen that showed elapsed time on each case. Although the Speed cohort spent about 30% less time on diagnosing the cases than the Reflect cohort, no differences in diagnostic accuracy emerged, suggesting that time pressure has no effect on diagnostic accuracy. Using a similar design, Monteiro et al20 replicated these findings.
How can the discrepancies between our findings and theirs be explained? We will discuss this incongruity here at some length, because doing so may clarify why time pressure in some cases negatively affects performance while effects are absent in other cases.
A first and straightforward explanation for the differences is that the amount of pressure put on the participants in Norman and colleagues’19 study may have been less than in our study. The participants in their Speed condition were asked to be quick but as accurate as possible—accuracy was emphasized twice in the instructions. They also received information about the amount of time that had elapsed. In our study, the instructions to the time-pressure group emphasized time constraints and the importance of working fast six times, while accuracy was mentioned once. More important perhaps, the visual and textual feedback provided to our participants on their performance was manipulated such that they were always behind schedule. This may explain why we found significant differences in diagnostic accuracy between the time-pressure and control groups whereas the other two studies did not. However, what speaks against this explanation is that Norman et al also found an effect on processing time between the two groups (though their response-time difference was smaller compared with ours). If our findings are to have validity, it must mean that time used for diagnosing a case is a less important indicator of what physicians are doing while processing a case than is often thought.
There are signs that this may be the case. Although another study25 found that amount of time spent on a case was inversely correlated with diagnostic accuracy (i.e., the more time spent, the more mistakes made), Mamede et al26 demonstrated that under some conditions spending more time leads to better diagnoses. A similar finding also emerged from a recent large study27 of residents’ responses on an internal medicine certification examination, in which further reflection on initial responses improved diagnostic performance, especially for more complex cases. In addition, in the study by Norman et al19 discussed here, residents in the slow condition did not make more mistakes (but rather, slightly fewer) than those in the fast condition. Perhaps time spent on diagnosing a case is a by-product of the reasoning processes involved rather than a crucial causal determinant of performance.
A second possible explanation takes into account the difficulty level of the cases used in Norman and colleagues’19 study. Those cases were described as fairly complex, and the rather modest performance of both groups (44%–45% accurate diagnoses) attests to this level of difficulty. It may be that these cases on average were so difficult that any possible effect of time pressure was simply masked by the difficulty level: Even those participants who spent more time could not make much of it. This hypothesis finds support in our own data. When we selected the four most difficult cases (out of the eight used) and repeated the ANOVA, the difference in performance disappeared: F(1, 40) = 3.33, P = .08, η² = 0.08. The difference in response time remained intact: F(1, 40) = 15.37, P < .001, η² = 0.28. This post hoc analysis replicates the findings presented by Norman et al.19
There is still another potential explanation for some of the discrepancies. It involves the level of expertise of the participants. Monteiro et al20 found no effect of time pressure, both for residents and experienced emergency physicians, although the participants in the fast condition used less time to arrive at diagnoses. However, when comparing diagnostic performance between residents and emergency physicians, their results suggest that the emergency physicians were more accurate. It may therefore be possible that those with more experience are simply less susceptible to attempts to put pressure on them. If time pressure hinders mainly System 2 processes, less experienced physicians, who tend to rely more on this type of reasoning relative to their more experienced colleagues, would be more prone to suffer the effect of time pressure. To examine whether there were any interaction effects between time pressure and experience in our study, we divided our sample into less versus more experienced residents using the median as cutoff point and conducted a one-way ANOVA in which we included only the more experienced participants from the time-pressure and control groups. The results of this post hoc analysis—F(1, 20) = 0.21, P = .65, η² = 0.01—are in line with the findings of Monteiro et al,20 suggesting that time pressure does not influence performance. However, differences in processing time between the participants in the time-pressure and control groups remained intact: F(1, 20) = 10.82, P = .004, η² = 0.36.
Our study has several limitations. First, the mechanism mediating between the amount of pressure experienced by the participants and the mistakes they made remained largely elusive. Some authors point at the role of emotion, in particular stress, as a mediating factor.28 Stress would in this view lead to more superficial processing of the information in the case. Our design did not allow for direct measurement of such influence. Second, the participants were not experienced physicians but, rather, residents in their third and fourth years of training. Our post hoc analyses suggest that more experienced physicians are less susceptible to the negative effect of time pressure, possibly because they rely more on nonanalytical reasoning and require less analytical reasoning.29 A third limitation is the use of written cases to evaluate the effect of time pressure on diagnostic accuracy, which may be viewed as restricting the generalization of findings to real settings. Conducting experiments such as the one described here in a real setting is not feasible, as unavoidable variability in patient presentation would add too much noise. In addition, it has been shown that clinical case scenarios compare favorably to other methods used to evaluate quality of clinical practice, such as chart abstraction or standardized patients,30,31 and are a reliable and valid method to detect differences of performance between groups of physicians.32
In summary, it seems that the presence or absence of an effect of time pressure on diagnostic accuracy is moderated by both the difficulty level of the cases and the level of experience of the physicians involved. If a case is difficult, having more time to diagnose it may not help. If the physician is experienced, having limited time may not negatively affect his or her performance. However, there appears to be a window within which time pressure is influential. When a case is not too difficult (without being obvious) and the physician is not that experienced (e.g., a resident), time pressure may play a negative role by obstructing System 2 analytical processing, by interfering with System 1 nonanalytical processing, or both. If one believes—as we do—that the health care system is populated with patients presenting not-too-complex problems and with physicians of intermediate experience levels, one may assume that negative effects of time pressure on diagnostic performance are endemic.
Acknowledgments: The authors would like to acknowledge Dr. Tarfa Almoharib, Department of Dentistry, Security Forces Hospital, Riyadh, Saudi Arabia, for her help in recruiting the residents. The authors also thank the residents who dedicated their limited free time to participate in the research.