Amidst concerns that time spent in direct contact with patients is decreasing,1 research has been conducted to investigate the time physicians actually spent in patient encounters. Using data collected from the National Ambulatory Medical Care Survey (NAMCS) of the National Center for Health Statistics and the American Medical Association’s Socioeconomic Monitoring System (SMS), Mechanic, McAlpine and Rosenthal2 showed that the length of office visits was actually increasing. Their results showed that average office visit durations were 16.3 minutes and 20.4 minutes, based on data from the NAMCS and SMS, respectively. In contrast, based on direct observation, other research3,4 indicated that these values overestimate the time spent. However, average visit duration varied considerably as a function of patient characteristics.5,6
Blumenthal et al.5 found that older patients and the presence of psychosocial problems were associated with longer visit durations and, in another study concentrating on patients 45 years or older, Lo, Ryder and Shorr6 found that the mean visit duration was 17.9 minutes (SD = 8.5). These results indicate that, based on actual clinical interactions, patient characteristics (or conditions) can influence how long a physician will spend gathering information. However, relatively little research has been conducted to investigate the factors that impact timing in simulated clinical situations.
Although timing information was used to establish guidelines for the duration of a clinical skills assessment for medical students and graduates,7,8 the analyses were undertaken simply to determine a suitable limit for modeled clinical interactions. The extent to which the characteristics of the patient affect the expected duration of the medical interview and physical examination can provide guidance as to the specific types, and complexities, of cases that can be modeled and, to some extent, limits on how they should be combined in a test form. More important, studies of “speededness” can provide evidence to support the validity of the assessment. Adequate time limits ensure that factors unrelated to clinical skills and abilities are not reflected in scores and associated interpretations (e.g., pass/fail decisions).
Even though previous research on the Clinical Skills Assessment (CSA) indicated that the 15-minute time limit was adequate,9 this examination was only administered to graduates of international medical schools (IMGs). The purposes of this investigation were to gather information regarding the relationship between encounter time and case characteristics for the Step 2 Clinical Skills (CS) assessment and to provide evidence that the time provided to gather data was adequate.
Since June 2004, a clinical skills examination has been administered as part of United States Medical Licensing Examination (USMLE) Step 2 CS. This is a standardized patient (SP) examination in which the examinee completes 12 encounters with persons trained to portray patients with common clinical complaints.10 Cases are classified by age, clinical presentation, acuity, gender, the presence of physical findings, and station format. Because the examination is administered daily, different sets of cases (test forms) are administered in each testing session. Blueprint constraints control the variety of cases and characteristics appearing in each test form.
The examinee has 15 minutes to interview and examine the SP, and is instructed to ask relevant medical history questions and perform a focused physical examination. Following each encounter, the SP records the medical history questions asked and physical examination maneuvers performed using a case-specific checklist stored on computer. The examinee has 10 minutes to record their summary of the encounter, diagnostic impressions, and initial workup in a postencounter exercise. Although there is a 15-minute time limit, examinees may end the encounter early and use the remaining time allotted for the encounter to complete the postencounter exercise. Encounter start time, the time at which the SP begins recording the checklist information, and the time that the checklist information is submitted are stored in the database.
Examinees who were students or graduates of U.S./Canadian medical schools taking Step 2 CS for the first time were selected for this analysis. Data collected from all administrations of the Step 2 CS in calendar year 2005 (January to December) were reviewed for inclusion in the analyses. Encounter duration was calculated by subtracting the start of the encounter recorded by computer from the time the SP began completion of the case-specific checklist for the encounter. Although SPs are instructed to begin their recording (scoring) as soon as the examinee leaves the room, this calculation of encounter time could overestimate the actual duration of the encounter. Administrations for which encounter information was recorded on paper and entered into the database at a later date were eliminated because it was not possible to capture information on encounter start and end times. The sample included in the analysis consisted of the 194,667 encounters between SPs and U.S./Canadian medical students taking the examination for the first time.
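The duration calculation described above can be sketched as a simple timestamp subtraction. This is a minimal illustration, not the study's actual processing code: the function name, argument names, and example timestamps are hypothetical.

```python
from datetime import datetime


def encounter_minutes(encounter_start: datetime, checklist_start: datetime) -> float:
    """Return minutes from encounter start to the moment the SP began the checklist.

    Any delay before the SP starts scoring is included, so the result can
    overestimate the true face-to-face time.
    """
    delta = checklist_start - encounter_start
    if delta.total_seconds() < 0:
        raise ValueError("checklist start precedes encounter start")
    return delta.total_seconds() / 60.0


# Hypothetical timestamps for a single encounter.
start = datetime(2005, 3, 14, 9, 0, 0)
checklist = datetime(2005, 3, 14, 9, 13, 30)
duration = encounter_minutes(start, checklist)
```

Encounters recorded on paper would have no usable pair of timestamps, which is consistent with their exclusion from the sample.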
Case characteristics for all cases administered in 2005 were collated, and characteristics of cases used in test administrations were categorized by acuity (acute vs. subacute/chronic), case age (under 18, 18–44, 45–64, and over 65 years), gender, clinical presentation (cardiovascular, constitutional, gastrointestinal, genitourinary, musculoskeletal, neurological, psychiatric, respiratory, women’s health, other), and station format (medical history only, medical history and physical exam, telephone).
Data were analyzed at the encounter level. To test hypotheses regarding the effects of case characteristics on encounter time, a univariate analysis of variance (ANOVA) was conducted with encounter time as the dependent variable. Case characteristics were the independent variables in the model. Because case characteristics were interrelated, only main effects were analyzed; interactions were suppressed. This type of analysis estimates the relative contribution of each factor to the variance in encounter duration. Unweighted mean (estimated marginal) encounter times were calculated based on case characteristics. Estimated marginal means control for the potential effect of unequal counts of examinee-SP encounters that result from different cases appearing in different test forms. The proportion of variance (partial eta squared) accounted for by each case characteristic was computed.
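The difference between a weighted mean and an unweighted (marginal) mean can be illustrated with a small sketch. The data below are invented, and averaging cell means across one other factor is a simplification: in the main-effects model described above, the estimated marginal means would come from the fitted model rather than from raw cell means.

```python
from statistics import mean

# Hypothetical encounter times (minutes), keyed by (station format, acuity).
# Cell sizes are deliberately unequal.
cells = {
    ("history_only", "acute"): [10.0, 11.0, 12.0, 11.0],
    ("history_only", "chronic"): [9.0],
    ("history_and_physical", "acute"): [14.0],
    ("history_and_physical", "chronic"): [13.0, 14.0, 15.0, 14.0],
}


def weighted_mean(fmt: str) -> float:
    """Mean over all encounters of a format; large cells dominate."""
    times = [t for (f, _), ts in cells.items() if f == fmt for t in ts]
    return mean(times)


def unweighted_mean(fmt: str) -> float:
    """Mean of the cell means; each acuity level counts equally."""
    cell_means = [mean(ts) for (f, _), ts in cells.items() if f == fmt]
    return mean(cell_means)
```

In this toy data, the history-only format has a weighted mean of 10.6 minutes but an unweighted mean of 10.0 minutes, because the single chronic encounter carries as much weight as the four acute ones.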
Across the cases and administrations in 2005, encounter time calculated at the encounter level ranged from a minimum of 1 minute to the maximum of 15 minutes, and the weighted mean encounter time was 13.4 minutes (SD = 2.1). Most encounters (99.3%) lasted 5 minutes or more. Based on the encounters of the 16,222 U.S./Canadian medical school students/graduates taking the examination for the first time, about two-thirds of the encounters (n = 129,229) took less than the allotted 15 minutes.
Analyses were conducted to identify the relationship between case characteristics and encounter time. Table 1 provides unweighted means (estimated marginal means) for encounter time based on case characteristics. To determine the effect of case characteristics, the proportion of variance accounted for by each of these characteristics was calculated (partial eta squared). It was expected that certain case characteristics would affect the amount of time spent in encounters with SPs. For example, based on research conducted with actual patients, it was expected that interactions with older patients would take longer than those with younger patients.
The age category for the case was expected to have an effect on encounter time, and this hypothesis was confirmed by the results obtained (F(1, 194666) = 1074.8, p < .01). Unweighted mean encounter time for cases classified as 18–44 was slightly shorter (unweighted mean = 10.5, SE = 0.02) than for those classified as over 65 (unweighted mean = 11.1, SE = 0.02). Age categorization accounted for 2% of the variance in encounter time. The relationship between encounter time and case classification for clinical presentation was also investigated. There was a statistically significant effect associated with this case characteristic (F(1, 194666) = 576.0, p < .01), and this factor accounted for 3% of the variance in encounter time. Differences among the unweighted means were small. There was a statistically significant effect found for station format (F(1, 194666) = 18,658.1, p < .01). Stations where examinees were expected to complete a focused medical history interview and relevant physical examination took much longer (unweighted mean = 13.8 minutes, SE = 0.01) than history-only cases (unweighted mean = 10.6 minutes, SE = 0.02) or telephone cases (unweighted mean = 8.2 minutes, SE = 0.03).
There was no statistically significant difference in encounter duration based on whether the standardized patient’s condition was acute (F(1, 194666) = 0.1, p = .8). The unweighted mean encounter time for acute cases (unweighted mean = 10.9, SE = 0.01) was no longer than for cases classified as subacute/chronic (unweighted mean = 10.9, SE = 0.01). This factor accounted for none of the variance in encounter time (partial eta squared = 0.00). Encounters with cases classified as female were of slightly shorter duration (unweighted mean = 10.8, SE = 0.01) than those classified as male (unweighted mean = 10.9, SE = 0.01). Although this result was statistically significant (F(1, 194666) = 16.0, p < .01), the effect size estimate indicated that this factor accounted for none of the variance in encounter time. There was a statistically significant difference in encounter duration based on whether there were physical findings in the case (F(1, 194666) = 58.0, p < .01). Cases where there were physical examination findings took slightly more time (unweighted mean = 13.7 minutes, SE = 0.02) than those where there were no physical findings (mean = 13.1 minutes, SE = 0.01).
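The contrast between statistical significance and effect size for gender can be checked with the standard identity that recovers partial eta squared from an F statistic and its degrees of freedom. The sketch below is an illustration only: the function name is hypothetical, and it takes the degrees of freedom as reported in the text rather than from the study's software output.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared = SS_effect / (SS_effect + SS_error).

    Since F = (SS_effect / df_effect) / (SS_error / df_error), this equals
    (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for gender and acuity.
gender = partial_eta_squared(16.0, 1, 194666)
acuity = partial_eta_squared(0.1, 1, 194666)
```

With nearly 195,000 encounters, an F of 16.0 is statistically significant, yet the corresponding partial eta squared rounds to 0.00, which matches the statement that gender accounted for none of the variance in encounter time.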
To illustrate differences in encounter time based on case characteristics, Figure 1 shows the distribution of encounter time based on station format. As expected, encounter time varied considerably based on whether examinees were expected to gather medical history or to ask both history questions and perform physical examination maneuvers. This figure shows that there was also considerable variation in encounter time for telephone cases.
The purpose of the current study was to investigate the relationship between case characteristics and encounter time in a standardized patient examination. The time limits set for the examination were found to be adequate, with 66% of the encounters of first-time U.S./Canadian medical students/graduates taking less than the 15-minute limit. The relationships between case characteristics and encounter time were intuitive as well, with stations requiring both history and physical examination taking more time than those requiring only a medical history interview. Examinees adjust the time spent based on patient characteristics. There is considerable variation in time spent based on station format and, interestingly, time spent in the telephone station is most variable. This suggests that the examinees may find the format challenging and unusual.
Although there were large differences in time spent based on case clinical presentation and station format, there was only a weak relationship between encounter time and case classification. Consistent with what physicians actually do in practice, there was an effect associated with the age of the patient. Nevertheless, given that test forms are balanced for these characteristics, the overall time-demand impact on individual examinees should be negligible.
Although these findings provide some information about examinee time use in a standardized patient examination, the study is not without limitations, some of which provide guidance for future research. First, encounter time was estimated based on when the SP began scoring. Although SPs are instructed to begin scoring as soon as the encounter ends, there can be delays at this point, and thus our timing data may overestimate the actual time spent. Second, the effect of administration sequence on encounter time was not examined. Previous research9,11 indicates that sequence may have an effect on both time usage and the resultant scores. Third, we did not examine the complex interactions among case and examinee characteristics. It is reasonable to expect that the interaction between physician and patient characteristics could have some impact on how the interview is conducted, potentially speeding up or slowing down the data gathering process. Fourth, we did not look specifically at SP characteristics. For example, do multiple SPs portraying the same case result in similar encounter times, on average, controlling for examinee ability? The results of this type of research can be used to inform quality assurance and test development processes. Last, examinees are given the opportunity to finish the patient interview and assessment early, leaving more time for the postencounter exercise (patient note [PN]). Depending on the motivation for doing so (e.g., belief that the postencounter exercise is more difficult or more important), the encounter timing data may be biased. Here, it would be useful to look at both data gathering and PN scores as a function of time on task.
The results of our study suggest that encounter times vary logically as a function of the inherent information gathering demands of the case. In developing cases for standardized patient examinations, it is important to ensure that there are sufficient cases in the bank to provide a variety of challenges to examinees. Because a detailed blueprint is used to generate test forms (case mixes) and examinees tend, on average, to use fewer than 15 minutes, it would appear that the Step 2 CS timing limits are both adequate and fair.
1 Kassirer JP. Doctor discontent. N Engl J Med. 1999;340:584–8.
2 Mechanic D, McAlpine DD, Rosenthal M. Are patients’ office visits with physicians getting shorter? N Engl J Med. 2001;344:198–204.
3 Gilchrist VJ, Stange KC, Flocke SA, McCord G, Bourguet CC. A comparison of the National Ambulatory Medical Care Survey (NAMCS) measurement approach with direct observation of outpatient visits. Med Care. 2004;42:276–80.
4 Gottschalk A, Flocke SA. Time spent in face-to-face patient care and work outside the examination room. Ann Fam Med. 2005;3:488–93.
5 Blumenthal D, Causino N, Chang YC, et al. The duration of ambulatory visits to physicians. J Fam Pract. 1999;48:264–71.
6 Lo A, Ryder K, Shorr RI. Relationship between patient age and duration of physician visit in ambulatory setting: does one size fit all? J Am Geriatr Soc. 2005;53:1162–7.
7 Ziv A, Boulet JR, Burdick WP, Friedman Ben-David M, Gary NE. The use of national medical care surveys to develop and validate test content for standardized patient examinations. In: Melnick D, ed. Proceedings of the Eighth Ottawa Conference on Medical Education and Assessment. Philadelphia: National Board of Medical Examiners, 2000:99–105.
8 Boulet JR, Gimpel JR, Errichetti AM, Meoli FG. Using national medical care survey data to validate examination content on a performance-based clinical skills assessment for osteopathic physicians. J Am Osteopath Assoc. 2003;103:225–31.
9 Chambers KA, Boulet JR, Gary NE. The management of patient encounter time in a high-stakes assessment using standardized patients. Med Educ. 2000;34:813–17.
10 USMLE. Step 2 Clinical Skills (CS) content description and general information. Philadelphia: Federation of State Medical Boards of the United States, National Board of Medical Examiners, 2006.
11 McKinley DW, Boulet JR. The effects of task sequence on examinee performance. Teach Learn Med. 2004;16:18–22.