Forecasting and Perception of Average and Latest Hours Worked by On-Call Anesthesiologists

Dexter, Franklin, MD, PhD*; Epstein, Richard H., MD; Elgart, Richard L., MD; Ledolter, Johannes, PhD

doi: 10.1213/ane.0b013e3181b0ffcc
Economics, Education, and Policy: Research Reports

BACKGROUND: We studied the value of providing information to anesthesia providers about the length of time typically worked during on-call shifts. The mean time at which a shift ends can be used for purposes of trades, payments, or reverse auctions, because the mean is proportional to the total time. The 80th percentile (with a suitable upper confidence limit for uncertainty due to limited sample sizes) can be used for judging the earliest time by which after-work activities reasonably can be planned.

METHODS: (A) Three years of operating room (OR) information system data were analyzed. Dependent variables were the earliest times when the numbers of ORs running were always ≤6 ORs, ≤4 ORs, and ≤2 ORs. We progressively built linear regression models for each of the three dependent variables using day of the week, scheduled number of cases, scheduled hours of cases (including turnovers), and linear time trend. Calculations were repeated after excluding outliers and again using regression trees. (B) Anesthesiologists were surveyed about their perceptions of the mean and 80th percentiles.

RESULTS: (A) For the three thresholds and two end points (mean and 80th percentile), differences among days of the week were as large as 45 min. Differences between end points for the same weekdays were as large as 245 min. Comparatively, additional knowledge about the number or hours of cases provided in the late afternoon on the working day before surgery reduced the mean absolute error by only 4.1–6.0 min. Results were insensitive to a variety of analytic methods. Information available more days before the day of surgery (e.g., 1 wk) would have had even less incremental predictive value. (B) The mean absolute error of anesthesiologists’ estimates for 80th percentiles was 60 min, principally because of underestimation of the 80th percentiles. More than half (69%, P = 0.0003) of anesthesiologists’ estimates for 80th percentiles had error >30 min, whereas errors of this magnitude were less for the mean (44%, P = 0.0004).

CONCLUSIONS: Historical data from OR or anesthesia information management systems, or from anesthesia billing systems, can be used months before staff scheduling to provide insight to anesthesia providers on respective calls. The data are useful because experience provides limited intuition. Updates on scheduled workload available closer to the day of surgery provided only marginal increases in knowledge over the use of historical data.

From the *Division of Management Consulting, Departments of Anesthesia and Health Management and Policy, University of Iowa, Iowa City, Iowa; †Department of Anesthesiology, Jefferson Medical College, Philadelphia, Pennsylvania; and C. Maxwell Stanley Professor, Department of Management Sciences, University of Iowa, Iowa City, Iowa.

Accepted for publication May 22, 2009.

Franklin Dexter is the Section Editor of Economics, Education, and Policy for the Journal. This manuscript was handled by Steve Shafer, Editor-in-Chief, and Dr. Dexter was not involved in any way with the editorial process or decision.

The University of Iowa performs statistical analyses for hospitals and anesthesia groups. Dr. Dexter receives no funds personally other than his salary from the State of Iowa, including no travel expenses or honoraria, and has tenure with no incentive program. RHE is President of Medical Data Applications, Ltd., whose CalculatOR™ software includes the analyses considered in this article.

Address correspondence and reprint requests to Franklin Dexter, MD, PhD, Division of Management Consulting, Department of Anesthesia, University of Iowa, Iowa City, IA 52242. Address e-mail to Franklin-Dexter@UIowa.edu or web site www.FranklinDexter.net.

Months before the day of surgery, an anesthesia group can use statistical methods to determine its staffing needs.1 Often, the numbers of operating rooms (ORs) to be run at certain times on workdays are specified contractually with hospital administration, ideally based on statistical assessments of historical workload.1,2 Staff scheduling is then done in accordance with the group’s staffing obligations.3–6 When anesthesiologists are given the choice to work different numbers of hours for different compensation, they choose to work significantly different total hours.7 Therefore, if information about hours worked when on-call were provided to anesthesia providers when they make their scheduling decisions, such information would be useful if new and accurate.

In this article, we study the potential value of providing information to anesthesia providers about the length of time typically worked during available shifts. The mean time at which a shift ends can be used for purposes of trades, payments, or reverse auctions,8 because the mean is proportional to the total time. The 80th percentile (with a suitable upper confidence limit for uncertainty due to limited sample sizes) can be used for judging the earliest time by which other activities reasonably can be planned (e.g., attending a child’s school activity or asking one’s spouse to wait for dinner).

One way we assessed the value of providing the mean and 80th percentile was to survey anesthesiologists at a department to learn whether they have accurate (≤30 min) perceptions of the mean and the 80th percentile. If this were the situation, then providing this information would not be of value. We hypothesized that most people would have reasonably good insight about the mean but poor insight into the 80th percentile of the time that they work on different shifts.

The other way we assessed the value of the statistics was to test whether information available from the OR scheduling system can provide managerially useful updates of the mean and 80th percentile. If this were the situation, then providing the mean and 80th percentile just before staff scheduling would be of less value, and vice versa. The additional OR scheduling data that we studied were those known when the date of surgery is selected for each case, specifically the total numbers of cases scheduled and the total hours of cases scheduled. These data have previously been shown to be valid predictors of total workload of individual facilities for trends over time and for variation among workdays.9–11 Information on assignment of cases to specific ORs was not used because at the tertiary facilities for which our study is relevant, case assignments are often not made until the afternoon before surgery and even thereafter are often changed. For example, at one such facility, 25.7% ± 0.2% (standard error) of cases were moved or added on after the “final” schedule was posted the day before surgery,12 and those cases affect the hours worked by the people on-call.

METHODS

Statistical Analysis of Historical Data

The times at which cases ended were obtained from the study hospital’s OR information system for its main OR suites, which exclude non-OR anesthetics, labor and delivery, and gastrointestinal endoscopy procedure rooms. The dates studied were 39 4-wk periods from Monday December 5, 2005, to Friday November 28, 2008. Holidays and days with (planned) exceptionally low caseloads (e.g., between Christmas and New Year) were excluded. Cardiac and hepatic transplantation cases were excluded, because the studied hospital has separate anesthesiologists and OR nurses who care for these patients.

The dependent variables studied were the earliest times when a member of the anesthesia group’s call team could leave the hospital (Table 1). These were defined in terms of the earliest times when there were:

Table 1

  1. ≤2 ORs, when the 2nd call person can leave.
  2. ≤4 ORs, when the OR director who starts work at 11 am can leave.
  3. ≤6 ORs, when the 3rd call person working since 7 am leaves.

To obtain the dependent variables, the numbers of cases running during each 15 min increment were calculated including turnovers, limiting consideration to turnovers of 60 min or less. Turnovers were included because anesthesia providers are working during these intervals to transfer care of the prior patient and to set up for the next case. Considering each 15 min interval backward from midnight, we determined for each day the time after which no more ORs than specified by a threshold were in use. Figure 1 shows an example of the time in hours after 12 noon when the number of ORs running was equal to or less than the threshold of two ORs.
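The backward scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names, the `(room, start, end)` tuple layout with times in minutes after midnight, and the choice to ignore cases running past midnight are all assumptions made for the example.

```python
from collections import defaultdict

def busy_blocks(cases, max_turnover_min=60):
    """Merge each OR's cases into busy blocks.

    cases: list of (room, start_min, end_min), minutes after midnight
    (hypothetical layout). Gaps between consecutive cases in the same
    room of max_turnover_min or less count as working time (turnover).
    """
    by_room = defaultdict(list)
    for room, start, end in cases:
        by_room[room].append((start, end))
    blocks = []
    for times in by_room.values():
        times.sort()
        cur_start, cur_end = times[0]
        for start, end in times[1:]:
            if start - cur_end <= max_turnover_min:
                cur_end = max(cur_end, end)  # absorb the turnover
            else:
                blocks.append((cur_start, cur_end))
                cur_start, cur_end = start, end
        blocks.append((cur_start, cur_end))
    return blocks

def earliest_time_at_or_below(cases, threshold, step_min=15):
    """Earliest time (minutes after midnight) after which the number of
    ORs in use never exceeds threshold, scanning backward from midnight."""
    blocks = busy_blocks(cases)
    for slot_start in range(24 * 60 - step_min, -1, -step_min):
        slot_end = slot_start + step_min
        running = sum(1 for s, e in blocks if s < slot_end and e > slot_start)
        if running > threshold:
            return slot_end  # the threshold holds from the end of this slot
    return 0  # the threshold was never exceeded that day
```

Applying the function once per day and per threshold (2, 4, and 6 ORs) would give one value of each dependent variable per date.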

Figure 1.

Figure 2.

Two summary statistics of the three dependent variables were calculated (see Introduction, Table 1). The mean is proportional to total hours worked and is thus relevant to auctioning,8 paying for, selling, and/or trading shifts. However, because the sample mean was sensitive to the outliers present for some end points on some weekdays, we report the trimmed mean, with 10% trimmed from each end.13 To incorporate statistical error for the 80th percentile, we report the 95% upper confidence limit (bound), as calculated using the Clopper-Pearson method.14,15 We studied the 80th percentile for two reasons. First, the incremental cost of working later than this time corresponds to 4× the regular hourly rate,16–18 which is very high. Second, the 80th percentile was simple to explain intuitively, as it implies that the anesthesiologists assigned to the shift will work later than this time only once every 5 workdays (i.e., once a week). Note that with our 3 years of data, the 95% upper confidence limit for the 80th percentile was essentially the same as reporting the 90th percentile (Fig. 1, Table 1).
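As a rough illustration of the two summary statistics, the sketch below computes a 10% trimmed mean and a one-sided upper confidence bound for a percentile. The bound uses a distribution-free order-statistic argument based on exact binomial probabilities, which is in the spirit of the Clopper-Pearson method; whether it reproduces the authors' exact calculation is an assumption, and the function names are hypothetical.

```python
from math import comb

def trimmed_mean(values, trim=0.10):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    xs = sorted(values)
    k = int(len(xs) * trim)  # number of observations dropped from each end
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound_percentile(values, p=0.80, conf=0.95):
    """Smallest order statistic that is at least the true p-th quantile
    with probability >= conf. Returns None when the sample is too small
    for any order statistic to reach that confidence."""
    xs = sorted(values)
    n = len(xs)
    for k in range(1, n + 1):
        # The k-th smallest value exceeds the true quantile whenever at
        # most k - 1 observations fall below that quantile.
        if binom_cdf(k - 1, n, p) >= conf:
            return xs[k - 1]
    return None
```

With this approach, a 95% upper bound for the 80th percentile requires at least 14 observations, which is one reason several years of data per weekday are helpful.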

We progressively built linear regression models for each of the three dependent variables using day of the week, scheduled number of cases, scheduled hours of cases including turnovers, and linear time trend (Table 2) (Systat 12, Systat Software, San Jose, CA). Our model used only the independent variables on the working day before surgery, because by then almost all elective cases had been scheduled (Fig. 2) and at that time maximum information was available (see Discussion). To determine the predictive value of the independent variables, we calculated the deleted residuals (Table 3). These are the differences between predicted and observed values, with each prediction made using the linear model without the observed value.19 For example, each single value was predicted using the linear model with parameters estimated from the other 754 numbers.*
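The deleted residuals can be illustrated with a single predictor. The sketch below refits a simple least-squares line once per observation, each time leaving that observation out, and then predicts the omitted value. It is a simplified stand-in for the multivariable models fitted in Systat; the function names and the one-predictor restriction are assumptions for the example, and residuals are taken as observed minus predicted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, slope

def deleted_residuals(xs, ys):
    """Residual for each point from a model fitted without that point.

    xs and ys are Python lists of equal length.
    """
    out = []
    for i in range(len(xs)):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(x_rest, y_rest)
        out.append(ys[i] - (intercept + slope * xs[i]))
    return out

def mean_absolute_error(residuals):
    """Average magnitude of the prediction errors."""
    return sum(abs(r) for r in residuals) / len(residuals)
```

Comparing the mean absolute error of the deleted residuals between a model with and without the scheduled-workload predictors gives the kind of incremental reduction reported in Table 3.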

Table 2

Table 3

Among the 2265 combinations of 755 dates and three end points, there were 12 with externally studentized residuals19 larger than three. We repeated all calculations excluding these 12 outliers as one of our sensitivity analyses.

As another sensitivity analysis, we repeated the regression analyses using a different type of regression, namely regression trees with a least absolute values loss function (Systat 12). Regression trees are automatically created for predicting relationships including interactions. We used least absolute values as the loss function so that the mean absolute error could be compared directly with that of the preceding classical multivariable linear regression. To compute a regression tree for each of the three dependent variables, and to maximize the potential predictive ability of the model, we used all independent variables in Table 1 plus all possible sums and differences of those variables. Jackknife estimation was used to calculate the standard error of the reduction in the absolute error.

As a final sensitivity analysis, we repeated the linear regression analyses using a different dependent variable, specifically, the time after which there were 16 or fewer ORs in use. We studied 16 ORs because, each day, there are eight anesthesiologists who are assigned to work after 4 pm, if necessary, and each typically medically directs two ORs. We used this secondary end point of the earliest times when there were ≤16 ORs to explore our hypothesis that if the independent variables were poor predictors of the three primary dependent variables, the cause of poor prediction would be that they better predict when many anesthesiologists are working. If most of the add-on cases are performed by the three on-call anesthesiologists and most of the elective (scheduled) cases are performed by the other anesthesiologists, then the numbers and hours of elective cases would poorly predict the hours worked by the three anesthesiologists who are on-call.

Survey of Anesthesiologists

Although we obtained data on three dependent variables, for the survey of anesthesiologists we considered just the smallest (time when ≤2 ORs) and largest (time when ≤6 ORs), thereby reducing the number of questions (Table 4). The survey was performed at the study hospital during a 2-wk period. One of the anesthesia clinical directors asked 32 of 42 anesthesiologists to participate, with 10 missed due to their being on vacation, at a meeting, or working outside the main ORs. All 32 anesthesiologists invited completed the survey but with a few missing values. The two anesthesiologists who were aware of the survey design were excluded. The survey was performed by those two anesthesiologists at the study hospital as a quality improvement project to help them decide whether to post the historical data and to assist in their explaining to the department how the information can be used. The results in this article were implemented at the hospital the week after the survey was completed. The statistical analysis was comparison of observed proportions to ½ (StatXact-8, Cytel Software, Cambridge, MA).
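A comparison of an observed proportion to ½ can be sketched as an exact two-sided binomial test. The version below sums the probabilities of all outcomes no more likely than the one observed; this is one common definition of the two-sided P value and is assumed here, not taken from StatXact. The counts in the usage note are invented for illustration and are not the survey data.

```python
from math import comb

def exact_binomial_two_sided(successes, n, p0=0.5):
    """Exact two-sided P value for H0: true proportion equals p0.

    Sums the probabilities of every outcome that is no more likely
    under H0 than the observed count.
    """
    probs = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Small relative tolerance guards against floating-point ties.
    total = sum(pr for pr in probs if pr <= observed * (1 + 1e-12))
    return min(1.0, total)
```

For example, `exact_binomial_two_sided(9, 10)` tests whether 9 of 10 hypothetical estimates with error >30 min differs from one half.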

Table 4

RESULTS

The study hospital has call team members leaving when the numbers of ORs in use are ≤6 ORs, ≤4 ORs, and ≤2 ORs. The three dependent variables studied were the earliest times after which the numbers of ORs in use always remained at or below these thresholds. For the three thresholds and two end points of mean and 95% upper confidence limit of the 80th percentile, differences among days of the week were as large as 45 min (Table 1). Differences between end points for the same weekdays were as large as 245 min. Comparatively little additional knowledge was available in the late afternoon on the working day before surgery, as the mean absolute error was reduced by only 4.1–6.0 min (Table 3, Fig. 3). Information available more days before the day of surgery (e.g., 1 wk) would have had even less incremental predictive value.

Figure 3.

The mean absolute error of anesthesiologists’ estimates for 80th percentiles was 60 min, principally because of underestimation of the 80th percentiles (Table 5). As hypothesized, more than half (69%, P = 0.0003) of anesthesiologists’ estimates for 80th percentiles had error >30 min, whereas errors of this magnitude were less for the mean (44%, P = 0.0004).

Table 5

DISCUSSION

Months in advance, anesthesia providers at hospitals frequently choose and trade shifts and make other arrangements for their call days. Anesthesiologists’ intuitions were inaccurate as to the hours that they work (Table 5). Any anesthesia group that bills for its time or tracks its cases has all of the data required for the forecasts, just as they do for specialty-specific staffing.1,2,16 The numbers of simultaneous anesthetics (i.e., ORs in use) at each time can be calculated from the anesthesia start and end times. The results show essentially negligible value to the use of data within 1 day of surgery (e.g., number or hours of cases actually scheduled) other than historical data for the particular day of the week from an anesthesia billing,1 OR information, or anesthesia information management system. The results match previous findings that if months ahead the durations of OR workdays are forecasted appropriately for each specialty, there is financially negligible incremental value for reducing anesthesia group costs by using additional information to even perfectly predict case durations.16,20–23

Although the incremental value of additional data was strongly statistically significant (i.e., P < 10⁻⁶) (Table 2), the absolute reductions in forecasting errors were small relative to the choice of the statistic to report (Table 3, Fig. 3). Furthermore, these absolute reductions likely overestimate the actual value of using such information in decisions. The issue is one of trust in technology. When a prediction for how late people work is given before staff schedules are made, by definition the estimate is retrospective and provides little information about what the individual will experience on one specific future evening. Providing the estimate is similar to a web site that gives the average January low and high temperature in Orlando, FL. For a traveler contemplating a vacation there in January, this information may be useful. In contrast, making a prediction on January 16 for the low temperature the next day runs the risk of people perceiving psychologically that “weather forecasts are unreliable.” This is because of a human bias to convert probabilities of events (e.g., 40% chance of temperature <70°F) into binary perceptions (e.g., “the actual temperature was 65°F, so the meteorologist was wrong”).

Our independent variables captured information about the variables relevant to the working day before surgery. By studying data at this point in time, we likely overestimated the potential value of the information practically useful for regression analysis, because some decisions may need to be made sooner. For example, when staffing (OR allocation) and case scheduling decisions are made based on maximizing the efficiency of use of OR time, most services fill their allocated time no sooner than 3 days before the day of surgery.16,24 At that time, 81% of cases have been scheduled (Fig. 2). By studying data from the day before surgery, we likely underestimated the predictive value relative to using information from the day of surgery. However, by the day of surgery, an anesthesiologist’s desire to work or not to work late for pay is generally irrelevant as to whether the anesthesiologist works late (i.e., they are on-call and have to stay until the cases are done). We showed previously that, under such circumstances, compensation is no longer salient.25 Thus, we cannot identify an argument for why providing updated information on the day of surgery related to the predicted end of shift times would be useful.

Although different anesthesia directors running the control desk make different decisions that affect how late ORs are in use, we did not include individuals as independent variables.26 Who will be scheduled to manage the desk would not be known when an anesthesiologist is choosing his or her schedule. Furthermore, we could not envision informing a person on call that tomorrow she will likely work 30 min later because the person at the control desk is Dr. Smith not Dr. Lawrence. We think that behavioral differences of individuals running a control desk show the value of managerial decision-support systems making recommendations to reduce such variability.23,27 We think that they also show why anesthesia providers will need and/or want to continue to be paid hourly when working late, as compared with occasionally working long scheduled shifts.25

Our data analysis applies only to hospitals with add-on cases and cases being moved among ORs. If every case were known at least 1 wk in advance, there were no add-on cases, each surgeon had a list of cases for the day, and the cases were not moved among ORs, then results would be different. The survey was performed by one anesthesia group immediately before implementation. Although we have no reason to suspect that results would differ among groups, we have no supportive data.

In conclusion, a department’s survey of its anesthesiologists identified a 1 h mean absolute error in estimates for the latest (80th percentile) time that they work when on-call, with more than half (69%) of estimates incorrect by >30 min. Because the two end points of mean and 95% upper confidence limit of the 80th percentile differed substantively, different decisions should rely on different statistics. Because the two end points differed among days of the week, statistics should be reported by weekday. This information can be calculated from anesthesia billing data (or equivalent sources1) and given to anesthesia providers months ahead, before staff scheduling. Additional knowledge available on the working day before surgery (e.g., numbers of scheduled cases) had a statistically significant benefit but reduced the mean absolute error by only 4.1–6.0 min. The results are especially useful when interpreted along with recent behavioral studies of anesthesiologists’ managerial decisions late in afternoons.25,26

REFERENCES

1. Dexter F, Epstein RH. Optimizing second shift OR staffing. AORN J 2003;77:825–30
2. Dexter F, Epstein RH. Calculating institutional support that benefits both the anesthesia group and hospital. Anesth Analg 2008;106:544–53
3. Ernst EA, Lasdon LS, Ostrander LE, Divell SS. Anesthesiologist scheduling using a set partitioning algorithm. Comput Biomed Res 1973;6:561–9
4. Ernst EA, Matlak EW. On-line computer scheduling of anesthesiologists. Anesth Analg 1974;53:854–8
5. Holley HS, Heller F. Computerized anesthesia personnel system. Int J Clin Monit Comput 1988;5:103–10
6. Dexter F, O’Neill L. Weekend operating room on-call staffing requirements. AORN J 2001;74:666–71
7. Miller RD, Cohen NH. The impact of productivity-based incentives on faculty-based compensation. Anesth Analg 2005;101:195–9
8. De Grano ML, Medeiros DJ, Eitel D. Accommodating individual preferences in nurse scheduling via auctions and optimization. Health Care Manag Sci. In press. doi: 10.1007/s10729-008-9087-2
9. Dexter F, Macario A, Lubarsky DA, Burns DD. Statistical method to evaluate management strategies to decrease variability in operating room utilization. Application of linear statistical modeling and Monte-Carlo simulation to operating room management. Anesthesiology 1999;91:262–74
10. Moore IC, Strum DP, Vargas LG, Thomson DJ. Observations on surgical demand time series: detection and resolution of holiday variance. Anesthesiology 2008;109:408–16
11. Masursky D, Dexter F, O’Leary CE, Applegeet C, Nussmeier NA. Long-term forecasting of anesthesia workload in operating rooms from changes in a hospital’s local population can be inaccurate. Anesth Analg 2008;106:1223–31
12. Wachtel RE, Dexter F. Influence of the operating room schedule on tardiness from scheduled start times. Anesth Analg 2009;108:1889–901
13. Staudte RG, Sheather SJ. Robust estimation and testing. New York: Wiley, 1990:104–6
14. Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 1934;26:404–13
15. Newcombe RG. Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med 1998;17:857–72
16. McIntosh C, Dexter F, Epstein RH. Impact of service-specific staffing, case scheduling, turnovers, and first-case starts on anesthesia group and operating room productivity: tutorial using data from an Australian hospital. Anesth Analg 2006;103:1499–516
17. Strum DP, Vargas LG, May JH, Bashein G. Surgical suite utilization and capacity planning: a minimal cost analysis model. J Med Syst 1997;21:209–22
18. Strum DP, Vargas LG, May JH. Surgical subspecialty block utilization and capacity planning. A minimal cost analysis model. Anesthesiology 1999;90:1176–85
19. Neter J, Wasserman W, Kutner MH. Applied linear statistical models. 3rd ed. Homewood, IL: Irwin, 1990:398–400, 450–1
20. Dexter F, Traub RD. How to schedule elective surgical cases into specific operating rooms to maximize the efficiency of use of operating room time. Anesth Analg 2002;94:933–42
21. Dexter F, Epstein RH, Traub RD, Xiao Y. Making management decisions on the day of surgery based on operating room efficiency and patient waiting times. Anesthesiology 2004;101:1444–53
22. Dexter F, Macario A, Ledolter J. Identification of systematic under-estimation (bias) of case durations during case scheduling would not markedly reduce over-utilized operating room time. J Clin Anesth 2007;19:198–203
23. Dexter F, Epstein RH, Lee JD, Ledolter J. Automatic updating of times remaining in surgical cases using Bayesian analysis of historical case duration data and instant messaging updates from anesthesia providers. Anesth Analg 2009;108:929–40
24. Dexter F, Traub RD, Macario A. How to release allocated operating room time to increase efficiency. Predicting which surgical service will have the most under-utilized operating room time. Anesth Analg 2003;96:507–12
25. Masursky D, Dexter F, Garver MP, Nussmeier NA. Incentive payments to academic anesthesiologists for late afternoon work did not influence turnover times. Anesth Analg 2009;1622–6
26. Stepaniak PS, Mannaerts GH, de Quelerij M, de Vries G. The effect of the operating room coordinator’s risk appreciation on operating room efficiency. Anesth Analg 2009;108:1249–56
27. Dexter F, Willemsen-Dunlap A, Lee JD. Operating room managerial decision-making on the day of surgery with and without computer recommendations and status displays. Anesth Analg 2007;105:419–29

*For readers who are used to using stepwise and all variables regression, the deleted residuals are those used to calculate the prediction sum of squares.

†Using regression trees, the mean ± standard error of the reductions in the absolute error were 0.00 ± 0.00, 0.00 ± 0.00, and 0.08 ± 0.01 min for ≤2 ORs, ≤4 ORs, and ≤6 ORs, respectively.

© 2009 International Anesthesia Research Society