Truth in Scheduling: Is It Possible to Accurately Predict How Long a Surgical Case Will Last?

Macario, Alex, MD, MBA

doi: 10.1213/ane.0b013e318196a617
Editorial: Editorials

From the Department of Anesthesia, Stanford University School of Medicine, Stanford, California.

Accepted for publication November 9, 2008.

Supported by Department of Anesthesia at Stanford.

Reprints will not be available from the author.

Address correspondence to Alex Macario, MD, MBA, Department of Anesthesia H3580, Stanford University School of Medicine, Stanford, CA 94305-5640. Address e-mail to amaca@stanford.edu.

It is 3 pm. The anesthesia phone in operating room (OR) # 7 rings. “How much longer is left in your case?” asks the clerk at the front desk. This scenario repeats itself over and over in every OR suite in every hospital in the United States, and probably around the world. Why?

We attempt to estimate the time remaining in a case for a number of reasons. Perhaps first among these is a desire to match staff to workload, such that late-running rooms have on-call nurses and anesthesiologists assigned to them. The OR manager may also use the time-remaining estimates to help decide whether to move cases from one OR to another, when to make the “to-follow” patient ready, and when to have needed supplies and equipment available. Until now, there has been no better way to estimate how much time was left in a case than to ask someone in the room to make their best subjective guess.

This has changed as, in this issue of Anesthesia & Analgesia, Dexter et al.1 continue a long line of intellectual contributions to the field of OR management by now providing the statistical methods necessary to analyze historical case durations with the objective of accurately predicting the expected time remaining for any particular case.

Editorials in scientific journals usually take one of several forms: 1) providing new and interesting data or analysis (of one’s own material, for example) to supplement the published article; 2) explaining methodology that is scientifically controversial or particularly difficult to understand; 3) serving as a kind of op-ed piece, a forum in which to make a plea to the professional community; 4) commenting on and interpreting the methods and findings of the article (perhaps the most obvious and common motivation for an editorial); and 5) summarizing and illustrating for the reader what is known about the general subject area of the highlighted article. These last two items are the goals of this editorial.

It would be nice to have no uncertainty in case duration prediction, but it exists. One problem is that, many times when we ask, “How long will the case last?” we expect a single numerical answer, such as, “There are 68 min left in the case.” This provides an “illusion of certainty” that feeds a human emotional need for certainty when none exists.2

In fact, by analyzing historical data of cases with the same surgeon and procedure, one can assess the uncertainty surrounding the estimate. In other words, case durations have a probability distribution, such that the expected case duration is not a point value, but rather a probability estimate. Therefore, a more informative answer to the question, “How much time is left?” might be, for example, “There is a 67% probability that the case will be finished within 90 min.” This is similar to the approach used in reporting weather forecasts.
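To make the idea of a probability statement concrete, here is a minimal sketch in Python. It simply counts the fraction of comparable historical cases that finished within a given time. The function name and the durations are hypothetical and chosen only for illustration; this is not the statistical method of Dexter et al.

```python
from typing import Sequence


def prob_finished_within(durations_min: Sequence[float], limit_min: float) -> float:
    """Empirical probability that a case lasts at most `limit_min` minutes."""
    if not durations_min:
        raise ValueError("need at least one historical duration")
    finished = sum(1 for d in durations_min if d <= limit_min)
    return finished / len(durations_min)


# Hypothetical durations (minutes) for one surgeon and one procedure.
history = [62, 70, 75, 81, 88, 90, 95, 110, 140]

p = prob_finished_within(history, 90)
print(f"{p:.0%} of comparable past cases finished within 90 min")
```

With only nine historical cases, such a fraction is itself quite uncertain, which is part of the problem discussed later in this editorial.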

The complexity, of course, is that case times are not distributed in a bell curve (Figs. 1 and 2). Often the distributions are right skewed, i.e., unusually long cases (outliers) inflate the average estimated case duration.
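The effect of a single long case on the average can be shown with a small, purely illustrative calculation. The durations below are invented for the example and are not taken from the figures.

```python
import statistics

# Hypothetical surgical times (hours); the last case is an unusually long outlier.
durations_h = [1.5, 1.6, 1.8, 1.9, 2.0, 2.1, 2.2, 2.4, 6.5]

mean_h = statistics.mean(durations_h)
median_h = statistics.median(durations_h)
mean_without_outlier_h = statistics.mean(durations_h[:-1])

print(f"mean with outlier:    {mean_h:.2f} h")
print(f"median:               {median_h:.2f} h")
print(f"mean without outlier: {mean_without_outlier_h:.2f} h")
```

In this made-up sample, one long case pulls the mean well above the median, so the mean overstates how long a typical case lasts.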

Figure 1.

Figure 2.

PREDICTING DURATION OF A CASE THAT IS ALREADY UNDERWAY

I expect that in the future all surgical suites will be paperless, and have sophisticated data analysis capabilities like those described by Dexter et al.1 Programming of the OR information system will automatically extract data on who the surgeon is, what the scheduled procedure is, and the case’s actual start time as obtained from the anesthesia information management system server. Ongoing “Bayesian” readjustments of the time remaining will then be derived from how long the case has already been underway (e.g., a case that is 2.5 h old but was scheduled for 2 h).

Bayesian analysis, a term that clinicians increasingly encounter, refers to methods in which previous observations and new information are combined to estimate the likelihood of a future event. The data crunching described by Dexter et al. is supplemented, if necessary, by electronically querying OR personnel for time-remaining estimates. These queries are particularly valuable the later a case runs, and when there are very few, if any, historical cases to use for predictions.

Using a computerized scheduling system with Bayesian analysis to predict case durations will transform OR scheduling. The OR manager will no longer have to ask how much time is left in a case. Instead, there will be real-time decision support that makes recommendations directly to the OR manager, for example: “Move the last case in OR 3 into OR 8,” or “Have the call team relieve the team in OR 7.”

For situations in which the ongoing case is rarely if ever performed, Dexter et al.’s statistical methods use the scheduled OR time as the principal basis for predictions. If, however, substantial historical data exist for the case, then the scheduled OR time has a negligible effect on the predicted time compared with the historical data.

It turns out that there is no rule of thumb that can be used to approximate time left for cases of a particular length, so the mathematics described in the Appendices has to be coded in the software of a facility’s OR information system or in the anesthesia information management system (as was done by Dexter et al.). Many hospitals currently do not have such an electronic database, but at least in the United States a growing number of academic medical centers are installing information systems in which software could reside.3

The article by Dexter et al. also shows that, as each case progresses, the median time remaining for repetitions of a particular scheduled operation stays roughly constant even if the case goes longer and longer. Many of us would have expected something else: that, as a case goes later and later over its scheduled finish time, the time remaining would decrease to zero. However, the median time remaining for ongoing cases remains relatively constant. This is explained, in part, because progressively more cases have already finished. In addition, a case that goes extraordinarily long may indicate that the procedure being performed is not actually the procedure that was originally booked. Alternatively, there may have been an intraoperative complication or some other issue causing the delay. Many of these events occur randomly.
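One way to build intuition for this finding is to condition on the cases that are still running. The sketch below is a simple empirical calculation, not the Bayesian method of Dexter et al.; the function name and the durations are hypothetical. For each elapsed time, it keeps only the historical cases that lasted longer and takes the median of what was left.

```python
import statistics
from typing import Optional, Sequence


def median_time_remaining(durations_min: Sequence[float],
                          elapsed_min: float) -> Optional[float]:
    """Median time left among historical cases that ran longer than `elapsed_min`.

    Returns None when no historical case lasted that long.
    """
    remaining = [d - elapsed_min for d in durations_min if d > elapsed_min]
    if not remaining:
        return None
    return statistics.median(remaining)


# Hypothetical durations (minutes) for repetitions of one scheduled operation.
history = [90, 100, 110, 120, 125, 130, 150, 165, 200, 260]

for elapsed in (120, 150, 180):
    print(elapsed, median_time_remaining(history, elapsed))
```

In this invented sample the median time remaining does not shrink toward zero as the case runs later, because the shorter cases have dropped out of the comparison group. When no historical case has run as long as the current one, the function returns None, which is the situation in which querying OR personnel becomes most useful.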

PREDICTING DURATION OF A CASE THAT HAS NOT STARTED

The second goal of this editorial is to summarize what is known on the topic of predicting case duration, not during an ongoing case, but at the time the case is first scheduled. This is a broader question than just asking how much longer a particular case still has to go. The take-home message is that averaging historical case duration data does not increase prediction accuracy for a newly scheduled case as much as one would think or hope. This phenomenon has been made abundantly clear by reports from many facilities that have purchased OR information systems to address chronic complaints about inaccurate case scheduling, only to find that the OR schedule is no better after implementation of such a system.

Why does computerized tracking of case times, and basing estimates of new cases on historical data, not improve the OR schedule as much as might be expected? There are many reasons for this frustrating reality. The usual one proffered is that surgeons purposely bias their estimates. For example, some surgeons chronically underestimate case duration to “fit” a list of cases into a finite amount of OR time. In contrast, other surgeons purposely overestimate case durations to keep control of, and access to, their allocated OR time so that it is not given away to another surgeon. Well-functioning OR suites should have such systematic biases in case duration estimates of <5 min per 8 h of OR time.4

However, a major barrier to accurate scheduling is the combination of a great variety of procedures and the large number of surgeons on most hospital staffs. Approximately half of the cases scheduled in your ORs tomorrow will have only five or fewer previous cases of the same procedure type and same surgeon during the preceding year. Having so few cases in the data bank makes it difficult to get a better estimate than simply asking the surgeon, for example.

How can there only be a few historical cases on which to base the estimate of a new case? Although the answer may not be intuitive, one way to illustrate the concept is to ask your OR manager how many preference cards (which specify the type of surgical procedure and the specific surgeon) exist at your hospital. A common answer is about 4000 preference cards for a mid-sized surgical suite. Such a suite is likely to perform in the neighborhood of 12,000 cases/yr, or an average of just three cases per preference card on which to base the estimate of the new case!

Another way to estimate the number of repeat cases there are for a given surgeon at a particular hospital is to analyze OR information system data. For each case performed in a 1 yr period, we retrospectively searched and counted the number of previous cases that were of the same type of procedure performed by the same surgeon at an inpatient surgical suite and an ambulatory center.5 Since the surgeon and the surgical procedure are the two most important determinants of surgical time, we grouped cases together if they were of the same procedure type performed by the same surgeon. “Procedure” was defined by Current Procedural Terminology (CPT) code(s).6 If a procedure had more than one CPT code, that set of codes was used to characterize a unique procedure. For example, if phacoemulsification and aspiration of cataract and insertion of intraocular lens are performed as part of one case, then the combination of these two procedures would count as one “procedure” when estimating duration.

We then combined the procedure with a surgeon. For example, all unilateral total knee replacement cases done by surgeon Jones were grouped together. Total knee replacement surgeries done by a different surgeon Smith were grouped separately. A third grouping, for example, was bilateral total knee replacement performed by surgeon Jones.

Yet another grouping was laparoscopic cholecystectomies performed by surgeon Adams. Keep in mind that any laparoscopic cholecystectomies that also included a liver biopsy, for example, were grouped separately as that combination of two procedures defines a unique surgical case.
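The grouping just described can be sketched in a few lines of Python. This is an illustration only, under the assumption that cases are listed in chronological order. The function name is hypothetical, and the procedure labels (`TKR`, `LAP_CHOLE`, `LIVER_BX`) are stand-ins for CPT codes rather than real codes. Each case is keyed by the surgeon together with the full set of procedure codes, so a combination of two procedures forms its own group.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# A case is (surgeon, set of procedure codes); the labels below stand in for CPT codes.
Case = Tuple[str, Iterable[str]]


def count_previous_cases(cases: Iterable[Case]) -> List[int]:
    """For each case, in chronological order, count earlier cases with the
    same surgeon and exactly the same set of procedure codes."""
    seen: Counter = Counter()
    previous = []
    for surgeon, codes in cases:
        key = (surgeon, frozenset(codes))  # the code set defines one "procedure"
        previous.append(seen[key])
        seen[key] += 1
    return previous


cases = [
    ("Jones", {"TKR"}),
    ("Smith", {"TKR"}),                    # same procedure, different surgeon
    ("Jones", {"TKR"}),
    ("Adams", {"LAP_CHOLE"}),
    ("Adams", {"LAP_CHOLE", "LIVER_BX"}),  # combination is a separate group
    ("Adams", {"LAP_CHOLE"}),
]

previous = count_previous_cases(cases)
share_without_history = previous.count(0) / len(previous)
print(previous, f"{share_without_history:.0%} of cases had no prior match")
```

Using a `frozenset` for the key means that the order in which the codes are recorded does not matter, while any added or missing code yields a different group.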

Prediction of the duration of newly scheduled cases in this inpatient surgery suite was thwarted for 37% of cases (in this example) because there were no cases at all in the previous year of the same procedure type and surgeon (Table 1).

Table 1

For the inpatient surgery suite, there were 11,579 cases of 5156 different procedures performed by 225 surgeons with surgical times = 2.5 ± 1.2 h. There were 7217 combinations of procedure and surgeon performed during the year.

For the ambulatory surgery center, there were 4842 cases of 1608 different procedures performed by 160 surgeons with surgical times = 1.1 ± 0.5 h. There were 2245 combinations of procedure and surgeon performed during the year. Twenty-eight percent of these cases in the ambulatory surgery center did not have any cases in the previous year of the same procedure type and surgeon.

Since surgeons typically schedule more than one case into an OR, with a series of consecutive cases, the likelihood that at least one of the cases in that OR will be a surgical procedure that the surgeon has not performed recently (such that no historical data are available) is even higher.7 One late-running case out of the day’s list in that OR can adversely affect the entire day’s schedule.
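A rough calculation shows how quickly this likelihood grows with the length of the list. The sketch below assumes, purely for illustration, that each case in the list independently has the 37% chance of lacking historical data reported above for the inpatient suite. Independence is an assumption that real lists may well violate, since a surgeon's cases on a given day are often related.

```python
def prob_any_case_without_history(p_single: float, n_cases: int) -> float:
    """Probability that at least one of `n_cases` has no historical data,
    assuming each case independently lacks data with probability `p_single`."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_cases < 0:
        raise ValueError("n_cases must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_cases


# 0.37 is the inpatient-suite figure from the text; independence is assumed.
for n in (1, 2, 3, 4):
    print(n, round(prob_any_case_without_history(0.37, n), 2))
```

Under this simplifying assumption, a list of three cases would include at least one case without historical data about three times out of four.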

This issue of uncommonly performed procedures is also apparent nationally.8 Twenty percent of outpatient surgery cases performed in the United States are of a procedure that is performed 1000 times or less per year. Thirty-six percent of outpatient surgery cases may be of a procedure that is performed less than once per facility per year.

There are several potential ways to handle the dilemma of procedures without recent similar cases. The number of historical cases on which to base predictions could be increased by using data from more years, but at the risk that older surgical times may be confounded (e.g., by the learning curve of the surgeon or introduction of new surgical techniques). Lumping together of similar CPT code(s) to increase the amount of historical data is impractical, because procedures with CPT code(s) that differ only in the final (fifth) digit have different case times. For example, vitrectomy (67108) takes more than an hour longer than scleral buckle (67107).

Pooling case duration data from several hospitals could increase the size of the database on which to base predictions. A study of four academic medical centers providing data for 200,401 cases found that, when a procedure was being performed for the first time at a facility, that same procedure had been performed previously at least once at one or more of the other three facilities only 13%–25% of the time.9

When no historical time data at all are available for a new case, using the mean of the durations of like cases (same scheduled procedure) performed by other surgeons is as accurate (unbiased and precise) a predictor as other, more sophisticated methods to analyze the data.10 Practically though, the simplest approach is to use the booking surgeon’s estimate (with adjustment for typical bias).11

Within a hospital, multiple different procedure types and cases are often counted as one at the time the case is called into the OR scheduling office. This is because the required supplies, instruments and surgical tray may be similar, even though the operation is different. Some hospitals use “mnemonics” to group such cases. Because of the variety of surgical procedures grouped together under one such mnemonic, case duration prediction based on the booking mnemonic will be intrinsically flawed (Table 2).

Table 2

Furthermore, comparing surgical times across facilities to benchmark may be quite misleading if the procedure mnemonic groupings at one hospital do not include the same procedure types as the comparator hospital!

Another complicating factor mentioned earlier that interferes with accurate case prediction is the markedly non-Gaussian (non-bell-shaped) statistical distribution of case times. This prevents simply using the mean of historical case durations. Figure 2 is one way to illustrate this quandary, as it shows durations for Whipple (pancreatoduodenectomy) procedures.

Why not solve this scheduling problem by automatically grouping together similar procedures with common average durations? The reason is that it is not the average values that matter for most management decisions. In fact, there is no such thing as “the case duration.” The reality of case duration prediction is more complicated. We have to accept that all we can obtain from the data is a probability distribution. It is imperative that requests for proposals for purchasing anesthesia information management systems not be based on single fields for “the scheduled duration.”

Just as there is (hard to predict) variability in patients’ responses to drugs, there is variability in case durations. For some decisions, the OR manager needs to consider the shortest time possible that a case will go. For other decisions, the OR manager needs to consider the longest time possible a case will go.12
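As a minimal sketch of how a distribution can serve both kinds of decisions, the Python example below reads a lower and an upper planning bound off the historical durations instead of a single value. The choice of the 10th and 90th percentiles, the function name, and the durations are assumptions made for illustration; the appropriate percentile depends on the decision at hand.

```python
import statistics
from typing import Sequence, Tuple


def planning_bounds(durations_min: Sequence[float]) -> Tuple[float, float, float]:
    """Return (10th percentile, median, 90th percentile) of historical durations.

    The 10th/90th cut points are an illustrative choice, not a recommendation.
    """
    # n=10 yields nine cut points; "inclusive" keeps them within the observed range.
    deciles = statistics.quantiles(durations_min, n=10, method="inclusive")
    return deciles[0], statistics.median(durations_min), deciles[-1]


# Hypothetical durations (minutes) for one surgeon and one procedure.
history = [150, 170, 185, 200, 210, 225, 240, 270, 330, 420]

low, mid, high = planning_bounds(history)
print(f"earliest plausible finish: {low:.0f} min")
print(f"typical finish:            {mid:.0f} min")
print(f"latest plausible finish:   {high:.0f} min")
```

The lower bound would inform a decision such as when the to-follow patient must be ready, whereas the upper bound would inform a decision such as whether relief staff will be needed.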

I expect that next-generation OR management decision systems will better manage the uncertainty associated with case duration prediction. For example, the OR suite of the future may not have patients show up at the same, constant amount of time in advance of planned surgery for all cases. Rather, the time a patient is instructed to arrive at the hospital for their surgery will vary based on the characteristics of the case(s) ahead of them. For example, if patient B is scheduled to follow case A with known duration (and little variability), then patient B can be told to arrive less far in advance of the planned start time. On the other hand, if patient B is scheduled to follow a case or cases with highly uncertain durations (e.g., a Whipple), this patient’s instructions might be, “Come in early,” knowing that an “open and close” procedure is possible.

The work that occurs in surgical suites is complex, and those who attempt to manage that work environment must become more adept at managing complexity. Dexter et al. should be commended on their work to elucidate the role of uncertainty in surgical case durations, because OR personnel need to start accepting the truth that case durations are stochastic (from a Greek term meaning “aim” or “guess”; in a stochastic process, the next state is determined both by the process’s predictable actions and by a random element). This is in sharp contrast to our preference to think deterministically, believing that, were there enough information, we could foresee the future and thereby estimate case duration to the nearest minute. We must harness the power that information technology gives us to improve upon our own subjective (and often biased) estimates of how our days are likely to play out. Time, in this arena, is more than money; it is a precondition of good patient care.

REFERENCES

1. Dexter F, Epstein RH, Lee JD, Ledolter J. Automatic updating of times remaining in surgical cases using Bayesian analysis of historical case duration data and instant messaging updates from anesthesia providers. Anesth Analg 2009;108:929–40
2. Gigerenzer G, Gaissmaier W, Kurz-Milcke E, Schwartz L, Woloshin S. Helping doctors and patients make sense of health statistics. Psych Science Public Interest 2008;8:53–96
3. Egger Halbeis CB, Epstein RH, Macario A, Pearl RG, Grunwald Z. Adoption of anesthesia information management systems by academic departments in the United States. Anesth Analg 2008;107:1323–9
4. Macario A. Are your hospital operating rooms “efficient?” A scoring system with eight performance indicators. Anesthesiology 2006;105:237–40
5. Zhou J, Dexter F, Macario A, Lubarsky DA. Relying solely on historical surgical times to estimate accurately future surgical times is unlikely to reduce the average length of time cases finish late. J Clin Anesth 1999;11:601–5
6. Opit LJ, Collins REC, Campbell G. Use of operating theatres: the effects of case-mix and training in general surgery. Ann R Coll Surg Engl 1991;73:389–93
7. Dexter F, Traub RD, Qian F. Comparison of statistical methods to predict the time to complete a series of surgical cases. J Clin Monit Comput 1999;15:45–51
8. Dexter F, Macario A. What is the relative frequency of uncommon ambulatory surgery procedures in the United States with an anesthesia provider? Anesth Analg 2000;90:1343–7
9. Dexter F, Traub RD, Fleisher LA, Rock P. What sample sizes are required for pooling surgical case durations among facilities to decrease the incidence of procedures with little historical data? Anesthesiology 2002;96:1230–6
10. Macario A, Dexter F. Estimating the duration of a case when the surgeon has not recently performed the procedure at the surgical suite. Anesth Analg 1999;89:1241–5
11. Dexter F, Ledolter J. Bayesian prediction bounds and comparisons of operating room times even for procedures with few or no historical data. Anesthesiology 2005;103:1259–67
12. Dexter F, Epstein RH, Traub RD, Xiao Y. Making management decisions on the day of surgery based on operating room efficiency and patient waiting times. Anesthesiology 2004;101:1444–53
© 2009 International Anesthesia Research Society