^{1}Time estimates are important for building simulation models of surgical environments^{2} and for decision analysis based on such simulations.

*individual* and *aggregate* models in this article for surgeries that have two Current Procedural Terminology (CPT) codes associated with them. Individual models fit a mathematical distribution to surgeries segmented by dual procedure code, surgeon, or type of anesthesia. Modeling in this manner permitted each dual procedure to have its own unique probability distribution and is most useful if there is reason to believe that different dual procedures should have different models. When fitting individual models to subsets of the data, it is important to reduce coding permutations (whenever possible) to avoid unnecessary segmentation and thus excessively small samples with reduced estimation precision.

^{3,4}Times for these surgeries were well modeled by the lognormal distribution. We wanted to test the hypothesis that the lognormal distribution is also the best model for estimating times for two procedures performed in the same surgical session.

^{5}was superior to the normal distribution for modeling surgical and total times. Finally, we also investigated several factors (CPT1, CPT2, surgical subspecialty, type of anesthesia, age, and emergency status) associated with variability in time estimates for dual-procedure surgeries.

#### Methods

^{6}Variables in the initial data set included total procedure time (TT), defined as the time from entry into the operating suite until emergence from anesthesia; surgical procedure time (ST), defined as the time from incision to closure of the surgical wound; age; American Society of Anesthesiologists physical status classification; type of anesthesia; CPT codes; emergency status (Emerg); and surgical specialty category (CAT), as defined using main headers from the CPT classification.

^{7}

##### Detailed Description of the Data

*different* dual-procedure surgeries. TT for the former was 69 ± 48 min, n = 138 surgeries, and TT for the latter was 96 ± 63 min, n = 21 surgeries (mean ± SD). CPT 52000 was cystoscopy and CPT 53899 was urological surgery. Basic statistics were summarized for dual-procedure surgeries (designated CPT1–2) for TT and ST.

##### Individual Probability Models

^{4} indicated that to obtain a better individual model fit, data should be subdivided into (more) homogeneous subgroups by CPT and type of anesthesia (general, local, regional, monitored) prior to being fit to a distribution. To determine the best model for estimating procedure times, samples were segmented by dual-procedure surgery (CPT1–2) and the normal and lognormal models fit to each. Permutations of the component codes were assumed (initially) to represent different surgeries, and each was fit separately. Samples were not segmented initially by type of anesthesia to avoid reducing excessively the sparse number of surgeries to be fit. Other issues related to general lognormal modeling of surgical times have been discussed elsewhere.^{3,8,9}
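As an illustration of what fitting a lognormal model to one CPT1–2 sample involves, the following is a minimal sketch and not the authors' procedure. It assumes a two-parameter lognormal estimated from the mean and sample SD of the log times; the function names and the sample times are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(times_min):
    """Return (mu, sigma): mean and sample SD of the natural logs of the times."""
    logs = [math.log(t) for t in times_min]
    return mean(logs), stdev(logs)


def lognormal_quantile(mu, sigma, p):
    """Time (min) below which a fraction p of surgeries is expected to fall."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(p))


# Hypothetical total times (min) for one CPT1-2 combination; not from the article.
times = [30.0, 60.0, 120.0]
mu, sigma = fit_lognormal(times)
median_estimate = math.exp(mu)  # the lognormal median is exp(mu)
p90_estimate = lognormal_quantile(mu, sigma, 0.90)
```

Under this model the median time estimate is the exponential of the mean log time, so it is less sensitive to a few very long surgeries than the arithmetic mean.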

^{10,11} to determine whether a data sample was consistent with a normal distribution. The SW test for normality was applied to the logs of the data values, thereby creating a test for lognormality. When using the SW test, one assumes that the null hypothesis is that the model describes the data. Hence, a large *P* value indicates that it is not reasonable to reject the null hypothesis, *i.e.*, the data fit the model well. We tested TT and ST for all dual-procedure surgeries with case frequencies of five or more.

*P* value of the SW tests to compare goodness-of-fit tests for the normal and lognormal models. To detect the influence of sample size on the SW tests, we divided the sample arbitrarily into small (n ≤ 30)- and medium (n > 30)-sized samples. Because commonly used levels of significance for hypothesis testing are between 1% and 10% for a test like SW, a frequently used rule of thumb is to regard a *P* value of at least 0.10 as leading to retention of the null hypothesis (the model fits well) and a *P* value less than 0.01 as always leading to its rejection (the model fits poorly). We interpreted values between 0.01 and 0.10 as a mediocre fit for the model.
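The rule of thumb above can be written as a small classifier. This is only a sketch of the stated thresholds; the function names and the labels "good", "mediocre", and "poor" are illustrative choices, not terms defined by the study.

```python
def classify_fit(p_value):
    """Map a Shapiro-Wilk P value to the three fit categories described in the text."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P value must lie in [0, 1]")
    if p_value >= 0.10:
        return "good"      # null hypothesis retained: the model fits well
    if p_value < 0.01:
        return "poor"      # null hypothesis rejected: the model fits poorly
    return "mediocre"      # 0.01 <= P < 0.10


def sample_size_group(n):
    """Split samples into the two size groups used in the text."""
    return "small" if n <= 30 else "medium"
```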

*overall* performance of the individual lognormal and normal models using qualitative (tabular) comparisons. To determine if one distributional model performed better on particular CPT1–2 combinations, we compared the performance of the two models on the same data sets. We used the more graphically oriented normal probability plots to examine those CPT1–2 combinations for which the formal SW tests indicated that both models were inadequate. We also compared the goodness-of-fit of the lognormal and the normal models using the Sign and Friedman tests for TT and ST.
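A paired sign test of this kind can be sketched as follows. The sketch assumes that the paired quantities are the SW *P* values of the two models on the same CPT1–2 sample and that ties are dropped; the text does not state either detail, and the function name is hypothetical.

```python
from math import comb


def sign_test(p_lognormal, p_normal):
    """Exact two-sided sign test on paired values; tied pairs are dropped.

    Returns (wins, losses, p), where wins counts the pairs in which the
    lognormal value exceeds the normal value.
    """
    wins = sum(a > b for a, b in zip(p_lognormal, p_normal))
    losses = sum(a < b for a, b in zip(p_lognormal, p_normal))
    n = wins + losses
    if n == 0:
        return wins, losses, 1.0
    k = min(wins, losses)
    # Probability of a split at least this lopsided under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins, losses, min(1.0, 2.0 * tail)
```

A small returned probability with more wins than losses would indicate that the lognormal model tends to yield the larger SW *P* value, that is, the better fit.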

##### Component Procedure Estimates

^{4}for single-CPT surgeries. Specific MTEs were matched to component CPTs for dual-procedure surgeries using lookup tables.

##### Coding Permutations

*provider designated* by combinations of two procedures (CPT1–2), but permutations (order-dependent combinations) were observed in which the order of the same two component CPT codes was reversed (*i.e.*, CPT1–2 coexisted with similar surgeries CPT2–1).

*t* tests on each dual CPT. For each dual CPT tested, one subset was coded CPT1–2, and the other was coded CPT2–1. The results of the individual *t* tests (using pooled variances) were tabulated. If all the null hypotheses were true, *i.e.*, no differences in surgical times existed among permutations, then the *P* values should behave together like a sample from a uniform (0,1) distribution. We used uniform probability plots and Kolmogorov-Smirnov tests to explore how well the *P* values were described by a uniform (0,1) distribution.
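The uniformity check on a collection of *P* values can be sketched with a one-sample Kolmogorov-Smirnov statistic. This is an illustrative implementation, not the software used in the study: the *P* value here relies on the asymptotic Kolmogorov distribution with Stephens' small-sample correction, which is an assumption on our part, and the function name is hypothetical.

```python
import math


def ks_uniform(p_values):
    """One-sample KS test of p_values against the uniform (0,1) distribution.

    Returns (D, approximate P value).
    """
    xs = sorted(p_values)
    n = len(xs)
    # Largest gaps between the empirical CDF and the uniform CDF F(x) = x.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    d = max(d_plus, d_minus)
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        # The series converges slowly here and the tail probability exceeds 0.9999.
        return d, 1.0
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, q))
```

A large returned *P* value is consistent with the null hypotheses all being true, whereas *P* values piled up near zero would produce a large D and a small *P* value.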

##### Aggregate Models

^{4,12}If any of the terms Perm, Perm * Anes, or Perm * CPTA-B were significant, then coding permutations have statistical impact on the true mean lnTime, by itself or in conjunction with the type of anesthesia or a particular combination of CPTs.

##### Primary Component Procedure

*e.g.*, for MTE1-2 *versus* MTE2-1. We did this to compare a model with provider-designated codes with another model that ordered CPTs on another criterion, such as the duration of MTEs (*i.e.*, model 3).

##### Longest Component Procedure

^{12}To study the effect of modeling based on MTEs, we looked up MTEs for CPT1 and CPT2 and designated the component CPT with the longest MTE as CPTL and the other as CPTS. This effectively identified the longest component procedure and simultaneously eliminated coding permutations. To test the ability of this duration-dependent model to detect variability in lnTT and lnST, we fit an additional seven-factor main effects linear model of the general form:

$$\ln(\text{Time}) = \text{MTEL} + \text{MTES} + \text{Anes} + \text{CATL} + \text{CATS} + \text{Emerg} + \text{Age}$$

where MTEL = median time estimate for CPTL, MTES = median time estimate for CPTS, Anes = type of anesthesia, CATL = surgical specialty category of the longest procedure, CATS = specialty category of the shortest procedure, Emerg = emergency status, and Age was expressed in years. CATL, CATS, Anes, and Emerg were categorical variables. Factors were added stepwise to the model. It was not feasible to examine interaction effects due to the exploratory nature of our analyses and the relatively large number of independent variables.
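The ordering step that designates the longest component procedure and removes coding permutations can be sketched as a lookup. The function name, the tie-breaking rule, and the MTE values below are hypothetical; only the two CPT codes appear in the article.

```python
def order_by_mte(cpt_a, cpt_b, mte):
    """Return (CPTL, CPTS): the component with the longer MTE comes first.

    Ties are broken by code so the result never depends on input order.
    """
    longest, shortest = sorted((cpt_a, cpt_b), key=lambda c: (-mte[c], c))
    return longest, shortest


# Hypothetical median time estimates (min); not taken from the article.
mte = {"52000": 45.0, "53899": 80.0}
pair_as_coded = order_by_mte("52000", "53899", mte)
pair_reversed = order_by_mte("53899", "52000", mte)
```

Both coding orders map to the same (CPTL, CPTS) pair, which is why this ordering eliminates permutations before the model is fit.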

##### Simplified Models

^{2} for all the submodels to arrive at a parsimonious model (a reasonably predictive model with as few meaningful terms as required) for predicting lnTT and lnST for models 2 and 3. In particular, we studied models that retained factors as ordered by the original MSEs in the full seven-factor model. In doing so, we computed r^{2} for all the factor submodels. For reasons of brevity, we reported only those submodels with one, two, three, or seven main effect terms.

#### Results

##### Detailed Description of the Data

##### Model Probability Distributions


*P* value results for TT and ST in tables 1 and 2. Tests on 260 CPT1–2 combinations (3,266 surgeries) revealed that TT fit the lognormal and normal models no better (and no worse) than ST. The lognormal models fit TT and ST better than the corresponding normal model (Friedman tests, *P* < 0.05). The SW tests rejected the lognormal model for only 4–6% of dual CPTs tested.

##### Component Procedure Estimates

*both* component codes (*i.e.*, MTE1 and MTE2 available simultaneously).

##### Coding Permutations

*t* tests grouped by permutation (pooled variance estimates) could be completed for only 52 of 60 dual procedures for lnTT and lnST because of insufficient numbers for some permutations. Only 5 of 52 dual procedures (9.6%) and 2 of 52 dual procedures (3.8%) differed significantly for the two permutations, with respect to lnST and lnTT, respectively. To put these results in perspective, if the null hypotheses were true and there were no differences among permutations, then 5% of *t* tests were expected to be positive by chance alone. Probability plots and Kolmogorov-Smirnov tests indicated that the uniform distribution fits the observed *P* values for both lnTT and lnST (Kolmogorov-Smirnov *P* values > 0.15 for both).
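The comparison with chance can be made concrete with a short binomial calculation. This is a sketch under the added assumption that the 52 tests are independent and each has a 5% false-positive rate; the article does not state this assumption, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    below = sum(comb(n, i) * alpha ** i * (1 - alpha) ** (n - i) for i in range(k))
    return 1.0 - below


n_tests, alpha = 52, 0.05
expected_false_positives = n_tests * alpha            # about 2.6 tests
p_five_or_more = prob_at_least(5, n_tests, alpha)     # lnST: 5 of 52 significant
p_two_or_more = prob_at_least(2, n_tests, alpha)      # lnTT: 2 of 52 significant
```

With about 2.6 significant tests expected by chance, observing two or five significant results out of 52 is not by itself strong evidence that permutations differ.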

##### Aggregate Models

*P* < 0.05) with respect to lnTT and lnST. CPTA-B and type of anesthesia were important determinants (*P* < 0.05) of time estimates for lnTT. The first-order interaction, CPTA-B * Anes, was not tested because too many CPTA-B combinations were associated with only a single type of anesthesia (general). All other first-order interactions were not significant. Results for lnST were similar to those for lnTT.

##### Primary Component Procedure

*P* < 0.05), and together they explained 68.7% of the variability in lnTT. The independent factors in decreasing order of importance by F ratios were MTE1, MTE2, Anes, Emerg, Age, CAT1, and CAT2. The order for the independent factors was the same for a similar analysis of lnST, which is not reported in detail herein. Type III sums of squares were used in the ANOVA.

##### Longest Component Procedure

*P* < 0.05), and together they explained 70.5% of the variability in lnTT. The independent factors in decreasing order of importance by F ratios were MTEL, Anes, MTES, Emerg, Age, CATL, and CATS. Type III sums of squares were used in the ANOVA, and a similar ordering of main effects was obtained for an ANOVA of lnST.

^{2} is slightly better than that of model 2. Furthermore, the relative importance of the factors anesthesia and MTE2 was reversed between table 6 and table 7.

##### Simplified Models

#### Discussion

^{13}In an analogous procedure, minimal cost analyses may be used to allocate time to dual-procedure surgeries. Different point estimates may be chosen for varying cost structures, and fitting a statistical model to surgical procedure times is a good way to obtain these estimates.

##### Coding Permutations

*e.g.*, six different permutations are possible from three component CPTs.
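The count of six follows from 3! = 6 orderings of three distinct codes. A brief sketch, using hypothetical placeholder codes, enumerates them and shows that an order-independent key collapses them to a single surgery type.

```python
from itertools import permutations


def coding_permutations(cpts):
    """All order-dependent codings of the same set of component CPTs."""
    return list(permutations(cpts))


codes = ("CPT_A", "CPT_B", "CPT_C")  # hypothetical placeholder codes
orderings = coding_permutations(codes)

# Sorting each ordering yields one canonical key per set of components.
canonical_keys = {tuple(sorted(o)) for o in orderings}
```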

^{2}measured for the model with ordered MTEs was numerically greater than for the provider-designated CPTs. Further research might indicate whether there are meaningful subsets of the data for which the model using the provider designation is superior or whether a mathematically more sophisticated model might result in a different conclusion than the one we found.

^{14}using linear statistical models.

##### Limitations

^{4}We elected not to trim outliers from our data because we had no information to support doing so.

^{12}that surgeons differ in variability in surgical times involving a single CPT. We did not include the surgeon as a factor in building our models for dual CPTs. Our data set, although it was a complete census of all surgeries performed at a major hospital over a 7-yr period, was not large enough to permit a model to be estimated using the surgeon as a factor. Other factors known in practice to affect variability in surgical times were also omitted from our models.

^{12}is an important factor whenever the lognormal is superior to the normal distribution for modeling surgeries. We did not examine surgeon effect in this study because too few samples of dual-procedure surgeries each contained enough surgeons with case numbers sufficient to support the analyses. Based on our previous research, however, we believe that surgeon work rate is second in importance after procedure code (and ahead of type of anesthesia) in explaining variability in dual-procedure surgeries.