Economics, Education, and Policy: Statistical Grand Rounds

Quantifying the Diversity and Similarity of Surgical Procedures Among Hospitals and Anesthesia Providers

Dexter, Franklin MD, PhD*; Ledolter, Johannes PhD; Hindman, Bradley J. MD

doi: 10.1213/ANE.0000000000000998


The first objective of this Statistical Grand Rounds is to review analytical methods for the analysis of the diversity of surgical procedures among hospitals, activities among anesthesia providers, etc. We apply multiple methods and consider their relative reliability and usefulness for perioperative applications, including calculations of SEs. The second objective is to review methods for comparing the similarity of procedures among hospitals, activities among anesthesia providers, etc. We again apply multiple methods and consider their relative reliability and usefulness for perioperative applications. The applications include strategic analyses (e.g., hospital marketing) and human resource analytics (e.g., comparisons among providers).

Measures of diversity (e.g., Herfindahl and Gini-Simpson index) are used for quantification of each hospital or anesthesia provider, one at a time. Diversity can be thought of as a summary measure. Thus, if the diversity of 48 hospitals is studied as described later and shown in Figure 1, the diversity and its SE are calculated for each hospital.

Figure 1:
There are 6 months of operative procedures performed in children 0 to 2 years of age in Iowa in 2001, from the State of Iowa’s inpatient and outpatient discharge abstract database. For simplicity, throughout the text of the article, we refer to these as cases.b The types of procedures are classified based on the International Classification of Diseases, Ninth Revision, Clinical Modification. Each circle shows 1 hospital’s Herfindahl index. However, the data are plotted in reverse sequence with the most diverse to the right. The Gini-Simpson index equals 1 minus the Herfindahl index. The (pediatric) hospital in the state performing the largest number of different types of procedures is shown using a red circle. The hospital performing the most procedures is shown using a blue circle, far to the left because those procedures are of few different types. The figure is limited to the 48 hospitals with enough procedures over the 6 months of data for the SE of the Herfindahl to be <0.075. Adapted from Figure 2 of Ref. 6.

Quantifying diversity of procedures is important, because it influences appropriate operations management. For example, standardization (modularity) of processes should not be expected at perioperative organizations with large diversity. Software should be expected to handle many surgeon preference cards, to predict durations and equipment for rare procedures, and to customize most patient care instructions. Marketing messages can focus on the majority of patients undergoing many different rare procedures, not the minority of patients undergoing common procedures.

In contrast, measures of similarity are pairwise assessments. Thus, if quantifying the similarity of procedures among cases with a break or handoff versus cases without a break or handoff, a similarity index represents a correlation coefficient. There are several different measures of similarity, and we compare their features and applicability for perioperative data.

Quantifying similarity is important for knowing whether 2 hospitals or anesthesia groups in the same region compete (i.e., the degree to which their procedures overlap). Similarity of procedures among patients leaving a region to have surgery versus having surgery locally indicates opportunities for local service expansion. In this article, we also show the novel use of similarity indices for determining when groups to be compared (e.g., cases with/without anesthesia provider breaks) have balanced (i.e., matching) distributions of procedures.


Herfindahl, Gini-Simpson Index, and Their SEs


Let $p_{ij}$ represent the proportion of cases performed at the $j$th facility that are of the $i$th procedure, $i = 1, 2, \ldots, S$. The notation $S$ is used, because the population of many unique procedures (or combination of procedures) is analogous to populations of different species in ecology.1,2 For example, the number $S$
can be all Current Procedural Terminology (CPT) codes for “invasive therapeutic surgical procedures” from the U.S. Agency for Healthcare Research and Quality.a For example, previously we observed at a U.S. academic hospital (j = 1) that, when classifying each scheduled surgical case by its primary procedure code, there were

scheduled procedures in their local dictionary.3 Among 8108 patients who were inpatients preoperatively and whose cases were cancelled, 1.38% of the cases were scheduled for percutaneous nephroscopy (i.e., the proportion $p_{i1} = 0.0138$ for that procedure).

Consider an urn representing the $j$th facility. Each ball in the urn represents a surgical case. Each ball is labeled with the procedure that was performed. If a procedure was performed 5 times, then 5 balls in the urn are labeled with the same procedure. Shake the urn. Draw 1 ball from the urn and record the procedure labeled on the ball. Return the ball to the urn. Shake the urn. Draw out another ball from the urn. Compare the procedure labeled on that second ball with the procedure of the first ball. The Herfindahl index $H_j$ for the $j$th facility equals the probability that the 2 balls drawn are labeled with the same procedure:

$$H_j = \sum_{i=1}^{S} p_{ij}^2 \qquad (1)$$

In ecology, this is called Simpson’s index. The smaller the value of the Herfindahl index (i.e., the Simpson index), the greater is the diversity of procedures. Let $S_j$ represent the number of procedures that are performed at the $j$th facility. The maximum value of $H_j$, which is obtained when only 1 procedure is performed at the hospital, is $H_j = 1$. The minimum value $H_j = 1/S_j$ is obtained when all procedures performed are equally likely (i.e., $p_{ij} = 1/S_j$ for each procedure performed).

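The Gini-Simpson index5 equals $1 - H_j$ (i.e., 1 minus the Herfindahl index). The Gini-Simpson index is intuitive because greater values show greater diversity. An estimate for the Herfindahl of the $j$th facility is:

$$\hat{H}_j = \sum_{i=1}^{S} \hat{p}_{ij}^2 = \sum_{i=1}^{S_j^{\text{obs}}} \hat{p}_{ij}^2 \qquad (2)$$

where $\hat{p}_{ij}$ is the observed proportion of cases at the $j$th facility that are of the $i$th procedure, the second term differing in being the summation to the observed number of procedures $S_j^{\text{obs}}$ rather than $S$. The corresponding estimate for the Gini-Simpson index equals $1 - \hat{H}_j$. These are maximum likelihood estimates.6,7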
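The plug-in (maximum likelihood) estimates described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy list of cases are hypothetical, and the SE calculations (Equations 11 and 12 of the Appendix) are not reproduced here.

```python
from collections import Counter

def herfindahl(procedures):
    """Plug-in estimate: sum of squared observed proportions of each procedure."""
    counts = Counter(procedures)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())

def gini_simpson(procedures):
    """Gini-Simpson estimate: 1 minus the Herfindahl estimate."""
    return 1.0 - herfindahl(procedures)

# Hypothetical toy data: one label per surgical case (one "ball in the urn").
cases = ["myringotomy"] * 5 + ["adenoidectomy"] * 3 + ["hernia repair"] * 2

h = herfindahl(cases)    # 0.5**2 + 0.3**2 + 0.2**2
g = gini_simpson(cases)  # 1 - h
print(round(h, 4), round(g, 4))
```

Only observed procedures appear in the counter, which mirrors the point that unobserved procedures contribute nothing to the sum.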

Figure 1 is adapted from a figure we published in 2003, quantifying the diversity of operative procedures performed on infants and toddlers at each of

different hospitals in the State of Iowa.6,b Instead of plotting the Herfindahl index as in the original figure, in Figure 1, we use the Gini-Simpson index. The (pediatric) hospital performing the greatest number of different procedures is shown using a red circle (j = 2): Herfindahl index = 0.069 ± 0.009 and Gini-Simpson index = 0.931 ± 0.009. The SEs were calculated using Equations 2 and 11 in the Appendix. The 0.069 means that there was a 6.9% ± 0.9% chance that any 2 randomly selected cases at the j = 2nd hospital were of the same procedure. In comparison, at the hospital in Iowa performing the greatest total number of cases among young children, the chance for both procedures being the same was 65.5% ± 2.2% (i.e., Herfindahl index = 0.655 ± 0.022 and Gini-Simpson index = 0.345 ± 0.022). This j = 3rd hospital is shown in Figure 1 using a blue circle.

Comparing among the hospitals in Figure 1, the large pediatric hospital (red circle) has significantly greater diversity of procedures (Gini-Simpson index, 0.931) than any other hospital. Because the SE can be calculated for the index, inferential analysis can be performed. The pairwise differences between the large pediatric hospital and each of the other 47 hospitals in Figure 1 are all significant with Bonferroni-corrected P < 0.0001. To interpret the numbers in Figure 1, consider that the large pediatric hospital (red circle) had

observed procedures. In contrast, the j = 3rd hospital performing the greatest total number of cases (blue circle) had only

observed procedures, principally myringotomy tube placement and adenoidectomy.b

Because the SEs of Herfindahl indices are needed for use in comparing hospitals, we focus in this Grand Rounds on different methods for calculation of the SEs. In contrast to the simple Equation 11, an alternative Equation 12 derived by Taplin uses additional terms.8 The SEs calculated using these 2 methods differed, however, by only a very small amount (<0.0001). To illustrate the negligible differences, we used a data set, from a perioperative application, with a much smaller sample size. Over 56 weeks at an ambulatory facility (j = 4), 12 anesthesia providers started 1947 cases during which there was at least 1 break or handoff.913,c Rather than calculating the Herfindahl index based on the distribution of cases among procedures, we assessed the diversity of cases with breaks among the 12 anesthesia providers: Herfindahl index = 0.1442. Thus, there was a 14.42% chance that any 2 cases, in which a break was given or handoff occurred, were started by the same anesthesia provider. The SE calculated using Equation 12 was 0.215%. The SE calculated using Equation 11 differed by just 0.001%.

Effective Number of Common Procedures

Esophagogastroduodenoscopy is an example of a common procedure.14 Anoplasty and anorectal myomectomy are examples of rare procedures.2,15,16 Because of such rare procedures, the observed (sample) number of different procedures is not a reliable estimate for the actual number of different procedures performed at a facility.

Figure 2 shows the probability distribution of different procedures performed during outpatient surgery in the United States with an anesthesia provider.1,17,c Figure 2 is adapted from Ref. 1.2 The few very common procedures are performed >100-fold more often than the many rare procedures (Fig. 2), because there are thousands of different procedure codes and combinations.1,2 Thus, most sample estimates for the number of different procedures miss at least some rare procedures that are performed infrequently at each facility.1

Figure 2:
The graph is of the probability distribution of procedures among outpatient cases performed in the United States with an anesthesia provider.1 The data are from the 2004 to 2006 National Survey of Ambulatory Surgery.1,2,15,17 The procedures are classified based on the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). Each combination of ICD-9-CM codes was treated as a different procedure, because each was performed during the same surgical case. There were 24,084 procedures among 228,332 cases. However, the survey used probability sampling so that nationally representative results could be obtained without surveying every ambulatory surgical case in the United States.1,2,15,17 The National Center for Health Statistics assigned each case a weight. For example, some cases had weights of 10 (i.e., represented 10 outpatient cases nationally), and others had weights of 20,660 (i.e., represented 20,660 outpatient cases nationally). The histogram shows nationally representative results by calculating, for each observed procedure, the sum of the weights of all cases of the procedure and then dividing by the sum of the weights of all cases. The figure shows that most procedures occur >100-fold less often than the few common procedures. Adapted from Figure 3 of Ref. 1.
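The weighting step described in the caption (sum of the weights of all cases of a procedure, divided by the sum of the weights of all cases) can be sketched as follows. The records and procedure labels are hypothetical; only the weights 10 and 20,660 echo the caption, and nothing here reproduces the actual survey data.

```python
from collections import defaultdict

def weighted_proportions(cases):
    """cases: iterable of (procedure, survey_weight) pairs.

    Returns each procedure's share of the total weight, i.e. its
    nationally representative proportion under the survey weights.
    """
    totals = defaultdict(float)
    for procedure, weight in cases:
        totals[procedure] += weight
    grand_total = sum(totals.values())
    return {proc: total / grand_total for proc, total in totals.items()}

# Hypothetical sampled cases (procedure label, weight assigned to the case).
sample = [("A", 10), ("A", 20660), ("B", 10), ("B", 320)]
props = weighted_proportions(sample)
print(props)
```

In this toy sample both procedures were observed twice, yet their weighted proportions differ by well over an order of magnitude, which is the effect the weighting is meant to capture.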

Primary surgical procedures classified by the CPT were reviewed for the 16,413 cases performed by anesthesia providers at a hospital in Iowa (j = 5) over 7 successive 8-week periods.d The number of different procedures during each of the 7 eight-week periods was 769, 794, 820, 855, 878, 887, and 930, respectively. In contrast, using all 56 weeks together, there were 2086 different procedures. The reason for this vast difference was that most of the 2086 procedures were performed just once or twice. For example, during the 8-week period with 769 procedures, 72.2% were performed just once or twice. Among the other 6 periods, the percentages were 72.2%, 72.3%, 72.3%, 73.0%, 73.1%, and 74.1%. In fact, because these are primary surgical CPT codes billed for anesthesia, we know from the dictionarye that

(i.e., even the 2086 from 56 weeks was an underestimate). Figure 2 shows that this behavior is not an artifact and cannot be overcome by pooling case duration prediction data among facilities (see section Limitations to Quantifying Diversity: Example of Case Duration Prediction).

The inverse of the Herfindahl ($1/H_j$) has good interpretive value for the numbers of common procedures, numbers of providers performing cases, etc. For example, suppose that among $S = 8$ possible procedures, 6 procedures were performed 4 times at the j = 6th hospital, and 2 procedures were not performed. Then, among the 24 total cases, the observed number of procedures is 6, and $\hat{p}_{i6} = 4/24 = 1/6$ for each performed procedure. The estimate of the Herfindahl index $\hat{H}_6 = 6 \times (1/6)^2 = 1/6$. The inverse of the Herfindahl $1/\hat{H}_6$ equals 6 procedures. The inverse is also referred to as the Hill diversity measure of order 2, as explained in Equation 15 of the Appendix. The diversity measure is of order 2, because the Herfindahl uses the square of the proportions. For the j = 6th hospital, ${}^2D_6 = 6$, matching the number of procedures. Next, suppose that at the j = 7th hospital, there are also $S = 8$ possible procedures and 24 cases, but 3 procedures each accounted for 7 cases, 3 procedures each accounted for 1 case, and again 2 procedures were not performed. The estimate of the Herfindahl index $\hat{H}_7 = 3 \times (7/24)^2 + 3 \times (1/24)^2 = 150/576 \approx 0.260$. Its inverse ${}^2D_7 = 576/150 = 3.84$. The diversity measure (3.84) is >3 because there are >3 procedures, but not 4 or greater because 3 procedures account for most cases.
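The two worked examples above (the j = 6th and j = 7th hospitals) can be checked numerically. The sketch below uses only the case counts stated in the text; the function name is hypothetical.

```python
def inverse_herfindahl(counts):
    """Hill diversity of order 2: 1 / sum of squared observed proportions.

    counts: number of cases for each possible procedure (zeros allowed).
    """
    n = sum(counts)
    h = sum((c / n) ** 2 for c in counts if c > 0)
    return 1.0 / h

# j = 6th hospital: 6 procedures with 4 cases each, 2 procedures not performed.
hospital_6 = [4, 4, 4, 4, 4, 4, 0, 0]
# j = 7th hospital: 3 procedures with 7 cases, 3 with 1 case, 2 not performed.
hospital_7 = [7, 7, 7, 1, 1, 1, 0, 0]

d6 = inverse_herfindahl(hospital_6)  # approximately 6
d7 = inverse_herfindahl(hospital_7)  # 576/150, approximately 3.84
print(round(d6, 2), round(d7, 2))
```

The procedures that were never performed have no influence on either value, consistent with the inverse Herfindahl counting commonly performed procedures rather than all possible ones.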

Figure 3 shows the same pediatric data as Figure 1,6,18 but now plotted as the inverse of the Herfindahl.b The greater diversity of procedures at the large pediatric hospital (red circle, j = 2) versus the hospital performing the most cases (blue circle, j = 3) is even more apparent in Figure 3 than in Figure 1: inverse Herfindahl values of 14.47 and 1.53, respectively. These estimates can be compared with the corresponding number of different procedures performed:


. Thus, the inverse Herfindahl values of 14.47 and 1.53 are not estimates for the total numbers of all possible procedures at a hospital, but instead approximate the number of procedures performed (i.e., observed) commonly.5 Figure 3 shows that the estimates

are sufficient for making comparisons of the diversity of procedures among hospitals in units of numbers of cases.

Figure 3:
Inverse of the values from Figure 1. The figure shows the same pediatric data from the State of Iowa as does Figure 1, but plotted as the inverse of the Herfindahl. The inverse is the diversity measure of order 2, with the 2 being the proportions squared. The inverse of the Herfindahl has units of number of different types of procedures.b The hospital in red is the large pediatric hospital in the state. The hospital in blue performed the largest number of procedures, but of few different types of procedures. The figure shows that what makes a pediatric hospital unique is performing a large diversity of types of procedures.6 Similarly, in the state of New York, the diversity of diagnosis related groups was significantly greater at hospitals with accredited pediatric residencies than at hospitals caring for children but without such a residency.18 Adapted from Figure 3 of Ref. 1.

The relative (logarithmic) relationship of procedure incidences (Fig. 2) has an important consequence for the measure of the number of common procedures obtained by using the inverse of the Herfindahl

. Each increase in the total number of procedures

(i.e., species richness) results19 in an increase in the value of

. This relationship is reasonable intuitively,20 and we have used it when explaining the results of the analyses.6 We consider this topic more, below, in the Section “Diversity Assessed with Weighting for Differences Among Procedures and Providers.”

Other measures of diversity are used often, especially the diversity index of order 1:

$${}^1D_j = \exp\left(-\sum_{i=1}^{S} p_{ij} \ln p_{ij}\right)$$

This measure is the exponential of the Shannon entropy. See Equations 16 to 18 of the Appendix. The ${}^1D_j$ effectively counts5 more of the procedures than does the inverse of the Herfindahl, ${}^2D_j$. In the aforementioned example with $j = 6$, the ${}^1D_6 = 6$, matching the 6 performed procedures. Similarly, ${}^2D_6 = 6$ procedures. In contrast, with $j = 7$, ${}^2D_7 = 3.84$, which is less than ${}^1D_7 = 4.37$.

Continuing with examples to show that

we use data from the

hospital, mentioned earlier, which performed the greatest number of different procedures. The inverse of the Herfindahl index

. The exponential of the Shannon entropy

. The

was less than the 53 procedures performed at least 3 times, which was less than

, which was itself less than the 86 procedures performed at least 2 times. For the other pediatric hospital in Iowa (j = 8), the inverse of the Herfindahl index

and the exponential of the Shannon entropy

. The

was less than the 13 procedures observed at least 3 times, which was less than

, which was less than the 25 procedures observed at least 2 times.

We have not routinely used the exponential of the Shannon entropy

because its estimates can be sensitive to the data (i.e., unreliable), whereas the Herfindahl index (i.e., Simpson index) works well for the perioperative applications (Figs. 1 and 3). For any number of observed surgical cases (i.e., sample size), the mean square error of the (standardized) Shannon entropy can be 2 orders of magnitude greater than that of the Gini-Simpson index.21 The sample (maximum likelihood) estimate of the Shannon entropy has large bias when the total number of different procedures in the population,

, is comparable to the sample size.22–24 We showed previously that for surgical procedures, the estimated total population size

, including (as necessary) meaningful combinations of procedures, is almost precisely equal to twice the sample size.1 The exponential of the Shannon entropy has its minimum SE when all procedures are equally frequent (i.e., uniform distribution),22 entirely different from the logarithmic distribution characteristic of counts of procedures (Fig. 2). For example, at the large pediatric hospital, the minimum variance unbiased estimator of the inverse of the Herfindahl (Equation 13)

, which is very similar to the maximum likelihood estimate

, that we used earlier. In contrast, the low bias estimator for the exponential of the Shannon entropy (Equation 18) was

, which is very different from

. For another example, at the small pediatric hospital, the minimum variance unbiased estimator of the inverse of the Herfindahl was

, very similar to

. In contrast, the low bias estimator for the exponential of the Shannon entropy was

, very different from

. It is not that


) is inherently more valid than


), because they are measuring different things.21 However, when analyzing data actuarially or automating reports by service, reliability is important.25,26
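To make the contrast between the two diversity measures concrete, the sketch below computes the plug-in exponential of the Shannon entropy (order 1) and the plug-in inverse Herfindahl (order 2) for the small j = 6 and j = 7 examples introduced earlier. These are simple maximum likelihood estimates under assumed function names; the low bias and minimum variance unbiased estimators of the Appendix (Equations 13 and 18) are not implemented here.

```python
import math

def hill_order_1(counts):
    """Exponential of the plug-in Shannon entropy (zero counts are skipped)."""
    n = sum(counts)
    entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
    return math.exp(entropy)

def hill_order_2(counts):
    """Inverse of the plug-in Herfindahl index."""
    n = sum(counts)
    return 1.0 / sum((c / n) ** 2 for c in counts if c > 0)

hospital_6 = [4, 4, 4, 4, 4, 4, 0, 0]  # all performed procedures equally frequent
hospital_7 = [7, 7, 7, 1, 1, 1, 0, 0]  # 3 common and 3 rare procedures

for name, counts in (("j = 6", hospital_6), ("j = 7", hospital_7)):
    print(name, round(hill_order_1(counts), 2), round(hill_order_2(counts), 2))
```

When all performed procedures are equally frequent the two measures agree; once rare procedures are present, the order 1 measure gives them more weight and therefore exceeds the order 2 measure.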

Diversity of Procedures and Providers at Single Hospitals

Over 60 consecutive weeks, 50 anesthesia providers performed 17,902 cases at the j = 5th hospital.f Once again analyzing diversity among anesthesia providers rather than procedures, the inverse of the Herfindahl was

anesthesia providers. The value of 39.75 is less than the number (50) of anesthesia providers, because each provider did not perform an equal number of cases. Specifically, the 38 anesthesia providers performing the most cases each accounted for at least 1.05% of the 17,902 cases and the 39th accounted for 0.91%. Among the 41.9% of cases with at least 1 break or handoff, the inverse of the Herfindahl was 39.11 ± 0.38 anesthesia providers. For the remaining cases in which no break or handoff occurred, the inverse of the Herfindahl was 39.67 ± 0.32 anesthesia providers.g,27 Thus, because the inverses of the Herfindahl were essentially the same, there was no difference between the diversities of anesthesia providers among cases (1) with breaks and (2) without breaks. We consider more applications to human resources analytics later.

By using cases performed at the j = 5th hospital by anesthesia providers, we compared the procedures among cases with breaks versus cases without breaks. There were 2132 procedures observed. The Herfindahl indices were 0.0033 ± 0.0001 for cases with breaks and 0.0089 ± 0.0003 for those without breaks. Both Herfindahl indices were very small. However, the diversity of procedures was significantly greater among cases with breaks than without breaks, $P < 10^{-6}$. The inverses of the Herfindahl indices were 299.3 ± 10.1 procedures for cases with breaks versus 112.6 ± 4.0 procedures for cases without breaks. Thus, cases in which breaks took place were derived from a greater diversity of procedures than were cases in which no break occurred. When the 3 most common procedures were excluded, the Herfindahl indices were now essentially identical between cases with and without breaks: 0.0034 ± 0.0001 and 0.0033 ± 0.0001, respectively. The 3 most common procedures were performed mostly (95.4%) without breaks, likely because those cases were very brief: electroconvulsive therapy (CPT 90870), extracapsular cataract extraction (66984), and follicle puncture for oocyte retrieval (59870).h We consider these data further, below, in the sections Comparisons of Similarity Indices, Based on Data from Within a Single Hospital and Sensitivity Analyses to Interpret an Observed Value of the Similarity Index. The absolute differences in the Herfindahl indices (0.0033 vs 0.0089) were small because these 3 most common procedures collectively accounted for only 8.28% ± 0.21% of cases.
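One way to see why a difference as small as 0.0033 versus 0.0089 can still be highly significant is a simple two-sample z comparison using the reported estimates and SEs. This is only a sketch under stated assumptions: it treats the two estimates as independent and approximately normal, and the article does not state that this is the test that produced the reported P value.

```python
import math

def two_sample_z(est_1, se_1, est_2, se_2):
    """Normal-approximation comparison of two independent estimates.

    Returns the z statistic and the two-sided P value.
    """
    z = (est_1 - est_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Reported Herfindahl indices: without breaks vs with breaks.
z, p = two_sample_z(0.0089, 0.0003, 0.0033, 0.0001)
print(round(z, 1), p < 1e-6)
```

Because the SEs are an order of magnitude smaller than the difference, the z statistic is large even though both indices are close to 0.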

Diversity Assessed with Weighting for Differences Among Procedures and Providers

As explained in the section “Effective Number of Common Procedures,” when all procedures are equally prevalent, the inverse of the Herfindahl (i.e., Hill number of order 2) equals the number of procedures. Although this relationship seems intuitively reasonable,20 the count of individual procedures does not take into account the additional information from knowing the clinical characteristics of the procedures.28 For example, among the 7315 cases at the j = 5th hospital and with at least 1 break (or relief), the 3 most common procedures were total knee arthroplasty (2.4%), total hip arthroplasty (2.2%), and laparoscopic cholecystectomy (1.2%). For the preceding analyses, the 2 arthroplasty procedures and laparoscopic cholecystectomy would be counted as 3 distinct procedures even though the 2 arthroplasty procedures are similar.

Using that proportions sum to 1, the Gini-Simpson index (e.g., as shown in Fig. 3) can be rewritten:

$$1 - \sum_{i=1}^{S} p_{ij}^2 = \sum_{i=1}^{S} \sum_{i' \neq i} p_{ij}\, p_{i'j} \qquad (3)$$

Rao’s quadratic entropy equals5,29:

$$\sum_{i=1}^{S} \sum_{i' \neq i} d_{ii'}\, p_{ij}\, p_{i'j} \qquad (4)$$

where $d_{ii'}$ is the conceptual (unitless) difference between the 2 procedures, $i$ and $i'$. Two procedures that are nearly identical have a difference $d_{ii'} \approx 0$, whereas those of different specialties and anatomic region would have a difference $d_{ii'} \approx 1$.29 There is a corresponding minimum variance unbiased estimator.5 Furthermore, there is a corresponding unbiased estimator for the effective number of common procedures that results from this measure (Equation 4).5 However, we have not used the weighted measure in our perioperative studies, for several reasons.
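The relationship between the Gini-Simpson index and Rao's quadratic entropy can be illustrated with a small sketch. The proportions and the difference matrix below are hypothetical, chosen only to show that setting every pairwise difference to 1 recovers the Gini-Simpson index, whereas treating 2 procedures as nearly identical lowers the measured diversity.

```python
def rao_quadratic_entropy(p, d):
    """Sum over ordered pairs i != k of d[i][k] * p[i] * p[k]."""
    s = len(p)
    return sum(d[i][k] * p[i] * p[k]
               for i in range(s) for k in range(s) if i != k)

# Hypothetical proportions for 3 procedures at one facility.
p = [0.5, 0.3, 0.2]

# All procedures maximally different: Rao's measure equals Gini-Simpson.
d_all_different = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# Assumed: the first 2 procedures are clinically similar (difference 0.1).
d_two_similar = [[0, 0.1, 1], [0.1, 0, 1], [1, 1, 0]]

gini_simpson = 1.0 - sum(x ** 2 for x in p)
q_unweighted = rao_quadratic_entropy(p, d_all_different)
q_weighted = rao_quadratic_entropy(p, d_two_similar)
print(round(gini_simpson, 4), round(q_unweighted, 4), round(q_weighted, 4))
```

The weighted value depends directly on the chosen differences, which is the sensitivity to the difference measure discussed in the text that follows.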

First, for strategic analyses, we have found presentation of results to nonscientific audiences to be important.6,25,26 In this context, sensitivity of calculated results to the method of measuring the differences between procedures has seemed a limitation, because it involves medical understanding often lacking among stakeholders. For example, we often use the Clinical Classifications Software to describe subspecialty,26,30–32 in part, because it is available for International Classification of Diseases, Ninth Revision, Clinical Modification, International Classification of Diseases, Tenth Revision, Clinical Modification, and CPT.i Both donor pneumonectomy and sleeve pneumonectomy are procedures in the Clinical Classifications Software of “lobectomy or pneumonectomy.” However, there are major anesthetic and surgical differences between these procedures. Likewise, there are major anesthetic and surgical differences between partial nephrectomy versus radical nephrectomy with vena caval thrombectomy. To the extent that both donor pneumonectomy and partial nephrectomy have 7 American Society of Anesthesiologists Relative Value Guide Base Unitsj (i.e., are not physiologically complex),6,25,30,33–35 these 2 procedures may actually be less different from one another than the other 2 examples of pairs of procedures from the same specialty. We have instead been addressing medical issues by performing sensitivity analyses, focusing on individual common procedures.6,25,26,30–32,35

Second, conclusions, as exemplified by all our results mentioned earlier, seem unchanged. For example, consider the comparison of the large pediatric hospital (j = 2) and the hospital performing the greatest total number of cases among young children (j = 3) (Fig. 3). Because the latter hospital performed almost exclusively (99%) outpatient pediatric otolaryngology, whether the functional similarity among these procedures was based on specialty or American Society of Anesthesiologists base units, the diversity for this j = 3rd hospital would be less when adjusted for the functional similarity of the procedures. Consequently, the greater diversity of the large pediatric hospital (j = 2) would be emphasized further. This seems unnecessary from Figure 3. In other words, the analyses of the diversities of procedures are not seeking subtle differences among hospitals or groups but quantitative ways of summarizing substantial categorical data.

Limitations to Quantifying Diversity: Example of Case Duration Prediction

Although comparisons of diversity among hospitals have strategic value (Figs. 1 and 3),6,25,26,35,36 a limitation to quantifying diversity of procedures within single hospitals has been that, once appreciated, other statistical methods then are used for administrative decision making. For example, the initial application of diversity measures for perioperative management was case duration prediction.1,2,15,37–46 Twenty percent (SE 1%) of outpatient surgery cases performed in the United States between 1994 and 1996 were of a procedure that was performed annually 1000 times or less nationwide.2 Nearly half of cases at a comprehensive hospital were procedures scheduled by the surgeon <9 times in 3 years.47 Rare procedures account for most of the uncertainty in case duration-associated decisions,16 in part, because many decisions depend on the time to complete a list of cases in an operating room on a day. Even though only 4% ± 1% of endoscopy cases were rare, they were distributed among 13% ± 3% of lists.14 Nearly half (49% ± 0.4%) of lists of cases at a comprehensive hospital had at least 1 case performed only once by the surgeon in 3 years.48 For decisions related to the longest amount of time that cases take,46,47,49–52 rare procedures are even more consequential,39,46 because more data are needed to estimate SDs than means. Pooling data among surgeons helps little, because rare procedures tend to be rare for all surgeons,1 and surgeons differ in how quickly they operate37,53–56 (e.g., because of different surgical approaches and methods).57 Thus, the diversity of procedures matters a lot for case duration prediction. 
However, knowing this about diversity, the problem then can be bypassed statistically by relying on the surgeon’s scheduled duration as an expert judgment16,56 predicting the median (or mean) duration of the rare (or common) procedure.39,44,54,58 The process variability around that prediction (i.e., the coefficient of variation) is estimated using data from similar procedures.16,39,44,46 Such methods are remarkably accurate.39–42,45,46


Descriptions of Similarity Indices

The similarity measure that we have used in previous studies is that from Yue and Clayton7 because of its natural probabilistic interpretation for surgical procedures.6,26,30 Consider 2 hospitals. Suppose that the CPT manual contains $S$ different procedures. Select 1 procedure at random from the CPT manual (e.g., 69421 myringotomy tube “requiring general anesthesia”). Let event A be that the procedure is 1 of the $S_1$ procedures observed at the j = 1st hospital. Then $P(A) = S_1/S$. Let event B be that the procedure is one of the $S_2$ procedures observed at the j = 2nd hospital. Then $P(B) = S_2/S$
. Let

represent the probability that the randomly selected procedure is one of the

procedures present at both facilities, given that it is present at least at 1 of the 2 hospitals.59 By definition,

would differ among pairs of hospitals, but always


. Using set (Venn diagram) notation, a reasonable similarity measure (that we do not use routinely, however) would be:

$$\frac{P(A \cap B)}{P(A \cup B)} \qquad (5)$$

where the $\cap$ means intersection (i.e., present in both and thus shared), and the $\cup$ means union (i.e., present in either). The numerator equals $P(A)\,P(B)$ when A and B are independent.

To apply Equation 5 to the j = 1st and j = 2nd hospitals5,7,59:

We estimate

in Equation 6 by replacing the proportions

with their sample estimates. To interpret Equation 6, and by analogy Equation 5, suppose that all procedures performed at the first hospital were done equally often at the hospital and that the same is true for the second hospital too.7,59 In other words,

among the

procedures performed at the first hospital, and

. Substituting these expressions into Equation (6), and multiplying the numerator and denominator7 by



is used to represent Jaccard, as this is the Jaccard index. It equals the ratio of the number of shared procedures to the number of different procedures performed at either hospital.5,59 Although the


are used often in ecology, and the Jaccard index makes great sense, below we show for our applications that these indices have limitations. (Note that nowhere in this Grand Rounds do we use the phrase “in common”; we instead use “shared,” so that “common” can be used in terms of frequency [i.e., “common procedures”].)

Dexter et al.6 and Yue and Clayton7 used:

$$\theta = \frac{\sum_{i=1}^{S} p_{i1}\, p_{i2}}{\sum_{i=1}^{S} p_{i1}^2 + \sum_{i=1}^{S} p_{i2}^2 - \sum_{i=1}^{S} p_{i1}\, p_{i2}} \qquad (8)$$

To interpret the numerator of Equation (8), we again use urns, just like for the interpretation of the Herfindahl. Envision 2 urns, 1 representing each hospital. In each urn are balls, representing all the various cases. Each ball is labeled with the case’s procedure. If a procedure was performed 5 times, then there are 5 balls in the urn that are labeled with that procedure. Shake the urns. Draw 1 ball from each of the 2 urns. The numerator is the probability that the procedure labeled on the first ball is the same as that labeled on the second ball. The denominator normalizes the range to be from 0.0 (when there is no overlap of the procedures) to 1.0 for $p_{i1} = p_{i2}$, for all $i = 1, 2, \ldots, S$.

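Suppose now that, as immediately above, all procedures that are performed at each facility are done equally often: $p_{i1} = 1/S_1$ for the $S_1$ procedures performed at the first facility, and, similarly, $p_{i2} = 1/S_2$ for the $S_2$ procedures performed at the second facility. Let $S_{12}$ be the number of procedures performed at both facilities. Substituting into Equation 8, and multiplying the numerator and denominator by $S_1 S_2$:

$$\theta = \frac{S_{12}/(S_1 S_2)}{1/S_1 + 1/S_2 - S_{12}/(S_1 S_2)} = \frac{S_{12}}{S_1 + S_2 - S_{12}}$$

, as desired (i.e., for this special case, $\theta$ equals the Jaccard index).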

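The nonparametric maximum likelihood estimator for $\theta$ is obtained by replacing the true proportions $p_{ij}$ in Equation 8 with the observed proportions $\hat{p}_{ij}$ that are defined in Equation 10:

$$\hat{\theta} = \frac{\sum_{i=1}^{S} \hat{p}_{i1}\, \hat{p}_{i2}}{\sum_{i=1}^{S} \hat{p}_{i1}^2 + \sum_{i=1}^{S} \hat{p}_{i2}^2 - \sum_{i=1}^{S} \hat{p}_{i1}\, \hat{p}_{i2}} \qquad (9)$$

$$\hat{p}_{ij} = \frac{n_{ij}}{n_j} \qquad (10)$$

where $n_{ij}$ is the observed number of cases of the $i$th procedure at the $j$th facility and $n_j$ is the total number of cases observed at that facility.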
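The similarity index and its special case can be sketched in Python from per-procedure case counts. The function names and the toy counts are hypothetical; the example is constructed so that every performed procedure is equally frequent within each hospital, in which case the similarity index should coincide with the Jaccard index.

```python
def yue_clayton_theta(counts_1, counts_2):
    """Plug-in similarity index from two dicts mapping procedure -> case count."""
    n1, n2 = sum(counts_1.values()), sum(counts_2.values())
    procedures = set(counts_1) | set(counts_2)
    p1 = {k: counts_1.get(k, 0) / n1 for k in procedures}
    p2 = {k: counts_2.get(k, 0) / n2 for k in procedures}
    cross = sum(p1[k] * p2[k] for k in procedures)    # 2 balls, 1 per urn, match
    sq1 = sum(v ** 2 for v in p1.values())            # Herfindahl, hospital 1
    sq2 = sum(v ** 2 for v in p2.values())            # Herfindahl, hospital 2
    return cross / (sq1 + sq2 - cross)

def jaccard(counts_1, counts_2):
    """Shared procedures divided by procedures performed at either hospital."""
    a = {k for k, v in counts_1.items() if v > 0}
    b = {k for k, v in counts_2.items() if v > 0}
    return len(a & b) / len(a | b)

# Hypothetical hospitals, each performing its procedures equally often.
hospital_1 = {"A": 2, "B": 2, "C": 2}
hospital_2 = {"B": 3, "C": 3, "D": 3, "E": 3}

theta_hat = yue_clayton_theta(hospital_1, hospital_2)
jaccard_index = jaccard(hospital_1, hospital_2)
print(round(theta_hat, 4), round(jaccard_index, 4))
```

With unequal frequencies the two indices diverge, because the similarity index weights each shared procedure by how often it is performed while the Jaccard index only records whether it was performed at all.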

Characteristics of the Similarity Index θ Based on Previous Work with Statewide Data

The SE of the similarity index

can be calculated asymptotically by using Equation 20 in the Appendix. The estimated SEs are similar (within 0.001) to those calculated using bootstrapping.6,7 For example, we calculated the similarity of procedures performed between 2 pediatric hospitals in a certain state:


.6 The




, and

.b The 2 SEs were 0.055 vs 0.052, differing by just 0.003. When we compared the similarity between the smaller pediatric hospital in the state (

) to the hospital performing the most pediatric cases (

),6 there were




, and

. The 2 SEs were 0.040 vs 0.041, differing by just 0.001.
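A bootstrap SE of the similarity index, of the kind the asymptotic SE is compared against, could be sketched as below. This is an assumption-laden illustration rather than the procedure used in the cited studies: it resamples cases with replacement within each hospital, uses a fixed seed and 500 replicates, and runs on hypothetical case lists.

```python
import random
from collections import Counter

def theta(sample_1, sample_2):
    """Plug-in similarity index from two lists of per-case procedure labels."""
    c1, c2 = Counter(sample_1), Counter(sample_2)
    n1, n2 = len(sample_1), len(sample_2)
    cross = sum((c1[k] / n1) * (c2[k] / n2) for k in c1 if k in c2)
    sq1 = sum((v / n1) ** 2 for v in c1.values())
    sq2 = sum((v / n2) ** 2 for v in c2.values())
    return cross / (sq1 + sq2 - cross)

def bootstrap_se(sample_1, sample_2, n_boot=500, seed=0):
    """SD of theta over resamples drawn with replacement within each hospital."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        b1 = rng.choices(sample_1, k=len(sample_1))
        b2 = rng.choices(sample_2, k=len(sample_2))
        reps.append(theta(b1, b2))
    mean = sum(reps) / n_boot
    return (sum((r - mean) ** 2 for r in reps) / (n_boot - 1)) ** 0.5

# Hypothetical case lists for 2 hospitals (100 cases each).
hospital_a = ["A"] * 40 + ["B"] * 30 + ["C"] * 30
hospital_b = ["A"] * 20 + ["B"] * 50 + ["D"] * 30

theta_hat = theta(hospital_a, hospital_b)
se_boot = bootstrap_se(hospital_a, hospital_b)
print(round(theta_hat, 3), round(se_boot, 3))
```

The fixed seed makes the sketch reproducible; in practice the number of replicates would be chosen so that the SE is stable to the reported number of decimal places.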

For purposes of comparing procedures between groups, values of the similarity index

that are relatively large (≥0.8) are known, in part, from analyses of state discharge abstract data.6,26 For example, a hospital’s (j = 9) anesthesia department was concerned about patients leaving its small community to undergo surgery in a nearby large metropolitan area. The hospital’s similarity was compared with 134 other hospitals in the state.26 Figure 4 shows the 134 pairwise comparisons with the j = 9th hospital.26 The hospital to which it had the greatest similarity (and hence competition) was the one other hospital in its community:

. Thus, the principal competition was not the collective effect of many distant hospitals, as presumed, but was, instead, the unrecognized result of the surgeons having privileges at both of the local (similar) hospitals.26

Figure 4
Figure 4:
“Similarity indices comparing” the hospital “with [the] 145 other hospitals in [its] state that performed at least 1200 surgical procedures in the year.” The state from which the discharge abstract data were used was deliberately not listed.26 In the text of the article, we refer to counting cases of different procedures. In this figure, as in Ref. 26, we count procedures, each of a certain type. The types of procedures were classified using International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) procedure codes. The “hospitals are shown in ascending order based on the number of procedures performed in the year,” from 1203 at the far left to 34,063 at the far right. From the study, similarities of 0.8 and 0.3 are appropriate cutoffs for similarities considered exceptionally large or small.26 Only a single hospital, which was in the same county as the studied “hospital,” had a similarity ≥0.8. All similarity SEs were ≤0.03. Adapted from Figure 5 of Ref. 26.

Another example provided insight into what values of the similarity index θ are small (<0.3).26 At a large hospital (

), leadership perceived that it was competing for patients undergoing the same procedures with the small local community hospital (

).26 However, the 2 hospitals were quite dissimilar:

. The 2 hospitals had evolved different practice mixes with different procedures26 (i.e., were not competing for patients undergoing the same procedures).60

Data from the

hospital were used to evaluate the sensitivity of the similarity index to rare procedures (i.e., those for which observed frequencies may underestimate true frequencies in the population).26 The hospital was compared pairwise with each of the 48 comparably sized hospitals in its state.26 For every pair of hospitals, omission of the most common procedure that both hospitals shared produced the greatest decrease in the similarity index.26 Omission of each of the 3 most common procedures produced the greatest changes in the similarity index for three quarters of the 48 comparisons.26 In contrast, omission of each of the 3 least common procedures produced no changes in the similarity index to 4 decimal places. Thus, procedures that are both common and shared between hospitals influence the similarity index θ, whereas rare procedures have practically no influence.26 These findings guide the interpretation of the results in the next section, which uses different types of data, all coming from within a single hospital.
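The omission analysis described above can be sketched as a leave-one-procedure-out loop. The counts, procedure labels, and function names below are hypothetical, chosen only to show why a rare shared procedure barely moves the index whereas a common shared procedure moves it substantially.

```python
def similarity_index(counts_1, counts_2):
    """Similarity index from two dicts mapping procedure -> number of cases."""
    n1, n2 = sum(counts_1.values()), sum(counts_2.values())
    procedures = set(counts_1) | set(counts_2)
    p = {k: counts_1.get(k, 0) / n1 for k in procedures}
    q = {k: counts_2.get(k, 0) / n2 for k in procedures}
    shared = sum(p[k] * q[k] for k in procedures)
    return shared / (sum(v * v for v in p.values())
                     + sum(v * v for v in q.values()) - shared)

def without(counts, procedure):
    """Copy of the counts with one procedure omitted."""
    return {k: v for k, v in counts.items() if k != procedure}

# Hypothetical hospitals: 3 common shared procedures and 1 rare shared procedure.
hospital_a = {"tonsillectomy": 400, "myringotomy": 300, "hernia repair": 200,
              "rare shared procedure": 1}
hospital_b = {"tonsillectomy": 350, "myringotomy": 100, "hernia repair": 250,
              "rare shared procedure": 1, "rare unshared procedure": 2}

baseline = similarity_index(hospital_a, hospital_b)
changes = {
    procedure: similarity_index(without(hospital_a, procedure),
                                without(hospital_b, procedure)) - baseline
    for procedure in set(hospital_a) & set(hospital_b)
}
for procedure, change in sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{procedure:24s} {change:+.4f}")
```

Omitting a procedure changes every remaining proportion, so the index is recomputed from the reduced counts rather than adjusted term by term.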

Comparisons of Similarity Indices, Based on Data from Within a Single Hospital

Comparison was made between anesthesia providers’ cases in which there was or was not at least 1 break or handoff (Fig. 5). With 60 weeks of data from hospital


procedures were classified based on the primary surgical CPT codes. We compared the procedures of the 7315 cases that had at least 1 break (or relief) versus the procedures of the 10,141 cases that did not have a break or relief (i.e., the entire case took place with 1 anesthesia provider). Logically, cases of brief duration were more likely to have 1 anesthesia provider (i.e., be among the 10,141 cases without a break). The most common procedure overall was electroconvulsive therapy,h accounting for just 0.2% of cases with breaks versus 5.5% of those without breaks. The second most common procedure in Figure 5 was extracapsular cataract removal with insertion of intraocular lens prosthesis. Again, this is a brief procedure.h Consequently, the procedure accounted for 0.6% of cases with breaks and 4.6% of those without breaks. The same relationship applied to the third most common procedure, follicle puncture for oocyte retrieval, as well as the fourth, fifth, and sixth procedures. The seventh most common procedure (laparoscopic cholecystectomy) and all others each accounted for <1.0% of all cases. Given that the 6 most common procedures had substantial heterogeneity between cases with and without at least 1 break (e.g., because of anesthetic durations), there was small26 (<0.3) similarity of procedures between cases with breaks versus without breaks:

(Fig. 5).

Figure 5
Figure 5:
Distribution among different procedures of anesthesia providers’ cases with and without breaks or handoffs. The different procedures are classified by the primary surgical Current Procedural Terminology code. The figure includes the 208 of 2132 procedures accounting for at least 0.1% of all cases at the hospital. This results in the inclusion of 54.7% of the 7315 cases with at least 1 break (or relief) and 62.7% of the 10,141 without. The line in green shows the 1:1 relationship (i.e., equal percentages among cases with and without breaks). Scatter in the figure displays graphically that the similarity is small26 (<0.3). Pay more attention to the procedures with incidences >1.0%, because the data are presented on a log scale.

Our focus in this section of the article is the comparison of similarity indices.k,30,61 The small26 similarity value of

seems reasonable looking at Figure 5. In contrast,

, a much greater value that suggests large26,60 similarity. Yue and Clayton7 previously showed the cause for this disparity.

gives unusually large values when shared procedures range in frequencies from very common to rare, which is precisely what holds for surgical procedures.6,25,35 This is one reason why we have been using

rather than

to assess similarity.26,30

Chao et al.62 developed a correction to the

for procedures performed 0 or 1 times in each or both of the groups compared (e.g., hospitals as for Fig. 4 or cases with versus without breaks as for Fig. 5). See Equation 22 in the Appendix. Chao et al.62 found that their “adjusted estimate [was] always higher than the corresponding unadjusted one because of the presence in sample pairs of observed, shared, rare species.” Because there were many rare surgical procedures at the

hospital,25,35 with the adjustment, the resulting estimate was greater: 0.99 vs

. As described earlier, the adjusted estimate is even less relevant for our specific problem, because there was not a good quantitative match (see preceding paragraph, Figure 5, and the section, above, Diversity of Procedures and Providers at Single Hospitals).

Just as for

, the Jaccard index (Equation 7) can overestimate the desired similarity, because both common and very rare procedures are treated equally, even though, for our perioperative applications, their abundances (frequencies) may differ >100-fold (Fig. 2). For the preceding data,

. That is not quite as large as

but still substantially greater than the more appropriate small26 value of


The other problem with the Jaccard index is what to do when all procedures are present in both groups, as then


. For example, among the cases performed by anesthesia providers, and with at least 1 break (or relief), the first (or only) break occurred before or within 12 minutes of the start of the surgical procedure (“early”) for 39.9% of the cases versus “late” for the remaining 59.1% of cases. The rationale for evaluating 12 minutes or less was from our previous study investigating when documentation of anesthetic events (e.g., of intubation and of drug administration) was complete.61,l For this example, each individual anesthesia provider was used in lieu of the specific procedure (Figs. 4 and 5). There were 50 anesthesia providers working at the hospital during the studied period. As shown in Figure 6, the similarity among anesthesia providers was large26,30:

. In other words, among cases with a break, there were only small differences among anesthesia providers in the timing of the break, early versus late. However, as shown by Figure 6, the relationship was not exact, unlike as expressed by

. Thus, the Jaccard index is not useful for this application because whether every anesthesia provider has at least 1 early break is essentially irrelevant.
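The contrast between the 2 indices when every item is present in both groups can be sketched as follows. The provider labels and counts are hypothetical, and the function names are illustrative: the presence/absence index equals exactly 1, whereas the proportion-based index reflects how differently the breaks are distributed.

```python
def jaccard_index(counts_1, counts_2):
    """Shared items divided by items present in either group (presence/absence only)."""
    present_1 = {k for k, v in counts_1.items() if v > 0}
    present_2 = {k for k, v in counts_2.items() if v > 0}
    return len(present_1 & present_2) / len(present_1 | present_2)

def similarity_index(counts_1, counts_2):
    """Proportion-based similarity index from two dicts of counts."""
    n1, n2 = sum(counts_1.values()), sum(counts_2.values())
    keys = set(counts_1) | set(counts_2)
    p = {k: counts_1.get(k, 0) / n1 for k in keys}
    q = {k: counts_2.get(k, 0) / n2 for k in keys}
    shared = sum(p[k] * q[k] for k in keys)
    return shared / (sum(v * v for v in p.values())
                     + sum(v * v for v in q.values()) - shared)

# Hypothetical counts of first (or only) breaks per anesthesia provider.
early_breaks = {"provider A": 40, "provider B": 35, "provider C": 10, "provider D": 8}
late_breaks = {"provider A": 30, "provider B": 60, "provider C": 45, "provider D": 13}

jaccard = jaccard_index(early_breaks, late_breaks)
theta = similarity_index(early_breaks, late_breaks)
print(jaccard, round(theta, 3))
```

Because every provider has at least 1 early and 1 late break, the presence/absence index carries no information here, which is the limitation described in the text.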

Figure 6
Figure 6:
Comparison of the timing of breaks among anesthesia providers. There were 50 anesthesia providers working at the hospital during the studied period: Sunday, December 1, 2013, through Saturday, January 24, 2015. Each symbol represents 1 anesthesia provider. Along the horizontal axis is the number of first (or only) breaks during cases started by the anesthesia provider and for which the break was initiated before surgical incision or within 12 minutes of incision. Along the vertical axis is the count of the other first (or only) breaks during the case. The data of the 3 anesthesia providers enclosed by the dotted square are the outliers described in the text. The 7 anesthesia providers who performed more than half of their year’s cases at the ambulatory facility are shown in red. At the level of the individual case, anesthetic duration, facility, and time of day were not predictive for many cases. Specifically, by classification tree, using multiple binary endpoint criteria, none of these 3 variables was selected. Thus, the apparent predictive effect of facility seems to be a pattern associated with routinely working at the facility, and less so something at the level of the individual case. When repeating the analysis by anesthesia provider, as in the figure, facility was more predictive than mean duration among the anesthetist’s cases.

Yue and Clayton7 developed an unbiased estimator of θ, which is an alternative to the nonparametric maximum likelihood estimator of Equation 9. See Equation 23 in the Appendix. We use the maximum likelihood estimator, because there is no corresponding analytical value for the SE of the unbiased estimator, and the differences are minor. Comparing the procedures of cases with and without a break, the

. The corresponding unbiased estimate was 0.26. Comparing anesthesia providers with breaks before or within 12 minutes of the start of the surgical procedure versus afterward, the

. The corresponding unbiased estimate was 0.95. See the Appendix for an explanation of why the unbiased estimators are greater.

All the measures of similarity that we considered as alternatives to θ resulted in greater estimates than those of $\hat{\theta}$ (Equation 9) for the above applications. Although we do not routinely use these alternative measures for the reasons described earlier, we remain cognizant that $\hat{\theta}$ may underestimate similarity. Furthermore, we emphasize that the 2 other indices have advantages for ecologic settings. Our groups shown were hospitals (Fig. 4), presence of breaks (handoffs) (Fig. 5), and timing of breaks (handoffs; Fig. 6). In contrast, in ecology, a group is typically a geographic location where species of animals are counted, and what species will be present is often unknown ahead of time. Yet, for our problems, we know every anesthesia provider who works at the hospital. In addition, if 1 or 2 of the anesthesia providers were handed a CPT manual, they likely could identify with only minimal error what procedures are performed at the hospital, and the few procedures that are the most common. However, if asked for the specific percentages of the cases with breaks that are accounted for by each of the 3 most common procedures, the estimates probably would be inaccurate (see Diversity Assessed with Weighting for Differences Among Procedures and Providers). Our interests in similarity are the quantitative relationships because operational activities, management monitoring, and hospital competition depend on the numbers.

Sensitivity Analyses to Interpret an Observed Value of the Similarity Index θ

Figure 7
Figure 7:
Results of Figure 5 presented weighted by number of anesthesia minutes rather than number of cases. See the legend of Figure 5 for explanation.

What matters for managerial decision making is the magnitude of similarity and investigation of outliers.26 As explained earlier, for comparisons of hospitals, omission of each of the 3 most common procedures produced the largest changes in the similarity index for three quarters of the 48 comparisons, and changing the least common procedures had no measurable effect.26 Applying this approach to the study, above, of similarity in the distributions among anesthesia providers of cases with breaks before or within 12 minutes of the start of the surgical procedure versus afterward, exclusion of the 3 anesthesia providers who received the fewest breaks resulted in the

, a negligible change of 0.0001. They were added back. Next, we considered the 3 anesthesia providers who received the largest number of breaks. They are the 3 (large) outliers to the right in Figure 6, shown within the dotted square. Excluding each of the 3 in sequence, from largest to third largest total number of cases with breaks, resulted in

changing to


, and

, respectively. Excluding all 3 outlier anesthesia providers increased the similarity to

. Repeating the analysis of all 50 anesthesia providers, but using sums of anesthesia minutes for the cases with breaks instead of counts of cases with breaks,

(i.e., no difference from the sensitivity analysis). For our other analysis using procedures (Fig. 5),

. Three procedures accounted for the most cases: electroconvulsive therapy (0.2%, 5.5%), extracapsular cataract removal with insertion of intraocular lens prosthesis (0.6%, 4.6%), and follicle puncture for oocyte retrieval (0.1%, 3.5%). When each was excluded, the


, and

, respectively. When all 3 were excluded,

. Repeating the analysis with all procedures, but using sums of anesthesia minutes rather than cases,

. This is shown in Figure 7, but again is no different from the sensitivity analysis (i.e., we do not consider the analysis by count of cases to be a major limitation because sensitivity analyses are performed).
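Repeating an analysis with sums of anesthesia minutes instead of counts of cases only changes the weights that enter the index. The sketch below assumes hypothetical case records of the form (procedure, had a break, anesthesia minutes); the function names and all values are illustrative and are not the study data.

```python
from collections import defaultdict

def group_weights(records, had_break, use_minutes):
    """Total weight per procedure for cases with (or without) a break."""
    totals = defaultdict(float)
    for procedure, case_had_break, minutes in records:
        if case_had_break == had_break:
            totals[procedure] += minutes if use_minutes else 1.0
    return dict(totals)

def similarity_index(weights_1, weights_2):
    """Similarity index after normalizing each group's weights to proportions."""
    t1, t2 = sum(weights_1.values()), sum(weights_2.values())
    keys = set(weights_1) | set(weights_2)
    p = {k: weights_1.get(k, 0.0) / t1 for k in keys}
    q = {k: weights_2.get(k, 0.0) / t2 for k in keys}
    shared = sum(p[k] * q[k] for k in keys)
    return shared / (sum(v * v for v in p.values())
                     + sum(v * v for v in q.values()) - shared)

# Hypothetical records: (procedure, had at least 1 break, anesthesia minutes).
records = (
    [("electroconvulsive therapy", False, 20)] * 55
    + [("electroconvulsive therapy", True, 25)] * 2
    + [("cataract removal", False, 38)] * 46
    + [("cataract removal", True, 45)] * 6
    + [("craniotomy", True, 300)] * 30
    + [("craniotomy", False, 240)] * 5
    + [("laparoscopic cholecystectomy", True, 120)] * 10
    + [("laparoscopic cholecystectomy", False, 100)] * 10
)

theta_by_cases = similarity_index(group_weights(records, True, False),
                                  group_weights(records, False, False))
theta_by_minutes = similarity_index(group_weights(records, True, True),
                                    group_weights(records, False, True))
print(round(theta_by_cases, 3), round(theta_by_minutes, 3))
```

The same function serves both analyses because the index depends only on each group's proportions, whatever the unit of the weights.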


Evaluating Care at Handoffs

Potentially, an anesthesia provider’s quality of setup, documentation, etc., could be evaluated by another anesthesia provider who provides a break or handoff. However, anesthesia setup, documentation, etc., depend on the procedure. Furthermore, the incidence of breaks differs among procedures, whether analyzed by case (

, Fig. 5) or by hour (

, Fig. 7). Thus, when assessing the provider’s quality of setup, etc., the procedure needs to be included in the statistical analysis. As shown by Figures 1 and 3, what makes a comprehensive hospital comprehensive is the substantial diversity of procedures performed, especially among physiologically complex procedures.6,18 Consequently, at such hospitals, a sample size sufficient to estimate the anesthesia provider effect (i.e., the provider’s quality of setup, etc.) cannot be achieved while controlling for the procedure. These observations may be useful, because 2 institutions found that each increase in the number of handoffs was associated with greater morbidity.63,64


Estimates for Herfindahl Index


Let $n_j$ refer to the number of cases performed at the $j$th facility during the observation period (i.e., the sample size). Let $X_{jk}$ specify the procedure of the $k$th case at the $j$th facility, $k = 1, \ldots, n_j$. Let $i$ refer to the $i$th procedure, $i = 1, \ldots, S$, where $S$ is the total number of procedures or combinations of procedures (e.g., from the corresponding dictionary). Finally, let the indicator $I(\cdot)$ equal 1 if the value in the expression is true, and 0 otherwise. Then, the observed proportion of cases at the $j$th facility that are of the $i$th procedure is as follows:

$$\hat{p}_{ij} = \frac{1}{n_j} \sum_{k=1}^{n_j} I(X_{jk} = i) \qquad (10)$$

In practice, data come in the form of $X_{jk}$ and the array (vector) operations of Equation 10 need to be performed.26
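The array operation amounts to counting how many cases carry each procedure and dividing by the number of cases. A minimal sketch, assuming a hypothetical list with 1 procedure label per case and a hypothetical procedure dictionary:

```python
from collections import Counter

def observed_proportions(case_procedures, dictionary):
    """Observed proportion of cases for every procedure in the dictionary."""
    n = len(case_procedures)            # sample size for the facility
    counts = Counter(case_procedures)   # number of cases of each procedure
    # Procedures in the dictionary that were never performed get a proportion of 0.
    return {procedure: counts[procedure] / n for procedure in dictionary}

# Hypothetical facility with 10 cases and a 4-procedure dictionary.
cases = ["hernia"] * 5 + ["tonsillectomy"] * 3 + ["myringotomy"] * 2
dictionary = ["hernia", "tonsillectomy", "myringotomy", "cataract"]
p_hat = observed_proportions(cases, dictionary)
print(p_hat)
```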

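Using the first-order Taylor series expansion (i.e., Delta method),m the SE for the maximum likelihood estimate for the Herfindahl, $\hat{H}_j = \sum_{i=1}^{S} \hat{p}_{ij}^{2}$ (Equation 2), equals6,7,65:

$$\mathrm{SE}(\hat{H}_j) = 2 \sqrt{\frac{\sum_{i=1}^{S} \hat{p}_{ij}^{3} - \hat{H}_j^{2}}{n_j}} \qquad (11)$$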

From Appendix B of the study by Taplin,8 a more accurate estimate from using higher order terms equals:



However, in the section Herfindahl, Gini-Simpson Index, and their SEs, we show that, for our applications, the differences between SEs calculated using Equations 11 and 12 are in the fourth or fifth decimal place (i.e., negligible). Using the first-order Taylor series expansion,m the SE of the inverse of the Herfindahl is approximately equal to Equation 11 or 12 divided by the square of the estimated Herfindahl.

The minimum variance unbiased estimator for the Herfindahl equals5,65:

$$\frac{n_j \sum_{i=1}^{S} \hat{p}_{ij}^{2} - 1}{n_j - 1}$$

The corresponding estimate of the diversity of order 2 is the inverse of this estimator.
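The estimators discussed in this part of the Appendix can be collected in one short function. This is a sketch under stated assumptions: the SE is the standard first-order Delta method result for a sum of squared multinomial proportions, the unbiased estimator is written in its equivalent count form, and the function name and counts are hypothetical.

```python
import math

def herfindahl_summary(counts):
    """Herfindahl index, its first-order SE, and related estimates from case counts."""
    n = sum(counts)
    props = [x / n for x in counts]
    h = sum(p ** 2 for p in props)
    # First-order (Delta method) SE of the sum of squared proportions.
    se = 2 * math.sqrt((sum(p ** 3 for p in props) - h ** 2) / n)
    # Unbiased estimator: probability that 2 cases drawn without replacement match.
    h_unbiased = sum(x * (x - 1) for x in counts) / (n * (n - 1))
    return {
        "herfindahl": h,
        "se": se,
        "herfindahl_unbiased": h_unbiased,
        "diversity_order_2": 1 / h,
        "se_diversity_order_2": se / h ** 2,  # Delta method for the inverse
    }

# Hypothetical facility: 3 procedures with 50, 30, and 20 cases.
summary = herfindahl_summary([50, 30, 20])
for name, value in summary.items():
    print(f"{name:22s} {value:.4f}")
```

For these counts, the Herfindahl is 0.38 and the unbiased estimate is 37/99, slightly smaller, as expected whenever more than 1 procedure is observed.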

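The measures of diversity of order $q$ (i.e., “Hill numbers”) are given by5,66:

$$\left( \sum_{i=1}^{S} p_{ij}^{q} \right)^{1/(1-q)}, \quad q \neq 1$$

To obtain the measure of diversity of order $q = 1$, start with the Shannon entropy of the procedures at the $j$th facility:

$$-\sum_{i=1}^{S_j} p_{ij} \ln p_{ij}$$

The summation is from $i = 1$ to $S_j$ (number of procedures observed at the $j$th hospital) and not to $S$ (total number of procedures such as from a dictionary) to avoid the undefined $\ln(0)$. The diversity of order 1 is the exponential of the Shannon entropy.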
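A short sketch of the diversity of order q, using the standard Hill number definition with observed proportions plugged in. The function name and counts are hypothetical; procedures with 0 cases are skipped so that the logarithm of 0 is never evaluated.

```python
import math

def hill_number(counts, q):
    """Diversity of order q from case counts (plug-in estimate)."""
    n = sum(counts)
    props = [x / n for x in counts if x > 0]  # observed procedures only
    if q == 1:
        # Order 1: exponential of the Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1 / (1 - q))

# Hypothetical facility: 3 observed procedures and 1 dictionary entry never performed.
counts = [50, 30, 20, 0]
for order in (0, 1, 2):
    print(order, round(hill_number(counts, order), 3))
```

Order 0 counts the observed procedures, order 2 is the inverse of the Herfindahl, and order 1 lies between them; when all observed procedures are equally common, every order gives the number of observed procedures.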

A low bias estimator for

is given by Gotelli and Chao’s Equation 25b as follows5:

Estimate for the Similarity Index and its SE

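Repeating Equation 8, from above,

$$\theta = \frac{\sum_{i=1}^{S} p_{i1} p_{i2}}{\sum_{i=1}^{S} p_{i1}^{2} + \sum_{i=1}^{S} p_{i2}^{2} - \sum_{i=1}^{S} p_{i1} p_{i2}}$$

The minimum possible value of θ is obtained when the numerator equals 0, when no procedures are present at both facilities (i.e., $p_{i1} p_{i2} = 0$ for all $i$). The maximum of θ is obtained when $p_{i1} = p_{i2}$ for all $i$. To understand why no greater value of θ can be obtained, start with $\sum_{i=1}^{S} (p_{i1} - p_{i2})^{2} \geq 0$. Expanding terms implies that the difference between the denominator and the numerator of Equation 8 is ≥0, implying that the numerator is no greater than the denominator.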

From Comparisons of Similarity Indices, Based on Data from Within a Single Hospital, there were comparisons of




. However, one should not draw the conclusion that the relationships among


, and

are necessarily


. Such does indeed hold for


. However, consider


. Then,



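The estimator $\hat{\theta}$ (Equation 9) is asymptotically normally distributed.7 Let6,7

$$\hat{H}_1 = \sum_{i=1}^{S} \hat{p}_{i1}^{2} \quad \text{and} \quad \hat{H}_2 = \sum_{i=1}^{S} \hat{p}_{i2}^{2}$$

These are the nonparametric maximum likelihood estimators for the Herfindahl index of the 1st and 2nd hospitals. In addition, let

$$\hat{C} = \sum_{i=1}^{S} \hat{p}_{i1} \hat{p}_{i2}$$

Then, the nonparametric maximum likelihood estimator for θ is $\hat{\theta} = \hat{C} / (\hat{H}_1 + \hat{H}_2 - \hat{C})$. The estimate of its SE is the square root of its asymptotic variance6,7: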


Chao et al.’s Correction of n for Rare Procedures (that is, Species)

We rewrite Equation 6 as follows:

By using the notation from Chao et al.,62 let

“be the observed number of shared species” (e.g., procedures) “that occur” precisely “once” in the first group. “These species must be present in the” second group, “but may have any frequency.”62 “Let

be the observed number of shared species that occur twice in” the first group.62 Similarly, “define


to be the observed number of shared species that occur, respectively, once” and twice in the second group.62 “When


,” replace “




, respectively.”62 Doing so was necessary for our example with 50 anesthesia providers, because the minimum number of cases among the 50 anesthesia providers was 8 for the first group and 13 for the second (i.e.,

). Then, substitute the following into Equation 22 as follows62:


If the value of


is > 1, which can happen, set it equal to 1.62

Yue and Clayton’s Unbiased Estimator of Similarity (Equation 8)

Yue and Clayton’s7 nonparametric maximum likelihood estimator

in Equation 9 is consistent but biased. The (small) bias arises because the estimate in Equation 10,

is a biased estimate of

. The unbiased sample variance is obtained by multiplying by

.n Yue and Clayton7 suggest the unbiased estimate:

The same substitution is made for

. The adjusted estimates are used in Equation 8. However, Yue and Clayton7 do not provide an analytical expression for the SE of the resulting adjusted estimate of

. For our data, the unbiased estimator of similarity is greater than the corresponding nonparametric maximum likelihood estimator of

. The


are much > 1. Thus, the denominators of


are effectively the same. However, the numerator is smaller by subtracting the 1. Because


are smaller and appear in the denominator of Equation 9 for

, the consequence is that the unbiased estimator is greater. For further consideration, see the corresponding section in the article: Comparisons of Similarity Indices, Based on Data from Within a Single Hospital.
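The direction of the adjustment can be checked numerically. The sketch below is an illustration and not the authors' code: it assumes that the unbiased estimate of each squared proportion is the count times the count minus 1, divided by the sample size times the sample size minus 1, and that only the 2 sums of squares in the denominator are adjusted. The counts and names are hypothetical.

```python
def similarity_estimates(counts_1, counts_2):
    """Return (maximum likelihood estimate, adjusted estimate) of the similarity index."""
    n1, n2 = sum(counts_1.values()), sum(counts_2.values())
    keys = set(counts_1) | set(counts_2)
    # Cross-product term: unchanged by the adjustment.
    cross = sum(counts_1.get(k, 0) * counts_2.get(k, 0) for k in keys) / (n1 * n2)
    # Plug-in sums of squared proportions.
    h1_mle = sum(x * x for x in counts_1.values()) / n1 ** 2
    h2_mle = sum(x * x for x in counts_2.values()) / n2 ** 2
    # Adjusted sums: each numerator is smaller by subtracting the 1.
    h1_adj = sum(x * (x - 1) for x in counts_1.values()) / (n1 * (n1 - 1))
    h2_adj = sum(x * (x - 1) for x in counts_2.values()) / (n2 * (n2 - 1))
    mle = cross / (h1_mle + h2_mle - cross)
    adjusted = cross / (h1_adj + h2_adj - cross)
    return mle, adjusted

# Hypothetical counts of cases by procedure for 2 groups.
with_breaks = {"A": 12, "B": 30, "C": 8}
without_breaks = {"A": 40, "B": 5, "D": 15}
mle, adjusted = similarity_estimates(with_breaks, without_breaks)
print(round(mle, 3), round(adjusted, 3))
```

Because the adjusted sums of squares are smaller and sit in the denominator, the adjusted estimate is the larger of the 2, and the gap shrinks as the sample sizes grow.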


Name: Franklin Dexter, MD, PhD.

Contribution: This author helped design the study, conduct the study, analyze the data, and write the manuscript.

Attestation: Franklin Dexter has approved the final manuscript.

Name: Johannes Ledolter, PhD.

Contribution: This author helped analyze the data and prepare the manuscript.

Attestation: Johannes Ledolter has approved the final manuscript.

Name: Bradley J. Hindman, MD.

Contribution: This author helped conduct the study and write the manuscript.

Attestation: Bradley J. Hindman has approved the final manuscript.


Dr. Franklin Dexter is the Statistical Editor and Section Editor for Economics, Education, and Policy for Anesthesia & Analgesia. This manuscript was handled by Dr. Steven L. Shafer, Editor-in-Chief, and Dr. Dexter was not involved in any way with the editorial process or decision.


The authors thank Jennifer Espy, BFA, who assisted in editing, and David Griffiths, BS, who assisted in computer programming.


a Accessed May 24, 2015.

b The data are more complicated, because counts of procedures at each hospital are what were available, each procedure being of a specified type of procedure.6 Some surgical cases include more than 1 procedure (e.g., myringotomy tube insertion bilaterally is 2 procedures both of the same type). For simplicity of wording in the current article, we refer to “procedure” as the type of procedure and to the count of procedures as “cases.” The sole implication for the current article is that the listed sample sizes for Figure 1 exceed the number of cases. This has no influence on the current article other than briefer wording and similarity with the new applications presented.

c Throughout this article, at facilities j = 4 and j = 5, the phrase “anesthesia providers” refers to Certified Registered Nurse Anesthetists. At the facilities, the anesthesia providers bill independently. However, for all cases, a faculty anesthesiologist provides clinical oversight directed toward assuring the quality of clinical care.9–13 In contrast, for Figure 2, an anesthesia provider could be any type including anesthesia resident or anesthesiologist.1,2

d The facilities in this article overlap. For example, the j = 5th hospital’s data includes that of the ambulatory surgery center, j = 4. We use the numbering for convenience so that we do not need to introduce the sample sizes and population repeatedly. The numbering is essentially that of sequential examples.

e The data are from December 1, 2013, through January 24, 2015. The 7 eight-week periods are from December 29, 2013, through January 24, 2015. The American Society of Anesthesiologists’ 2014 Crosswalk has base units for 5426 different CPT codes.

f Throughout the article, when we refer to anesthesia providers, our counts refer to cases started by an anesthesia provider. Cases are, of course, usually finished by an anesthesia provider as well, but we count starts. The classification by anesthesia providers is by the anesthesia provider starting the case.

g The estimated evenness factor equals 79.3%, where 0.793 equals 39.67 divided by the population size of S = 50 anesthesia providers.19 The estimation of evenness is reliable for that calculation because there are no very rare “species” (i.e., no anesthesia provider performing only a few cases). The estimate of evenness is unreliable (i.e., highly sensitive to rare procedures)27 for counts of procedures, because many procedures and combinations of procedures are rare, as shown in Figure 2.

h The procedures’ anesthesia times (mean ± SE) were 19.9 ± 0.2 minutes for electroconvulsive therapy (n = 576), 37.6 ± 0.8 minutes for cataract extraction (n = 510), and 38.8 ± 0.5 minutes for oocyte retrieval (n = 360). However, it was not that the brevity accounted for the differences in the Herfindahl indices between groups, because the analysis used only the counts of cases. Rather, there were many such cases of the procedures because they were brief. Other procedures were brief but were performed less often: ventilating tube removal (69424) 24.6 ± 2.1 minutes (n = 13), removal of nonbiodegradable drug delivery implant (11983) 30.2 ± 2.1 minutes (n = 10), and manipulation of knee joint under general anesthesia (27570) 37.5 ± 2.4 minutes (n = 17).

i,, and Accessed June 6, 2015.

j Accessed June 6, 2015.

k For readers intrigued by the issue of breaks, among cases with at least 1 break, the similarity of procedures was moderately large26,30 between cases with the first (or only) break occurring before or within 12 minutes of the start of the surgical procedure and cases with the first (or only) break occurring later.a The anesthetic durations were 2.76 ± 0.04 hours among cases with the break early in the case versus 2.81 ± 0.03 hours among cases with the break >12 minutes61 after surgery began. The area under the receiver operating characteristic curve was only 0.51 (P = 0.19 by Wilcoxon Mann-Whitney U test). Thus, it was not that breaks were more commonly given early in the case among briefer procedures.

l By using anesthesia information management systems data, we evaluated the times of entries of comments, drugs, fluids, and periodic assessments (e.g., electrocardiogram diagnosis and train-of-four).61 Essentially, we counted the timing of mouse clicks. More than half of ongoing cases had completed initial documentation by 13 minutes.

m Accessed May 13, 2015.

n Accessed April 25, 2015.


1. Dexter F, Traub RD, Fleisher LA, Rock P. What sample sizes are required for pooling surgical case durations among facilities to decrease the incidence of procedures with little historical data? Anesthesiology. 2002;96:1230–6
2. Dexter F, Macario A. What is the relative frequency of uncommon ambulatory surgery procedures performed in the United States with an anesthesia provider? Anesth Analg. 2000;90:1343–7
3. Dexter F, Ledolter J, Davis E, Witkowski TA, Herman JH, Epstein RH. Systematic criteria for type and screen based on procedure’s probability of erythrocyte transfusion. Anesthesiology. 2012;116:768–78
4. Epstein RH, Dexter F. Management implications for the perioperative surgical home related to inpatient case cancellations and add-on case scheduling on the day of surgery. Anesth Analg. 2015;121:206–18
5. Gotelli NJ, Chao A. Measuring and estimating species richness, species diversity, and biotic similarity from sampling data. In: Levin SA, ed. Encyclopedia of Biodiversity. 2nd ed. Waltham, MA: Academic Press; 2013:195–211
6. Dexter F, Wachtel RE, Yue JC. Use of discharge abstract databases to differentiate among pediatric hospitals based on operative procedures: surgery in infants and young children in the state of Iowa. Anesthesiology. 2003;99:480–7
7. Yue JC, Clayton MK. A similarity measure based on species proportions. Comm Stat Theor Meth. 2005;34:2123–31
8. Taplin RH. Harmony, statistical inference with the Herfindahl H index and C index. Abacus. 2003;39:82–94
9. Dexter F, Logvinov II, Brull SJ. Anesthesiology residents’ and nurse anesthetists’ perceptions of effective clinical faculty supervision by anesthesiologists. Anesth Analg. 2013;116:1352–5
10. Dexter F, Ledolter J, Smith TC, Griffiths D, Hindman BJ. Influence of provider type (nurse anesthetist or resident physician), staff assignments, and other covariates on daily evaluations of anesthesiologists’ quality of supervision. Anesth Analg. 2014;119:670–8
11. Dexter F, Ledolter J, Hindman BJ. Bernoulli Cumulative Sum (CUSUM) control charts for monitoring of anesthesiologists’ performance in supervising anesthesia residents and nurse anesthetists. Anesth Analg. 2014;119:679–85
12. Dexter F, Masursky D, Hindman BJ. Reliability and validity of the anesthesiologist supervision instrument when certified registered nurse anesthetists provide scores. Anesth Analg. 2015;120:214–9
13. Dexter F, Hindman BJ. Quality of supervision as an independent contributor to an anesthesiologist’s individual clinical value. Anesth Analg. 2015;121:507–13
14. Smallman B, Dexter F. Optimizing the arrival, waiting, and NPO times of children on the day of pediatric endoscopy procedures. Anesth Analg. 2010;110:879–87
15. McLemore T, Lawrence L. Plan and operation of the National Survey of Ambulatory Surgery. Vital and Health Statistics. Series 1, No. 37. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics; 1997:109–10
16. Dexter F, Dexter EU, Ledolter J. Influence of procedure classification on process variability and parameter uncertainty of surgical case durations. Anesth Analg. 2010;110:1155–63
17. Owings MF, Kozak LJ. Ambulatory and inpatient procedures in the United States, 1996. Vital and Health Statistics. Series 13, No. 132. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics; 1998:33–113
18. Kanter RK, Dexter F. Criteria for identification of comprehensive pediatric hospitals and referral regions. J Pediatr. 2005;146:26–9
19. Gosselin F. An assessment of the dependence of evenness indices on species richness. J Theor Biol. 2006;242:591–7
20. Jost L. Entropy and diversity. Oikos. 2006;113:363–74
21. Keylock CJ. Simpson diversity and the Shannon-Wiener index as special cases of a generalized entropy. Oikos. 2005;109:203–7
22. Basharin GP. On a statistical estimate for the entropy of a sequence of independent random variables. Theor Probab Appl. 1959;4:333–6
23. Lande R. Statistics and partitioning of species diversity, and similarity among multiple communities. Oikos. 1996;76:5–13
24. Chao A, Shen TJ. Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample. Environ Ecol Stat. 2003;10:429–43
25. Wachtel RE, Dexter F, Lubarsky DA. Financial implications of a hospital’s specialization in rare physiologically complex surgical procedures. Anesthesiology. 2005;103:161–7
26. Wachtel RE, Dexter F, Barry B, Applegeet C. Use of state discharge abstract data to identify hospitals performing similar types of operative procedures. Anesth Analg. 2010;110:1146–54
27. Jost L. The relation between evenness and diversity. Diversity. 2010;2:207–32
28. Hoffmann S, Hoffmann A. Is there a “true” diversity? Ecol Econ. 2008;65:213–5
29. Botta-Dukát Z. Rao’s quadratic entropy as a measure of functional diversity based on multiple traits. J Veg Sci. 2005;16:533–40
30. Wachtel RE, Dexter EU, Dexter F. Application of a similarity index to state discharge abstract data to identify opportunities for growth of surgical and anesthesia practices. Anesth Analg. 2007;104:1157–70
31. O’Neill L, Dexter F. Tactical increases in operating room block time based on financial data and market growth estimates from data envelopment analysis. Anesth Analg. 2007;104:355–68
32. O’Neill L, Dexter F, Wachtel RE. Should anesthesia groups advocate funding of clinics and scheduling systems to increase operating room workload? Anesthesiology. 2009;111:1016–24
33. Dexter F, Thompson E. Relative value guide basic units in operating room scheduling to ensure compliance with anesthesia group policies for surgical procedures performed at each anesthetizing location. AANA J. 2001;69:120–3
34. Dexter F, Macario A, Penning DH, Chung P. Development of an appropriate list of surgical procedures of a specified maximum anesthetic complexity to be performed at a new ambulatory surgery facility. Anesth Analg. 2002;95:78–82
35. Wachtel RE, Dexter F. Differentiating among hospitals performing physiologically complex operative procedures in the elderly. Anesthesiology. 2004;100:1552–61
36. Scurlock C, Dexter F, Reich DL, Galati M. Needs assessment for business strategies of anesthesiology groups’ practices. Anesth Analg. 2011;113:170–4
37. Zhou J, Dexter F, Macario A, Lubarsky DA. Relying solely on historical surgical times to estimate accurately future surgical times is unlikely to reduce the average length of time cases finish late. J Clin Anesth. 1999;11:601–5
38. Macario A, Dexter F. Estimating the duration of a case when the surgeon has not recently scheduled the procedure at the surgical suite. Anesth Analg. 1999;89:1241–5
39. Dexter F, Ledolter J. Bayesian prediction bounds and comparisons of operating room times even for procedures with few or no historic data. Anesthesiology. 2005;103:1259–67
40. Dexter F, Yue JC, Dow AJ. Predicting anesthesia times for diagnostic and interventional radiological procedures. Anesth Analg. 2006;102:1491–500
41. Dexter F, Macario A, Ledolter J. Identification of systematic underestimation (bias) of case durations during case scheduling would not markedly reduce overutilized operating room time. J Clin Anesth. 2007;19:198–203
42. Dexter F, Epstein RH, Lee JD, Ledolter J. Automatic updating of times remaining in surgical cases using Bayesian analysis of historical case duration data and “instant messaging” updates from anesthesia providers. Anesth Analg. 2009;108:929–40
43. Dexter EU, Dexter F, Masursky D, Kasprowicz KA. Prospective trial of thoracic and spine surgeons’ updating of their estimated case durations at the start of cases. Anesth Analg. 2010;110:1164–8
44. Tiwari V, Dexter F, Rothman BS, Ehrenfeld JM, Epstein RH. Explanation for the near-constant mean time remaining in surgical cases exceeding their estimated duration, necessary for appropriate display on electronic white boards. Anesth Analg. 2013;117:487–93
45. Dexter F, Ledolter J, Tiwari V, Epstein RH. Value of a scheduled duration quantified in terms of equivalent numbers of historical cases. Anesth Analg. 2013;117:204–9
46. Dexter F, Epstein RH, Bayman EO, Ledolter J. Estimating surgical case durations and making comparisons among facilities: identifying facilities with lower anesthesia professional fees. Anesth Analg. 2013;116:1103–15
47. Zhou J, Dexter F. Method to assist in the scheduling of add-on surgical cases—upper prediction bounds for surgical case durations based on the log-normal distribution. Anesthesiology. 1998;89:1228–32
48. Dexter F, Traub RD, Qian F. Comparison of statistical methods to predict the time to complete a series of surgical cases. J Clin Monit Comput. 1999;15:45–51
49. Dexter F, Epstein RH, Traub RD, Xiao Y. Making management decisions on the day of surgery based on operating room efficiency and patient waiting times. Anesthesiology. 2004;101:1444–53
50. Wachtel RE, Dexter F. Reducing tardiness from scheduled start times by making adjustments to the operating room schedule. Anesth Analg. 2009;108:1902–9
51. Dexter F, Epstein RH, Lee JD, Ledolter J. Automatic updating of times remaining in surgical cases using Bayesian analysis of historical case duration data and “instant messaging” updates from anesthesia providers. Anesth Analg. 2009;108:929–40
52. Tiwari V, Dexter F, Rothman BS, Ehrenfeld JM, Epstein RH. Explanation for the near-constant mean time remaining in surgical cases exceeding their estimated duration, necessary for appropriate display on electronic white boards. Anesth Analg. 2013;117:487–93
53. Strum DP, Sampson AR, May JH, Vargas LG. Surgeon and type of anesthesia predict variability in surgical procedure times. Anesthesiology. 2000;92:1454–66
54. Eijkemans MJ, van Houdenhoven M, Nguyen T, Boersma E, Steyerberg EW, Kazemier G. Predicting the unpredictable: a new prediction model for operating room times using individual characteristics and the surgeon’s estimate. Anesthesiology. 2010;112:41–9
55. Shukla RK, Ketcham JS, Ozcan YA. Comparison of subjective versus data base approaches for improving efficiency of operating room scheduling. Health Serv Manage Res. 1990;3:74–81
56. Wright IH, Kooperberg C, Bonar BA, Bashein G. Statistical modeling to predict elective surgery time. Comparison with a computer scheduling system and surgeon-provided estimates. Anesthesiology. 1996;85:1235–45
57. Dexter F, Dexter EU, Masursky D, Nussmeier NA. Systematic review of general thoracic surgery articles to identify predictors of operating room case durations. Anesth Analg. 2008;106:1232–41
58. Kayiş E, Khaniyev TT, Suermondt J, Sylvester K. A robust estimation model for surgery durations with temporal, operational, and surgery team effects. Health Care Manag Sci. 2015;18:222–33
59. Yue JC, Clayton MK, Lin FC. A nonparametric estimator of species overlap. Biometrics. 2001;57:743–9
60. Dexter F, Wachtel RE, Sohn MW, Ledolter J, Dexter EU, Macario A. Quantifying effect of a hospital’s caseload for a surgical specialty on that of another hospital using multi-attribute market segments. Health Care Manag Sci. 2005;8:121–31
61. Epstein RH, Dexter F. Mediated interruptions of anaesthesia providers using predictions of workload from anaesthesia information management system data. Anaesth Intensive Care. 2012;40:803–12
62. Chao A, Chazdon RL, Colwell RK, Shen TJ. Abundance-based similarity indices and their estimation when there are unseen species in samples. Biometrics. 2006;62:361–71
63. Saager L, Hesler BD, You J, Turan A, Mascha EJ, Sessler DI, Kurz A. Intraoperative transitions of anesthesia care and postoperative adverse outcomes. Anesthesiology. 2014;121:695–706
64. Hyder JA, Bohman JK, Kor DJ, Subramanian A, Bittner EA, Narr BJ, Cima RR, Montori VM. Anesthesia care transitions and risk of postoperative complications. Anesth Analg. 2016:134–44
65. Simpson EH. Measurement of diversity. Nature. 1949;163:688
66. Hill MO. Diversity and evenness: a unifying notation and its consequences. Ecology. 1973;54:427–31
© 2016 International Anesthesia Research Society