*e.g.*, whether the patient responds to command, whether the patient can maintain adequate spontaneous ventilation, whether the patient shows a hemodynamic or somatic response to surgical stimulus). In this situation, a common technique of data analysis is logistic regression, in which the probability (*P*) of drug effect is evaluated as a function of the drug concentration in plasma or at the effect site (C), using the following equation:

$$P = \frac{C^{\gamma}}{C_{50}^{\gamma} + C^{\gamma}} \tag{1}$$

where C_{50} is the concentration at which the probability of drug effect is 50% and γ is a measure of the steepness of the concentration–effect curve.‡ Values of C_{50} and γ are estimated by expressing the logarithm of the likelihood of the observed results (*i.e.*, log likelihood) using equation 1 and then maximizing this with respect to C_{50} and γ. Logistic regression has been used for the analysis of the pharmacodynamics of inhaled and intravenous anesthetics,^{1–11} usually with a primary focus of determining values of C_{50} and γ. However, any statistical technique should provide not only parameter estimates but also measures of the accuracy of these estimates. Although typical statistical software programs provide confidence intervals for estimated parameters, because logistic regression is a nonlinear technique, these confidence intervals are only valid asymptotically as the number of data points (N) goes to infinity.^{12} In reality, most studies in the anesthesia literature involve relatively small numbers of data points. Therefore, it is often unclear how reliable parameter estimates are. The purpose of this study was to use simulations to analyze the relation between sample size and the accuracy of parameter estimates.
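As a rough illustration of how γ shapes the concentration–effect curve, the following Python sketch evaluates the probability of drug effect at a few concentrations. It assumes the relation has the form C^γ/(C_{50}^γ + C^γ), which is the expression used in the Methods to generate simulated responses; the function name `response_probability` and the example concentrations are illustrative choices, not part of the study.

```python
def response_probability(c, c50, gamma):
    """Probability of drug effect at concentration c (assumed form of the logistic relation)."""
    return c**gamma / (c50**gamma + c**gamma)


# Compare a shallow (gamma = 1) and a steep (gamma = 6) curve with C50 = 100.
for gamma in (1.0, 6.0):
    probs = [round(response_probability(c, 100.0, gamma), 3) for c in (50.0, 100.0, 200.0)]
    print(gamma, probs)
# 1.0 [0.333, 0.5, 0.667]
# 6.0 [0.015, 0.5, 0.985]
```

With the steeper curve, halving or doubling the concentration around C_{50} moves the probability almost all the way to 0 or 1.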

#### Methods


^{12} This model assumes that anesthetic effect is measured by an underlying continuous variable, denoted y, which is related to the drug concentration by the following equation:

$$y = \gamma \ln C - \gamma \ln C_{50} + x \tag{2}$$

where x is a random variable that is assumed to have a logistic distribution. The logistic distribution is described by

$$P(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} \tag{3}$$

(*P*(x) is the probability of a given value of x. This distribution has a mean of zero and a variance of π^{2}/3). The model is applicable to binary yes-or-no data because we assume that a positive drug effect is observed only if y > 0 or, equivalently, if x > −(γ ln C − γ ln C_{50}).
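The link between this latent-variable description and the probability of drug effect can be checked numerically. The sketch below is only an illustration under stated assumptions: it draws x from a standard logistic distribution (mean zero, variance π²/3) by inverting its cumulative distribution, counts how often y = γ ln C − γ ln C_{50} + x exceeds zero, and compares that fraction with C^γ/(C_{50}^γ + C^γ). The helper names and the example values (C = 150, C_{50} = 100, γ = 2) are hypothetical.

```python
import math
import random


def logistic_noise(rng):
    """Draw from the standard logistic distribution via its inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return math.log(u / (1.0 - u))


def simulated_fraction(c, c50, gamma, n, seed=0):
    """Fraction of draws with y = gamma*ln(C) - gamma*ln(C50) + x > 0."""
    rng = random.Random(seed)
    deterministic = gamma * math.log(c) - gamma * math.log(c50)
    hits = sum(1 for _ in range(n) if deterministic + logistic_noise(rng) > 0)
    return hits / n


def closed_form(c, c50, gamma):
    """Probability of drug effect from the concentration-effect relation."""
    return c**gamma / (c50**gamma + c**gamma)


print(simulated_fraction(150.0, 100.0, 2.0, 20000), closed_form(150.0, 100.0, 2.0))
```

The two printed numbers should agree to within sampling error, which is why thresholding y at zero reproduces the logistic concentration–effect curve.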

_{50}; the adjective “deterministic” refers to the fact that this part of the effect is determined solely by the drug concentration and the pharmacodynamic parameters C_{50} and γ. This increases steadily as drug concentration (C) increases. The solid circles are random numbers conforming to a normal distribution; we used a normally distributed variable for convenience because of the ready availability of normally distributed random-number generators in standard statistical software packages. When the sum of the deterministic component (solid diamonds) and the random variables (solid circles) is greater than or equal to zero, the observed drug effect (solid triangles) is one, and if the sum is less than zero, the observed drug effect is zero.

_{50} = 100 (a valid assumption because the units of concentration are arbitrary). To begin each simulation, n data points were generated by randomly selecting n drug concentrations distributed uniformly on a logarithmic scale from 25 to 400 units using the random number generator of an Excel (Microsoft, Redmond, WA) spreadsheet. In a human study, this corresponds to the investigator assigning a drug dose to each patient enrolled in the study. At this point, if the concentration–effect relation is totally deterministic (*i.e.*, if γ is infinite), then a positive drug effect will be observed if the drug concentration (C) assigned to the data point exceeds C_{50}. However, as noted previously, there is an element of randomness in the concentration–effect relation, embodied in the fact that γ is finite. To take into account this randomness, a uniformly distributed random variable from 0 to 1 was generated for each “patient,” again using the random-number generator of an Excel spreadsheet. If this number was less than C^{γ}/(C_{50}^{γ} + C^{γ}), the simulated patient was assumed to have a positive drug effect (the response variable [R] was given a value of 1). Otherwise, it was assumed that R = 0. In this manner, responses are obtained for a range of concentrations, and spreadsheets consisting of columns of C (the drug concentration) and R (the response variable) are generated. For each simulation, the parameters (C_{50}, γ) were estimated by maximum likelihood estimation. The logarithm of the likelihood of observed results (LL) was maximized as a function of C_{50} and γ:

$$LL = \sum_{i} \left[ R_{i} \ln P_{i} + (1 - R_{i}) \ln (1 - P_{i}) \right] \tag{5}$$

(P_{i} is given by equation 1 for each value of C; R = 1 if a drug effect is observed and R = 0 otherwise; and i indexes data points). Equation 5 is the sum of the logarithms of the probabilities of independent outcomes, which is P_{i} when R_{i} = 1 and 1 − P_{i} when R_{i} = 0. The estimation procedure was implemented on the Excel spreadsheet, taking advantage of its Solver function. In some instances, the attempted maximization of the log likelihood failed with the Solver function returning a “#NUM” message, indicating it could not interpret the log likelihood as a number. In these cases, the simulated data set was discarded and a new set generated.
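The data-generation and estimation steps described above can be sketched in Python as follows. This is not the authors' Excel implementation: the Solver-based maximization is replaced here by a plain exhaustive grid search, and the function names, grid limits, and random seed are assumptions made only for illustration.

```python
import math
import random


def simulate_data(n, c50, gamma, rng):
    """Generate (concentration, response) pairs for one simulated study."""
    data = []
    for _ in range(n):
        # Concentration uniform on a logarithmic scale from 25 to 400 units.
        c = math.exp(rng.uniform(math.log(25.0), math.log(400.0)))
        p = c**gamma / (c50**gamma + c**gamma)
        # Response R = 1 if a uniform(0, 1) draw falls below p, otherwise R = 0.
        r = 1 if rng.random() < p else 0
        data.append((c, r))
    return data


def log_likelihood(c50, gamma, data):
    """Sum of ln(P_i) where R_i = 1 and ln(1 - P_i) where R_i = 0."""
    ll = 0.0
    for c, r in data:
        z = gamma * math.log(c / c50)
        # ln P = -ln(1 + e^-z) and ln(1 - P) = -ln(1 + e^z).
        ll -= math.log1p(math.exp(-z if r else z))
    return ll


def fit_by_grid(data, n_c50=41, gamma_max=12.0, gamma_step=0.1):
    """Crude maximum likelihood fit by grid search (stand-in for Excel Solver)."""
    best = (-math.inf, None, None)
    n_gamma = int(round(gamma_max / gamma_step))
    for i in range(n_c50):
        c50 = 25.0 * 16.0 ** (i / (n_c50 - 1))  # log-spaced from 25 to 400
        for j in range(1, n_gamma + 1):
            gamma = j * gamma_step
            ll = log_likelihood(c50, gamma, data)
            if ll > best[0]:
                best = (ll, c50, gamma)
    return best  # (maximized LL, C50 estimate, gamma estimate)


rng = random.Random(1)
data = simulate_data(50, c50=100.0, gamma=3.0, rng=rng)
ll_max, c50_hat, gamma_hat = fit_by_grid(data)
print(f"C50 estimate {c50_hat:.1f}, gamma estimate {gamma_hat:.1f}, LL {ll_max:.2f}")
```

A grid search is slow and coarse compared with a numerical optimizer, but it has no starting values and cannot fail to converge, which makes it convenient for checking a simulation pipeline. If a data set happens to be perfectly separated, the γ estimate will simply sit at the upper edge of the grid.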

_{50} = 100 (units are arbitrary) for every data point. Thus, in these simulations, interpatient variability was ignored by assuming that the data were derived from a single patient. Simulations were conducted for n = 10, 20, 30, 40, 50, 75, and 100 data points. Because any nonlinear regression technique necessitates the stipulation of initial parameter values (*i.e.*, an initial guess of what the true value is), we repeated each simulation for C_{50} initial = 40, 100, and 250, and γ initial = 0.6, 1.0, or 1.4 times the true value of γ. Each simulation was repeated 100 times, and the standard deviations of the estimates of C_{50} and γ were calculated. We did not calculate confidence intervals, but we determined, for each simulation, whether the confidence intervals would include the true values of C_{50} and γ. We did this by calculating the change in the logarithm of the likelihood, multiplied by a factor of 2 (2 × δLL), which occurred when the true value of either C_{50} or γ was substituted for the value determined by maximization of LL. This quantity (2 × δLL) has a chi-square distribution with one degree of freedom, which enables the researcher to determine whether any specific value of either C_{50} or γ is within the 95% confidence intervals.^{12} It should be noted that the confidence intervals for this type of data often are markedly asymmetric, and the parameter estimates do not follow a t distribution.
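The likelihood-based check described above reduces to a single comparison. In the sketch below, the threshold 3.841 is the 95th percentile of the chi-square distribution with one degree of freedom; the function name and the example log-likelihood values are hypothetical, and the sketch takes the two log likelihoods as given rather than specifying how the other parameter is handled when the true value is substituted.

```python
CHI2_95_DF1 = 3.841  # 95th percentile of chi-square with 1 degree of freedom


def true_value_in_ci95(ll_max, ll_true):
    """True if 2 x deltaLL is small enough for the true value to lie in the 95% CI.

    ll_max:  log likelihood at the maximum likelihood estimates
    ll_true: log likelihood with the true parameter value substituted
    """
    two_delta_ll = 2.0 * (ll_max - ll_true)
    return two_delta_ll <= CHI2_95_DF1


# Illustrative numbers only (not results from the study).
print(true_value_in_ci95(-20.0, -21.5))  # 2 x deltaLL = 3.0 -> True
print(true_value_in_ci95(-20.0, -22.5))  # 2 x deltaLL = 5.0 -> False
```

Because the check uses the log likelihood directly, it accommodates asymmetric intervals without ever computing the interval limits.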

_{50} and γ values had log-normal distributions (*i.e.*, *P* = P_{TV} exp(η), where P denotes the parameter (C_{50}, γ), TV denotes the typical value, and η has a normal distribution with a mean value of zero and an SD of 0.3). We assumed that C_{50}TV = 100, and we considered γ_{TV} values of 1.0, 1.5, 3.0, 4.5, and 6.0. Simulations were again conducted for n = 10, 20, 30, 40, 50, 75, and 100 patients and for each value of simulations were repeated 100 times for starting values of C_{50} initial = 40, 100, and 250, and γ initial = 0.6, 1.0, or 1.4 times the typical value of γ. We also determined whether C_{50}TV and γ_{TV} were within the 95% confidence intervals, as described previously.
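A minimal sketch of how per-patient parameters could be drawn under this log-normal model is shown below. It assumes that η is sampled independently for C_{50} and for γ, which the text does not state explicitly; the function name and seed are likewise illustrative.

```python
import math
import random
import statistics


def sample_patient_parameters(c50_tv, gamma_tv, rng, sd=0.3):
    """Draw one patient's (C50, gamma) as typical value x exp(eta), eta ~ N(0, sd).

    Assumption: eta is drawn independently for the two parameters.
    """
    eta_c50 = rng.gauss(0.0, sd)
    eta_gamma = rng.gauss(0.0, sd)
    return c50_tv * math.exp(eta_c50), gamma_tv * math.exp(eta_gamma)


rng = random.Random(0)
patients = [sample_patient_parameters(100.0, 3.0, rng) for _ in range(1000)]
# The median of a log-normal variable equals its typical value.
print(statistics.median(p[0] for p in patients))
```

Each simulated patient then contributes one concentration and one response, generated with that patient's own C_{50} and γ, before the pooled data are fitted with a single pair of parameters.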

##### Statistical Analysis

*i.e.*, the SD of the parameter estimates normalized to the mean value of parameter estimates (C_{50} SD/mean or γ SD/mean) and estimate confidence, which we define as the percentage of parameter estimates for which the 95% confidence intervals include the “true” value (%CI_{95}).
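Both summary measures are simple to compute once the repeated simulations are available. The sketch below assumes the sample SD (n − 1 denominator) is used, which the text does not specify, and the example numbers are made up for illustration.

```python
import statistics


def coefficient_of_variation(estimates):
    """SD of the estimates as a percentage of their mean (sample SD assumed)."""
    return 100.0 * statistics.stdev(estimates) / statistics.mean(estimates)


def percent_ci95(inside_flags):
    """Percentage of simulations whose 95% CI includes the true value."""
    return 100.0 * sum(inside_flags) / len(inside_flags)


# Illustrative inputs, not study results.
print(coefficient_of_variation([90.0, 100.0, 110.0]))  # 10.0
print(percent_ci95([True, True, False, True]))         # 75.0
```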

#### Results

*i.e.*, the difference between the mean parameter estimate and the true value) is presented in table 1 for simulations in which the initial parameter values were the true values. The bias of C_{50} estimates generally was small (< 10%), unless n = 10 and γ was small (≤ 2), in which case the bias was nearly 100%. The bias in the estimation of γ was less than 20% if the number of simulated data points (n) was more than 10, with the notable exception of simulations in which the parameters had log-normal distributions (see Discussion).


_{50} and γ estimates as a function of sample size when we simulate data from a single patient (fixed C_{50} and γ). In these figures, the initial “guesses” of the parameters are the true values. As expected, the coefficient of variation decreases as the number of patients (n) increases. Additionally, the variability in the estimate of C_{50} increases as γ decreases (fig. 2). For γ = 1.0, the coefficient of variation of the C_{50} estimate is substantial (> 50%) unless the number of patients in the simulated study (n) exceeds 30. However, if γ is larger (> 6), the coefficient of variation of the C_{50} estimate is approximately 10% if n ≥ 20. Equally important, the true value of C_{50} lies within the 95% confidence intervals of the estimates in more than 90% of simulations. In contrast to C_{50} estimates, the magnitude of the true value of γ has less influence on the variability of the estimates of γ, as is shown in figure 3. The coefficient of variation of γ estimates is somewhat greater than that of C_{50} estimates and is approximately 60% for n = 20, decreasing only to approximately 40% for n = 100. However, the confidence of γ estimates is similar to that of C_{50} estimates, with %CI_{95} values well in excess of 90% in these simulations with no interpatient variability.


_{50} and γ are different for different simulated patients. We assumed that the parameters have log-normal distributions. These simulations mirror the common situation in which data from multiple patients are pooled for analysis. Again, we present simulations in which the initial guesses of C_{50} and γ are equal to C_{50}TV and γ_{TV}. As in the case with no interpatient variability, the coefficient of variation of estimates of either C_{50} or γ decreases as n increases. The magnitude of this variability is similar to that in the previous simulations, in which C_{50} and γ were fixed. As before, the coefficient of variation of C_{50} estimates decreases as γ increases. For C_{50}, the confidence of the estimates is excellent, with the true value (100) within the 95% confidence intervals of the estimates in more than 90% of the simulations. However, it can be seen in figure 5 that the fraction of simulations in which the true value of γ (γ_{TV}) was within the 95% confidence intervals of the estimate decreased to as low as 60% as n increased for values of γ greater than or equal to 4.5.

_{50} and γ other than the true values, which were used for the simulations illustrated in figures 2–5. The results were similar to those described previously, with the exception that, for n = 10 or n = 20, use of an overestimate as the starting value of C_{50} resulted in a markedly larger coefficient of variation in the C_{50} estimate for small values of γ (≤ 3). Despite this increase in variability, %CI_{95} remained more than 90%.

#### Discussion

_{50}, the concentration of drug that results in 50% of maximal effect, and γ, a parameter that reflects the steepness of the curve. For many applications in anesthesiology research, the response is a binary all-or-none variable, and the maximal drug response or effect is unity. In this situation, estimation of C_{50} and γ is usually referred to as logistic regression. Techniques of logistic regression are based on the principle of maximum likelihood and are inherently nonlinear. Any method of statistical analysis should provide parameter estimates and confidence intervals. In contrast to linear regression, the confidence intervals provided by most software packages for nonlinear regression are approximations that are only accurate asymptotically as the number of data points increases toward infinity. Because many applications of logistic regression in anesthesiology research involve 20–50 data points, the accuracy of C_{50} and γ estimates is unclear. In this study, we used simulations (a well-known technique for analyzing statistical methodology, also known as Monte Carlo simulation) to investigate the relation between study population size and the accuracy of parameter estimates.

_{95}). The coefficient of variation is the SD of the parameter estimate expressed as a percentage of the mean estimate. We did not graphically present bias (*i.e.*, the difference between the mean estimate and the true value) because, in general, bias was small, with the mean C_{50} estimate within 10% of the true value and the mean estimate of γ within 20% of the true value. However, for simulations of small (n = 10) studies, we found a much larger bias for both parameters, although %CI_{95} was still more than 90% when we simulated data from single patients. When we simulated pooled data from multiple patients (with log-normal distributions for C_{50} and γ), there was a larger bias in γ estimates (up to 30%), even when n was large, and %CI_{95} was significantly smaller.


*i.e.*, the coefficient of variation decreases) as the number of patients in the simulated study (n) increases. This is evident in figures 2–5. We did not anticipate that the variability of C_{50} estimation would depend on the value of γ, improving as γ, the measure of the steepness of the concentration–response curve, increases. However, the basis for this observation is more evident with consideration of specific simulated data sets. Figure 6 illustrates a simulation of 10 data points (from the same patient, so that interpatient variability is not an issue) with γ = 1. One could surmise that there is little information about either C_{50} or γ in these data. Figure 7 presents a larger data set (n = 100) for γ = 1. Comparing the two figures, one can see how C_{50} may be estimated more reliably when n is larger. Figure 8 presents simulated data for n = 100 and γ = 6. It is now clear that C_{50} and γ both are estimated more reliably in this situation.

^{12} We investigated the accuracy of these asymptotic confidence intervals by determining the frequency with which the 95% confidence intervals contained the true value of the parameter. We used the change in log likelihood that occurred when the true parameter value was substituted for the maximum likelihood estimate as a means of assessing this frequency. When we simulated data from a single patient, we found that the 95% confidence intervals included the true parameter value in more than 90% of the simulations, even for small n (n = 10) and small γ, when the coefficient of variation of parameter estimates was large. This suggests that, even when there is large estimate variability (*i.e.*, even when the point estimate of a parameter may be significantly in error), the 95% confidence intervals will be large enough to include the true value, if interpatient variation is not an issue. When we considered interpatient variation and simulated pooled data from multiple patients, we again found that the 95% confidence intervals included the true value of C_{50} in more than 90% of the simulations. Thus, analyzing data from a population of patients in a naive fashion does not appear to compromise the accuracy of C_{50} estimates. However, the 95% confidence intervals of γ estimates were not as reliable. For larger γ (γ ≥ 4.5), the fraction of simulations in which the 95% confidence intervals included the true value actually decreased to less than 90% as n increased. We believe this stems from the inability to distinguish between intrapatient and interpatient variability when data from multiple patients are pooled for analysis in a naive fashion (*i.e.*, without explicitly accounting for interpatient variability). In one sense, the parameter γ is a measure of intrapatient variability. If γ is large, the concentration–response curve is steep. If the concentration of drug (C) is slightly larger than C_{50}, the probability of drug response is close to one, and if C is slightly less than C_{50}, the probability of drug response is close to zero. The concentration range over which the probability of drug effect is intermediate, where there is significant intrapatient variability, is narrow. In contrast, if γ is small, the concentration–response curve is relatively flat, and the probability of drug response takes on intermediate values over a wider concentration range (*i.e.*, the region of significant intrapatient variability is larger). In figure 9, we illustrate how interpatient C_{50} variability may be indistinguishable from intrapatient variability. This figure presents concentration–response curves for nine hypothetical patients, each with γ = 10, but with varying C_{50} values. If one data point is collected from each patient and pooled for analysis, the resulting curve (dashed line) appears to have a much lower value of γ. We believe this is the basis for the failure of the 95% confidence intervals to include the true value of γ when we simulated data from multiple patients with log-normal distributions for C_{50} and γ. This is consistent with our observation that the bias in γ estimation was increased in this case and that the mean γ estimate was less than the true value. It is somewhat surprising that the confidence of the estimate of γ decreased with increasing n; however, it should be noted that the confidence intervals are derived from the change in log likelihood, which results from substitution of other parameter values for the maximum likelihood estimates. The width of the confidence intervals decreases as n increases. Conversely, the bias in estimation of γ will not improve as n increases, because the steepness of the apparent concentration–effect curve will reflect the range of C_{50} values; this is evident from figure 9. Consequently, as the confidence intervals become narrower, the chance that they will include the true value of γ decreases.
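The flattening effect of pooling can be reproduced with a short calculation. The sketch below is illustrative only: it takes nine hypothetical patients, each with γ = 10, assigns them C_{50} values spread evenly on a logarithmic scale around 100 (these values are an assumption; the values used for figure 9 are not given in the text), averages their individual curves, and reports the apparent steepness as the slope of logit(*P*) against ln C between two concentrations.

```python
import math


def response_probability(c, c50, gamma):
    """Same relation as C^gamma / (C50^gamma + C^gamma), written to avoid large powers."""
    return 1.0 / (1.0 + (c50 / c) ** gamma)


def pooled_probability(c, c50_values, gamma):
    """Average probability of effect across patients with different C50 values."""
    return sum(response_probability(c, c50, gamma) for c50 in c50_values) / len(c50_values)


def logit(p):
    return math.log(p / (1.0 - p))


def apparent_gamma(prob_fn, c_low=80.0, c_high=125.0):
    """Slope of logit(P) versus ln(C) between two concentrations."""
    return (logit(prob_fn(c_high)) - logit(prob_fn(c_low))) / math.log(c_high / c_low)


# Nine hypothetical patients with gamma = 10 and C50 spread around 100.
c50_values = [100.0 * math.exp(0.15 * k) for k in range(-4, 5)]
single = apparent_gamma(lambda c: response_probability(c, 100.0, 10.0))
pooled = apparent_gamma(lambda c: pooled_probability(c, c50_values, 10.0))
print(f"single patient: {single:.2f}, pooled: {pooled:.2f}")
```

For a single patient the slope equals γ exactly, whereas the pooled curve yields a much smaller apparent γ. Adding more patients does not change the pooled curve's shape, which is consistent with the argument that the bias in γ does not shrink as n increases.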

_{50} in small studies (n ≤ 20) may be as high as 30–40%. It should be emphasized that estimates of either C_{50} or γ are statistics. The distributions of these statistics are unknown; thus, the exact meaning of the coefficient of variation is unclear. Nevertheless, it seems reasonable to assume that point estimates of C_{50} may be in error by 30–40% in a large number of studies of this size. The coefficient of variation of γ is larger, indicating even greater error in point estimates of γ.

_{50} estimates contain the true value of C_{50} in most of the simulations (> 90%), even in simulations with larger coefficients of variation and even when we simulated naively pooling data from multiple patients. Confidence intervals for C_{50}, which are based on asymptotic statistical theory, appear to be reliable. Given the significant coefficient of variation that may occur in some studies, indicating the possibility of error in point estimates of C_{50}, it is clear that confidence intervals should always be reported.