Anesthetic Pharmacology: Research Report

A General Purpose Pharmacokinetic Model for Propofol

Eleveld, Douglas J. PhD*; Proost, Johannes H. PhD*; Cortínez, Luis I. MD; Absalom, Anthony R. MD*; Struys, Michel M. R. F. MD*‡

doi: 10.1213/ANE.0000000000000165

Pharmacokinetic (PK) models are used primarily in 2 ways. First, they are used by PK simulation software to estimate the plasma drug concentration profile resulting from a given drug administration regimen. An example of this is intraoperative advisory displays that provide information about interactions between hypnotics and opioids.1 Second, they are used in target-controlled infusion (TCI) systems to calculate the drug infusion rates required to achieve and maintain desired plasma concentrations. The choice of an appropriate PK model is essential to ensure sufficiently accurate drug concentration predictions and appropriate drug infusion rates.

To rationally select the best model available for a specific patient, one needs an understanding of the complex clinical, PK, and pharmacometric issues underlying the derivation of these models. Factors such as anesthetic technique, age range, concomitant medication, body composition (obesity), and clinical status (patients versus healthy volunteers) may guide model selection. An exact match is often not possible, and different models may give conflicting predictions of patient characteristics. There is uncertainty about the accuracy of the models under differing patient and clinical conditions, and caution should be applied when extrapolating a model to a population different from that from which it was developed. Models developed for adults do not perform well when used with children,2,3 and more subtle aspects such as demographic balance between adults and elderly,4 obese and non-obese5 in model development should also be considered. Applications for very old or critically ill patients present additional challenges. Overall, this is difficult material, even for experts in the field.

A general purpose PK model with good predictive performance for a wide range of patient groups and clinical conditions would offer significant advantages. It would simplify the decision process of which model to use, improve the robustness of TCI systems and anesthesia drug displays, improve patient safety, and promote confidence among clinical users. Ideally, such a general purpose PK model should also show equivalent predictive accuracy compared with models developed specifically for particular patient groups and anesthetic techniques.

One method to develop a general purpose PK model is to combine PK data from multiple studies involving diverse groups of patients and clinical conditions and to then estimate a single PK model from the aggregated dataset. By including a wide range of patients, this approach may provide better covariate identification than studies focused on a particular group.6 This approach has been applied to a limited extent already, in studies that have produced propofol PK models for children and adults,7 neonates through to adults,8 for both obese and nonobese adults,9,10 and in studies testing allometric scaling in rats, children, and adults.11 Consideration of ontogeny12 and allometric scaling or lean body mass13 are likely necessary to allow a single model to span the range from young children to elderly to obese.

Although some studies have used selected propofol PK datasets from the Open TCI Initiative Web site, no studies have used all the data freely available there to develop a general purpose PK model. The goal of this study was to determine a single PK model for propofol with robust predictive performance suitable for general purpose applications in a wide range of patients under diverse intraoperative clinical conditions.


We used data available on the Open TCI Web site with age and weight covariates and further expanded this with data (supplied by the authors) from studies in young children,14 children,2,3 obese adults,9 and adults.15–17 In total, the analyzed dataset was an aggregate of 21 component datasets. All datasets have been published in scientific journals, and all studies obtained necessary ethical committee approval, as declared in the original papers. Drug concentration observations were assumed to be concentrations in plasma. Table 1 provides some summarized details of the component datasets.

Table 1:
Details of the Component Datasets

We modeled the time course of the propofol plasma concentration using a 3-compartmental PK model with volumes V1, V2, V3, elimination clearance CL, and intercompartmental clearances Q2 and Q3. Model parameters were assumed to be log-normally distributed across the population. Initially, a proportional plus additive error model was used with the proportional error variance estimated separately for each component dataset. Later in the model-building process, we considered proportional error variance to also vary across individuals.18 Where individuals were administered propofol on multiple occasions, each occasion was treated as a separate individual. Model estimation and evaluation were performed using NONMEM version 7.2 (Icon Development Solutions, Ellicott City, MD) using the FOCE method with interaction and PLT-Tools version 4 (PLTsoft, San Francisco, CA).
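As an illustration of the model structure only, a 3-compartmental mammillary model can be simulated with a simple forward-Euler scheme. This is a minimal sketch and not the NONMEM implementation used in the study; the function name, the step size, and all parameter values below are illustrative assumptions and are not estimates from this work.

```python
def simulate_three_compartment(v1, v2, v3, cl, q2, q3,
                               infusion_rate, duration_min, dt=0.01):
    """Forward-Euler simulation of a 3-compartment mammillary model.

    Volumes in L, clearances in L/min, infusion_rate(t) in mg/min.
    Returns (times, plasma_concentrations, final_amounts); concentrations
    are in mg/L, which is numerically equal to ug/mL.
    """
    a1 = a2 = a3 = 0.0            # drug amounts (mg) in each compartment
    times, conc = [0.0], [0.0]
    n_steps = int(round(duration_min / dt))
    for i in range(n_steps):
        t = i * dt
        c1, c2, c3 = a1 / v1, a2 / v2, a3 / v3
        flow12 = q2 * (c1 - c2)   # net transfer, central -> compartment 2
        flow13 = q3 * (c1 - c3)   # net transfer, central -> compartment 3
        a1 += (infusion_rate(t) - cl * c1 - flow12 - flow13) * dt
        a2 += flow12 * dt
        a3 += flow13 * dt
        times.append((i + 1) * dt)
        conc.append(a1 / v1)
    return times, conc, (a1, a2, a3)


# Hypothetical parameter values, chosen only to make the example run.
params = dict(v1=5.0, v2=20.0, v3=200.0, cl=1.5, q2=1.5, q3=0.8)
infusion = lambda t: 10.0 if t < 10.0 else 0.0   # 10 mg/min for 10 min
times, conc, amounts = simulate_three_compartment(
    **params, infusion_rate=infusion, duration_min=30.0)
```

Setting CL to 0 in this sketch gives a quick mass-balance check: the summed compartment amounts should then equal the infused dose.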

Model parameters were calculated relative to a reference individual: 70-kg, 35-year-old, 170-cm male patient. During hierarchical model building, parameters were added and removed from the model to obtain a good model fit using the corrected Akaike information criterion (AIC). Potential covariate relationships were determined by examination of post hoc variability (η) values and tested for inclusion in the model. Height information was not available for all individuals. For post hoc η versus height analysis, these individuals were ignored. Without height, body mass index (BMI) cannot be calculated, but we assumed a BMI of 24.2 kg/m2 for these individuals. This was necessary because the predictive performance metric (described in the following section) stratifies performance across groups based on age and BMI. Thus, individuals missing height information were assumed to be nonobese (BMI <30).

At each step, we also estimated the predictive performance of the model for intraoperative clinical conditions; the method is detailed in the following section. Linear and allometric methods were used to scale parameters to an individual’s size. Total body weight (WGT) was used as a size descriptor, and allometric scaling exponents were estimated when appropriate. We did not consider lean body mass or fat-free mass as size descriptors because currently none are available for the age and weight range studied. An individual’s age, sex, BMI (or estimated BMI), and clinical status (patient or healthy volunteer) were considered as possible covariates for model parameters.

Intraoperative Predictive Performance

A good model fits the data in a well-balanced manner. This can be difficult to achieve with large real-world datasets when one relies solely on AIC for model development. One problem is that a model modification can improve model fit for a few influential (possibly outlier) individuals or observations to the detriment of many others. When these come from outside the operating region of an application of the model, then predictive performance can be degraded even while AIC suggests model improvement. Another problem can be caused by imbalance between subgroups in the data, that is, children, adults, elderly, etc. Our goal is good performance for all subgroups, but adults have a higher contribution to AIC because of the greater number of adults in the dataset. Model modifications can improve model fit for adults to the disproportionate detriment of other subgroups. To reduce these problems, we focused model evaluation on intraoperative conditions, which are the intended application of the model, and constructed a predictive performance metric that is balanced between subgroups. We used this metric along with AIC to guide model development. At each step, we considered models with a decrease or moderate increase in AIC along with little or no degradation in the predictive performance metric.

The intended application of our model is for patients undergoing surgical procedures, and thus, our measures of performance should focus on intraoperative conditions. However, some of the individuals studied were healthy volunteers, and some observations concern very high or low drug concentrations. Studies in healthy volunteers avoid variability in surgical stimulation, concomitant medications, and in the general health of patients. The patterns of blood sampling may also differ. Using data from healthy volunteers to estimate performance might give clinicians an unrealistic impression of the performance they may expect to obtain. Similarly, clinicians are unlikely to target very high or very low drug concentrations, and thus, predictive performance in those ranges is not relevant for performance estimates. To address these issues, we excluded data from healthy volunteers as well as observations outside the range 0.5 to 8 μg/mL from our estimates of intraoperative performance. Drug concentrations may transiently exceed 8 μg/mL after rapid administration especially during induction, but these are likely to be dominated by front-end kinetics and thus poorly predicted by a 3-compartmental model.19–21 For the same reason, we also ignored predictions in the initial 2 minutes after the start of the initial infusion. These exclusions concern only estimates of intraoperative predictive performance; all the available data were used for model estimation and AIC calculation.
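The exclusion rules described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the `Observation` record and its field names are hypothetical, the concentration bounds are treated as inclusive, and time is measured in minutes from the start of the initial infusion.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    time_min: float           # minutes since start of the initial infusion
    conc_ug_ml: float         # observed plasma concentration (ug/mL)
    healthy_volunteer: bool   # True if the individual is a healthy volunteer


def is_intraoperative(obs, c_min=0.5, c_max=8.0, t_skip_min=2.0):
    """Return True if the observation counts toward intraoperative performance."""
    return (not obs.healthy_volunteer
            and c_min <= obs.conc_ug_ml <= c_max
            and obs.time_min > t_skip_min)


observations = [
    Observation(1.0, 3.0, False),   # dropped: within first 2 minutes
    Observation(5.0, 3.0, False),   # kept
    Observation(5.0, 9.0, False),   # dropped: above 8 ug/mL
    Observation(5.0, 3.0, True),    # dropped: healthy volunteer
]
kept = [o for o in observations if is_intraoperative(o)]
```

The filter applies only to performance evaluation; in the study, all observations still contribute to model estimation.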

To quantify the predictive performance of the models, we calculated the performance error22 (PE) and absolute performance error (APE) as:
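Assuming the definition of performance error conventionally used for evaluating TCI systems, with the predicted concentration in the denominator, these quantities can be written as follows. The symbols C_obs and C_pred are notation introduced here for the observed and predicted concentrations.

```latex
\mathrm{PE} = \frac{C_{\mathrm{obs}} - C_{\mathrm{pred}}}{C_{\mathrm{pred}}} \times 100\%,
\qquad
\mathrm{APE} = \left| \mathrm{PE} \right|
```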

Population models are often compared based on median prediction error (MDPE) and median absolute prediction error (MDAPE) as measures of accuracy and precision. However, a shortcoming of this approach is that it is insensitive to a large portion of PEs. Two models can have the same median performance value while differing greatly for PEs higher or lower than the median. To reduce the impact of this shortcoming, we defined 2 thresholds for APE: (1) “good” PE corresponding to APE ≤ 20% and (2) “poor” PE corresponding to APE >60%. For a given set of considered observations, we defined the predictive performance metric as the percentage of observations with “good” performance minus the percentage of observations with “poor” performance. Optimal performance corresponds to maximizing this performance metric, that is, maximizing the proportion of good predictions while simultaneously minimizing the proportion of poor predictions. Perfect performance would be 100% (all predictions have APE ≤20%), while the worst would be −100% (all predictions have APE >60%).
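The metric described above can be sketched in a few lines. This is a minimal illustration, assuming the conventional definition of performance error with the predicted concentration in the denominator; the function names are hypothetical.

```python
def performance_error(observed, predicted):
    """Performance error in percent (assumed conventional definition)."""
    return (observed - predicted) / predicted * 100.0


def performance_metric(observed, predicted, good=20.0, poor=60.0):
    """Percentage of 'good' predictions minus percentage of 'poor' predictions."""
    apes = [abs(performance_error(o, p)) for o, p in zip(observed, predicted)]
    if not apes:
        raise ValueError("no observations to evaluate")
    pct_good = 100.0 * sum(a <= good for a in apes) / len(apes)
    pct_poor = 100.0 * sum(a > poor for a in apes) / len(apes)
    return pct_good - pct_poor


# APE values are 0%, 10%, 100%, and 50%: two good, one poor, one neither.
metric = performance_metric([1.0, 1.1, 2.0, 1.5], [1.0, 1.0, 1.0, 1.0])
```

Predictions with APE between 20% and 60% count toward neither term, so they lower the metric only by diluting the proportion of good predictions.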

During model building, we used predictions from 2-fold cross-validation as a guard against overfitting. With this method, the dataset is split into 2 parts: D1 and D2. To evaluate a given model structure, its parameters are estimated using D1; the parameters are fixed and then used to predict D2. The process is repeated, exchanging D1 and D2. Predictions for D1 and D2 are combined to obtain a complete set of independent predictions. Individuals for D1 and D2 were chosen randomly with a probability of 0.5 before analysis. When comparing performance of the final population model with published models, no cross-validation was performed.
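The 2-fold procedure can be sketched as follows. This is a minimal illustration: `fit` and `predict` are hypothetical placeholders for model estimation and prediction (performed with NONMEM in the study), and the seed is an arbitrary assumption.

```python
import random


def two_fold_split(individual_ids, seed=0):
    """Assign each individual to D1 or D2 with probability 0.5."""
    rng = random.Random(seed)
    d1, d2 = [], []
    for ind in individual_ids:
        (d1 if rng.random() < 0.5 else d2).append(ind)
    return d1, d2


def cross_validated_predictions(d1, d2, fit, predict):
    """Estimate on one half, predict the other half, then swap the halves."""
    predictions = {}
    for train, test in ((d1, d2), (d2, d1)):
        params = fit(train)               # parameters are fixed after estimation
        for ind in test:
            predictions[ind] = predict(params, ind)
    return predictions


d1, d2 = two_fold_split(range(100))
# Toy stand-ins: the "model" is the mean of the training IDs.
preds = cross_validated_predictions(
    d1, d2,
    fit=lambda train: sum(train) / len(train),
    predict=lambda params, ind: params)
```

Every individual receives exactly one prediction, and that prediction comes from parameters estimated without that individual's data.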

We estimated predictive performance separately for 5 subgroups: young children (age <3 years), children (3 ≤ age <18 years), adults (18 ≤ age <70 years, BMI <30), elderly (age ≥70 years), and high-BMI individuals (BMI ≥30). For model development, the overall predictive performance metric was averaged over these 5 subgroups. Equal weighting for the subgroups helps reduce the influence that demographic imbalance can exert on the performance metric. This imbalance can negatively influence model performance on the underrepresented subgroup.4 Subgroups were chosen to approximately mirror the propofol PK models in the literature; these are often specialized to children, adults, or the obese.
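The subgroup assignment and equal-weight averaging can be sketched as below. This is a minimal illustration that follows the definitions literally: subgroup membership may overlap (for example, an elderly individual with BMI ≥30 belongs to 2 subgroups), and a missing height falls back to the assumed BMI of 24.2 kg/m2. Function names are hypothetical.

```python
def bmi(weight_kg, height_cm, default=24.2):
    """BMI in kg/m2; falls back to the assumed value when height is missing."""
    if height_cm is None:
        return default
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)


def subgroups(age_years, bmi_value):
    """Return every subgroup an individual belongs to."""
    groups = []
    if age_years < 3:
        groups.append("young children")
    elif age_years < 18:
        groups.append("children")
    elif age_years >= 70:
        groups.append("elderly")
    elif bmi_value < 30:
        groups.append("adults")
    if bmi_value >= 30:
        groups.append("high-BMI")
    return groups


def averaged_metric(metric_by_subgroup):
    """Equal-weight mean of the per-subgroup performance metrics."""
    return sum(metric_by_subgroup.values()) / len(metric_by_subgroup)
```

For the 70-kg, 170-cm reference individual, `bmi(70.0, 170.0)` evaluates to about 24.2, which matches the value assumed when height is missing.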

Comparison of Intraoperative Predictive Performance

We used the population predictions of the final model to compare estimated intraoperative predictive performance with 20 models from the literature for each of the 5 subgroups. Table 2 shows the PK models considered. We also performed a similar comparison of the complete dataset without exclusions.

Table 2:
PK Models of Propofol for Performance Comparison


The aggregated dataset contains 10,927 drug concentration observations from 684 occasions from 660 individuals (433 men, 227 women). Individuals were patients (500) or healthy volunteers (160). Propofol drug concentrations were determined from 7024 arterial samples (354 individuals) and 3903 venous samples (306 individuals). Age range was from 0.25 to 88 years, weight range was from 5.2 to 160 kg, and BMI range was from 11.2 to 53 kg/m2. Figure 1 shows the distribution of age, weight, height, and BMI of the individuals. Height information was missing for 108 individuals in the age range 17 to 88 years. One dataset (Swinhoe et al.16) used TCI administration, but only TCI targets were available and the infusion profiles were estimated using deconvolution. For cross-validation, dataset D1 contained 5562 observations from 347 occasions, and D2 contained 5365 observations from 337 occasions. Estimates of intraoperative predictive performance used 5492 observations. The complete dataset used is available in the supplementary data (Supplement 1).

Figure 1:
Histograms for age, weight, height, and BMI.

Some questionable data were identified; some could be corrected, but others were ignored. These primarily concern drug infusion records that overlap subsequent infusions. This would require multiple simultaneous propofol infusions in a single patient, and this seems unlikely to have occurred. Notes regarding errors and corrections are available in the supplementary data (Supplement 2). Table 3 shows the composition of subgroups used for predictive performance evaluation. One patient was in both the elderly and high-BMI subgroups.

Table 3:
Composition of Subgroups Used for Intraoperative Predictive Performance Evaluation


A graphical description of the hierarchical model-building process can be seen in Figure 2. For the initial model (Model 1), all parameters scaled linearly with WGT. Performance was poor, especially at the extremes of weight. The addition of allometric concepts of scaling clearances to a power exponent of 0.75 resulted in an improved model (Model 2). Individuals from studies of patients appear to have different parameters than those from studies of healthy volunteers. The model was expanded using separate population typical values and variance estimates for patients and healthy volunteers (Model 3). Most parameters appear to decline with age, and an exponential decay relative to age was also added to the model, resulting in an improved model fit and improved overall prediction accuracy (Model 4).

Figure 2:
Hierarchical model building.

Canonical Versus Compartmental Approach to Allometric Scaling

When the body size descriptor is WGT and the reference size is 70 kg, then the “canonical” approach to allometric scaling is:
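Based on the exponents stated in the text (1 for volumes, 0.75 for clearances) and the 70-kg reference size, the canonical scaling can be written as below. The subscript "ref" denotes the value for the reference individual; the index notation is introduced here for compactness.

```latex
V_i = V_{i,\mathrm{ref}} \left( \frac{\mathrm{WGT}}{70} \right)^{1}, \quad i = 1, 2, 3
\qquad
CL = CL_{\mathrm{ref}} \left( \frac{\mathrm{WGT}}{70} \right)^{0.75}
\qquad
Q_j = Q_{j,\mathrm{ref}} \left( \frac{\mathrm{WGT}}{70} \right)^{0.75}, \quad j = 2, 3
```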

The scaling exponents of 1 for volumes and 0.75 for clearances are sometimes referred to as theoretical values for scaling exponents.23

We also considered a “compartmental” approach to allometric scaling in which Q2 and Q3 are scaled with respect to the normalized estimated size of the corresponding compartment:
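A plausible form of this scaling is sketched below. It assumes that the clearance exponent of 0.75 carries over unchanged and that the size of each compartment is normalized to its value in the reference individual; both are assumptions made here for illustration.

```latex
Q_2 = Q_{2,\mathrm{ref}} \left( \frac{V_2}{V_{2,\mathrm{ref}}} \right)^{0.75}
\qquad
Q_3 = Q_{3,\mathrm{ref}} \left( \frac{V_3}{V_{3,\mathrm{ref}}} \right)^{0.75}
```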

This approach assumes that an individual’s estimated compartmental volume is a better size descriptor for the corresponding intercompartmental clearance than WGT. This is consistent with the idea that individuals of the same WGT may have different compartmental volumes and that intercompartmental clearance should be scaled according to the size of the associated compartment in that individual. A reviewer described this approach as theory-based allometry for the fixed effect on Q.

Application of the compartmental approach to allometric scaling led to a lower objective function and improved predictive performance (Model 5) compared with the canonical approach. We considered both approaches in separate branches of hierarchical model building. For brevity, we only describe the model-building steps for the compartmental approach (Models 16–27), but the canonical approach is also shown in Figure 2 (Models 6–15).

The scaling of V1, V3, CL, and Q2 differed from the theoretic allometric rule applied. We expanded the model to use a weight-dependent exponent method suggested by Wang et al.8 that applies separate scaling exponents for different phases of development. We considered a development phase for low weights and an adult phase for higher weights. Later in the model-building process, we considered a single scaling exponent in combination with sigmoidal functions to describe development, an approach suggested by Anderson et al.24

We found that good model fits and prediction performance could be obtained by using scaling exponents different than theoretical values for V3 and CL for individuals with low weights and V1, V3, and CL for individuals of high weights (Model 16). The scaling coefficient for high weights for V1 was close to 0, indicating constant V1 with respect to weight in the adult phase (Model 17). Using age to differentiate the 2 phases of development (Model 18) produced a similar but slightly poorer model fit compared with using weight.

V1 appears smaller in women than men, for both the compartmental and canonical approaches. Expanding the models to estimate decreased values for V1 in women resulted in better model fit and better prediction performance (Model 19).

The estimated age covariate for V3 was close to 0, and fixing it at 0 allowed for a reduced model with slightly improved AIC and similar predictive performance (Model 20). Similarly, the estimated population variance for V2, V3, CL, and Q2 did not differ significantly between patients and healthy volunteers. Estimating shared population variances for these parameters led to an improved AIC and predictive performance (Model 21). The population variance of Q3 for patients was very small, and setting this parameter to 0 afforded similar predictive performance and a very slight increase in AIC (Model 22).

CL appears higher in women than in men for adult individuals, but an opposite relationship appears in the elderly, and no difference is apparent for young individuals. Adding these elements to the model resulted in lower AIC values and improved prediction performance. It is interesting to note that when these modifications are added, the estimated scaling exponents for CL are close to 0.75, the theoretical values. The scaling exponent for CL was fixed to 0.75 without degrading AIC or model predictive performance (Model 23). We did not observe any systematic deviation from the population model for clearance for small children. Peeters et al.25 found that allometric scaling only overestimated clearance in neonates <5 kg bodyweight, and this is less than the smallest individuals in the current dataset. It seems that maturation of propofol clearance occurs for children younger/smaller than those studied here.

Both V1 and Q3 decrease to low values for small individuals. Both parameters can be multiplied by a sigmoidal development function, an approach suggested by Anderson et al.24 This allows the use of a constant value for V1 with respect to weight for adults and a single scaling exponent for V3 for all sizes. The AIC is moderately increased, but predictive performance is improved (Model 24).

Q2 shows systematic deviation from the population model for small individuals. Allowing Q2 to show different age–covariate relationships for small and large individuals allowed for improved AIC and predictive performance (Model 25). We also found that the age covariates for V1 and CL, and Q3 and Q2 for large individuals could be combined and estimated simultaneously. This reduced model shows a slightly increased AIC value and slightly decreased predictive performance (Model 26).

Expanding the model to allow proportional residual error to also vary between individuals18 as well as between component datasets allowed for a large decrease in AIC. The predictive performance metric improved, and we accepted this modification into the model (Model 27). For the canonical allometric scaling branch, this modification also allowed for improved predictive performance (Model 15), provided it was applied after the modification of CL for sex (Model 12).

Additional model testing was performed (data not shown). Adding sampling method (arterial versus venous) as a covariate to V1 did not result in an improved model. Replacing the patient versus healthy volunteer covariate for V1 with sampling method showed an improvement in predictive performance but also a large (>100) increase in AIC. Imbalance in the data may play a role because all the healthy volunteer studies used arterial sampling, and none used venous sampling. If no distinction is made between patients and healthy volunteers, then AIC and predictive performance are degraded. Estimating parameter correlation in η by using a block covariance matrix allowed for a large decrease in objective function, but the predictive performance metric was degraded, suggesting overfitting. Other model modifications were found that led to lower AIC values, but none improved model predictive performance.

The full NONMEM model code of the final model with annotations can be found in the supplementary data (Supplement 3). Post hoc η versus weight and age are shown in Figure 3 and Figure 4, respectively. Uncertainty in the estimated parameters was evaluated using likelihood profiles, and these are shown in Figure 5. Various model diagnostic plots and bootstrap resampling results are included in the supplementary data (Supplement 4). The summarized equations of the final model are:

Figure 3:
Post hoc η versus weight graphs show no clear relationships between post hoc η and weight for the model parameters.
Figure 4:
Post hoc η versus age graphs show no clear relationships between post hoc η and age for the model parameters.
Figure 5:
Likelihood profiles show changes in objective function value when fixing model parameters at particular values. A “flat” or “shallow” likelihood profile suggests problems with parameter identification. The parameter interval where likelihood profile is within the dark shaded region corresponds to P > 0.05 (Δ objective function <3.84). The light shaded region corresponds to P > 0.01 (Δ objective function <6.63). The likelihood profiles suggest no problems with parameter identification for the final model.
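The thresholds quoted in the Figure 5 caption correspond to a chi-square distribution with 1 degree of freedom, for which the tail probability has a closed form. A minimal sketch, with a hypothetical function name:

```python
import math


def chi2_1df_p_value(delta_ofv):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(delta_ofv / 2.0))


p_dark = chi2_1df_p_value(3.84)    # approximately 0.05
p_light = chi2_1df_p_value(6.63)   # approximately 0.01
```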

Where η1–η8 represent random variables with the variances given in Table 4. AGE, WGT, and PMA represent an individual’s age in years, weight in kilograms, and postmenstrual age in years (AGE + 40/52), respectively. RES is the proportional residual error for each component dataset. Error variance ε1 was fixed to a variance of 1, and ε2 was estimated from the data. Constants with a subscript ref are calculated for the reference individual.

Table 4:
Estimated Population Variances in the Model. CV(%) is Calculated as Sqrt(Exp(ω) − 1)*100%
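The CV(%) conversion stated in the Table 4 caption, for a log-normally distributed parameter with variance ω, can be sketched as below. The function name and the example variance of 0.09 are illustrative and not values from Table 4.

```python
import math


def cv_percent(omega_variance):
    """Coefficient of variation (%) for a log-normal parameter with variance omega."""
    return math.sqrt(math.exp(omega_variance) - 1.0) * 100.0


example_cv = cv_percent(0.09)   # a variance of 0.09 gives a CV of roughly 31%
```

For small variances the result is close to sqrt(ω) × 100%, which is why the two are sometimes used interchangeably.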

Figure 6 shows population and post hoc predictions versus time and observed propofol plasma concentrations for the entire dataset. Quantiles were calculated by grouping samples per minute. If fewer than 20 samples were available, the timeframe was expanded equally forward and backward. The time of the calculated quantiles was adjusted to be the average of the contributing samples. There is some bias in early samples and very late samples. Bias in early samples suggests that front-end kinetics is not well modeled by the 3-compartment model. Bias in the very late samples suggests inaccuracy in estimating the terminal phase. These late samples concern concentrations lower than about 0.025 μg/mL and >10 hours after initial propofol administration, so they are not relevant to most clinical intraoperative applications of TCI. Figure 7 shows population post hoc predictions for some patient subgroups compared with some PK models used in clinical practice.

Figure 6:
Population and post hoc predictions for the current study versus time and observed propofol plasma concentration. Data from individuals are gray. In upper panels, black lines correspond to (smoothed over time) 10%, 25%, 50%, 75%, and 90% quantiles. In lower panels, black line is a Loess smoother. Shaded area denotes the range of observations in the intraoperative range.
Figure 7:
Population predictions of the final model versus time for some patient subgroups compared with some PK models used in clinical practice. Data from individuals are gray. Black lines correspond to (smoothed over time) 10%, 25%, 50%, 75%, and 90% quantiles. MDPE and MDAPE are shown for patient data (“Patient data”) and for data used for predictive performance evaluation (“Clinical data,” t > 2 minutes, 0.5 ≤ obs ≤ 8.0 μg/mL).

Figure 8 shows the accuracy of population predictions in the first 15 minutes of drug administration for arterial and venous samples. For these early samples, the final model overpredicts drug concentrations for both arterial and venous samples.

Figure 8:
Accuracy of population predictions in the first 15 minutes of drug administration for arterial and venous samples. Grey lines are data from individuals, and solid black lines show the 25%, 50%, and 75% percentiles of observations for each minute. The final model overpredicts drug concentrations for both arterial and venous samples in these early samples.

Figure 9 shows the post hoc estimated model parameters plotted against weight. A plateau in the estimated value for V1 can be seen for weight higher than about 30 kg. Differences between patients and healthy volunteers are clearly visible. Figure 10 shows the post hoc estimated model parameters plotted against age. The decline in parameter values from adults to elderly is visible for most parameters.

Figure 9:
Post hoc estimated model parameters plotted against patient weight. Patients indicated by “+” and healthy volunteers by “•”. Differences between patients and healthy volunteers are clearly visible.
Figure 10:
Post hoc estimated model parameters plotted against age. Patients indicated by “+” and healthy volunteers by “•”.


Table 5 shows the estimation of intraoperative predictive performance of the 5 best models for each subgroup. The final population model performs better than all specialized models with the exception of the Paedfusor model in children; however, the performance is similar in this subgroup. Table 6 shows the predictive performance of the final model for the complete dataset without exclusions. The final model performs better than all the specialized models. On the complete dataset, the final model achieved MDPE of −1.4% and MDAPE of 21.5%. Overall, performance is the best for young children and children, while the poorest performance was in the high-BMI group. We see that MDAPE in general follows the same pattern as the predictive performance metric, with the Short model in children and young children being a counterexample.

Table 5:
Estimates of Intraoperative Predictive Performance for the Best 5 Population Models (Sorted by Metric) for Each of the Subgroups
Table 6:
Predictive Performance on the Complete Dataset for the Best 5 Population Models (Sorted by Metric) for Each of the Subgroups


The final model consists of a 3-compartmental model of volumes and clearance. Parameters varied with individual weight, age, and sex. Allometric scaling was shown to be beneficial. We also found an influence of the study design as patients during clinical anesthesia behave differently than healthy volunteers.

Patients Versus Healthy Volunteers

The observed difference between patients and healthy volunteers is probably related to concomitant medication, although we cannot exclude the possibility that it is related to the general health of the individual or the anesthetic technique applied. Patients are much more likely to receive premedication, muscle relaxants and analgesic drugs, in particular opioids. These may directly or indirectly influence the PK of propofol. For example, midazolam has been shown to reduce CL, intercompartmental clearances, and peripheral volume of distribution of propofol.26,27 Propofol concentrations have been found to be higher in the presence of fentanyl28 and alfentanil,29 also suggesting reduced clearance. We found decreased CL, Q2, Q3, and V3 in patients compared with healthy volunteers, so it is possible that these differences could be due to opioid use. Unfortunately, the concomitant medications administered are not sufficiently documented in the original studies for straightforward inclusion in the model-building process. Therefore, the current model relies on the patient versus healthy volunteer distinction as a proxy for this information, and we cannot identify what mechanism underlies the observed differences.

Because the patient versus healthy volunteer distinction may be a proxy for information about concomitant medication, we suggest that the PK parameters for patients should be used in nearly all clinical situations. Only in the case of good general health and an absence of concomitant medication should the PK parameters for healthy volunteers be considered.

Scaling of V1 and Initial Loading Dosages

The scaling of V1 for different-sized individuals has important implications for anesthesia. Clinicians usually administer an initial propofol bolus dosage whose size is based on the weight of the patient, aiming to achieve an optimal balance between rapid induction and adverse effects. This practice is based on the assumption that V1 is some function of body weight. For TCI systems, the initial loading dosage is calculated as the product of V1 and the initial plasma target concentration. However, no consensus has yet emerged on how to scale V1 (and thus initial loading dosage) with respect to body size. The Schnider model does not scale V1 according to body weight; it ascribes the same value for V1 for individuals of all sizes. This approach does not extrapolate to small-sized individuals, because it suggests the same initial loading dosage in children as adults, a practice at odds with clinical experience. In contrast, in the Marsh model, V1 is a linear function of WGT, which can lead to excessively large initial loading dosages in the obese.30 Our analysis used the original data from both of these studies and found that V1 does indeed appear independent of weight for weights above about 30 kg, and is lower for smaller individuals. This is clearly visible in Figure 9. One reason this characteristic may have been obscured in previous patient studies is the considerably higher interindividual variability in V1 in patients compared with volunteers. This is shown in Table 4 and clearly visible in Figure 9 and Figure 10.
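The effect of the V1 scaling choice on the loading dosage can be illustrated with a short calculation. This is a sketch only: the V1 values below are those commonly quoted for the published Marsh (0.228 L/kg) and Schnider (4.27 L) models and are not taken from this article, and the 150-kg patient and 4 μg/mL target are hypothetical.

```python
MARSH_V1_L_PER_KG = 0.228   # commonly quoted value; linear in body weight
SCHNIDER_V1_L = 4.27        # commonly quoted value; independent of body weight


def loading_dose_mg(v1_litres, target_ug_ml):
    """Loading dose as V1 times the plasma target (ug/mL equals mg/L)."""
    return v1_litres * target_ug_ml


weight_kg = 150.0           # hypothetical obese adult
target = 4.0                # hypothetical initial plasma target (ug/mL)

marsh_dose = loading_dose_mg(MARSH_V1_L_PER_KG * weight_kg, target)
schnider_dose = loading_dose_mg(SCHNIDER_V1_L, target)
```

With these inputs the weight-proportional V1 yields a loading dose several times larger than the constant V1, which illustrates why the choice of scaling matters most at the extremes of body size.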

Figure 8 shows that the final model overpredicts the first 15 minutes of drug administration, for both arterial and venous samples. This suggests that initial drug distribution is influenced by front-end kinetics, as described in published front-end kinetic models.21,31 Incorporating such mechanisms into the model may allow further improvements in predictive performance. However, the model structure would be different than the 3-compartmental model in wide use in TCI, resulting in a less straightforward application to TCI. For the same reason, we did not consider more complex compartmental models or physiologically based models.

Influence of Aging

We found that all the model parameters, with the exception of V3, appear to decrease with increasing age (Fig. 10). Thus, loading dosages and maintenance dosage rates should be decreased for older individuals.

Influence of Sex

We found a smaller V1 in women than in men and, for the age range 22 to 69 years, a higher CL in women. For V1, a simple scaling factor was sufficient, but for CL, a more complex function must be applied to adjust for sex. Higher CL in women within this age range may play a role in more rapid emergence from propofol anesthesia.32 In the elderly, we found that CL was lower in women, as found by White et al.,17 whose study data are used in our study. This differs from the finding of Vuyk et al.,33 who showed higher CL for elderly women compared with elderly men.

Allometric Scaling

Allometric scaling predicts the value of individual patient parameters based on an individual’s size. Concepts from allometric scaling as described by Anderson and Holford23 were used in our model. With some arguably minor exceptions, we found that the theoretical scaling exponents of 1 for volumes and 0.75 for clearances performed well, as found in other studies.11 The exceptions were that V1 reaches a plateau at around 30 kg, and for V3, we found an exponent of approximately 0.35, a smaller exponent than expected from allometric theory. One hypothesis is that V1 is the rapidly circulating portion of blood volume and thus may be more strongly determined by hemodynamic variables such as heart rate, mean arterial blood pressure, and cardiac output than simply by total body weight. This hypothesis is supported by the demonstration by Vuyk of a relationship between V1 and mean arterial blood pressure.27 V3 is commonly believed to represent total body fat mass; however, we found that V3 grows more slowly with increasing weight than might be expected for fat mass. There is evidence that the volume of distribution of propofol is actually much larger, up to 1390 to 3940 L in adults,34 a finding from a study that used blood sampling for 5 days after drug administration. It is possible that our estimation of V3 is limited by the much shorter sampling period used in the analyzed data. Another possibility is that the anesthetic technique preferred in the obese may be associated with lower estimates of V3.
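
As a minimal sketch of what these exponents imply, the following scales a parameter from a reference individual using a power function of weight. The 70-kg reference weight and the reference parameter values are assumptions chosen for illustration (they are not estimates from this study); only the exponents 1, 0.75, and approximately 0.35 are taken from the text above.

```python
# Illustrative sketch of allometric scaling by body weight.
# ASSUMPTIONS: 70-kg reference individual; reference parameter values
# below are placeholders, not estimates from this study.

def allometric(reference_value: float, weight_kg: float, exponent: float,
               reference_weight_kg: float = 70.0) -> float:
    """Scale a reference parameter value to an individual of given weight."""
    return reference_value * (weight_kg / reference_weight_kg) ** exponent


def scale_volume(reference_volume_l: float, weight_kg: float) -> float:
    """Theoretical allometric exponent of 1 for volumes."""
    return allometric(reference_volume_l, weight_kg, 1.0)


def scale_clearance(reference_cl_l_min: float, weight_kg: float) -> float:
    """Theoretical allometric exponent of 0.75 for clearances."""
    return allometric(reference_cl_l_min, weight_kg, 0.75)


for weight in (35.0, 70.0, 140.0):
    print(
        f"{weight:5.0f} kg:"
        f" volume x{scale_volume(1.0, weight):.2f},"
        f" clearance x{scale_clearance(1.0, weight):.2f},"
        f" exponent 0.35 x{allometric(1.0, weight, 0.35):.2f}"
    )
```

With an exponent of 0.35, doubling weight raises the scaled value far less than the theoretical exponent of 1 would, which illustrates why the V3 estimate grows more slowly with weight than expected for a fat-mass compartment.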

The compartmental approach to allometric scaling is an attempt to untangle allometric scaling from the variability in body composition across a population. It presents the idea that an individual’s body composition is to some degree reflected in the sizes of V2 and V3 and that these volumes are the primary determinant of Q2 and Q3, not body size. This contrasts with the canonical approach where Q2 and Q3 are determined by a single shared body size descriptor and are not influenced by V2 and V3. The current study provides some evidence for compartmental allometric scaling based on its advantages for prediction and model fitting. Other studies may support or refute it as a general phenomenon.

Interparameter relationships are sometimes addressed in NONMEM by estimating covariances in η. These are observational associations without any underlying theory because η represents unexplained population variability. Compartmental allometry is part of the structural model and predicts a relationship between compartmental clearances and their corresponding volumes. The equations can be written so that these relationships are explicit. Consider compartmental allometric scaling where the scaling exponent of the volume is estimated, and an exponential age model is also estimated:

Substituting V into Q and simplifying leads to:

This suggests that V and Q are positively correlated because they are both influenced by η1, and the magnitude of the correlation depends on the variances of η1 and η2.
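
The equations referred to in this passage are not reproduced in this text. One way to write relationships of the kind described, under an assumed parameterization, is sketched below. The symbols θ_V, θ_Q, θ_exp, θ_age, and AGE_ref, the 70-kg reference weight, and the use of the 0.75 exponent on the volume ratio are all assumptions made for illustration and may differ from the published equations.

```latex
% ASSUMED parameterization; symbols are illustrative, not the published ones.
\begin{align}
V &= \theta_V \left(\frac{\mathrm{WGT}}{70}\right)^{\theta_{\exp}}
     e^{\theta_{\mathrm{age}}(\mathrm{AGE}-\mathrm{AGE}_{\mathrm{ref}})}\, e^{\eta_1} \\
Q &= \theta_Q \left(\frac{V}{\theta_V}\right)^{0.75} e^{\eta_2} \\
% Substituting V into Q:
Q &= \theta_Q \left(\frac{\mathrm{WGT}}{70}\right)^{0.75\,\theta_{\exp}}
     e^{0.75\,\theta_{\mathrm{age}}(\mathrm{AGE}-\mathrm{AGE}_{\mathrm{ref}})}\,
     e^{0.75\,\eta_1+\eta_2} \\
% If \eta_1 and \eta_2 are independent with variances \omega_1^2 and \omega_2^2:
\operatorname{corr}(\ln V,\ \ln Q) &=
     \frac{0.75\,\omega_1}{\sqrt{0.5625\,\omega_1^2+\omega_2^2}}
\end{align}
```

In this sketch, η1 appears in both V and Q, so the correlation on the logarithmic scale is positive and approaches 1 as the variance of η2 becomes small relative to that of η1.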

Predictive Performance for Different Subgroups

Estimated intraoperative predictive performance of the final model is better than or similar to that of specialized models, even for the subpopulations on which those models were derived. We found MDAPE lower than 20% for young children, children, adults, and the elderly. High-BMI individuals showed poorer performance but were still in the range considered clinically acceptable (MDPE <10–20% and MDAPE between 20% and 40%).35,36
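
For readers unfamiliar with these metrics, the sketch below computes them using the customary definitions for infusion-system performance (performance error as the percentage difference between measured and predicted concentration, MDPE as its median, and MDAPE as the median of its absolute value).22 The concentration values are hypothetical and serve only to show the calculation.

```python
# Illustrative sketch of prediction-error metrics (MDPE, MDAPE).
# ASSUMPTION: the example concentrations are made up for demonstration.
from statistics import median


def performance_errors(measured, predicted):
    """Performance error (%) for each measured/predicted concentration pair."""
    return [100.0 * (m - p) / p for m, p in zip(measured, predicted)]


def mdpe(measured, predicted):
    """Median performance error (%): a measure of bias."""
    return median(performance_errors(measured, predicted))


def mdape(measured, predicted):
    """Median absolute performance error (%): a measure of inaccuracy."""
    return median(abs(pe) for pe in performance_errors(measured, predicted))


measured = [2.0, 3.0, 4.5]   # ug/mL, hypothetical samples
predicted = [2.5, 3.0, 3.0]  # ug/mL, hypothetical model predictions
print(f"MDPE = {mdpe(measured, predicted):.1f}%")
print(f"MDAPE = {mdape(measured, predicted):.1f}%")
```

A negative MDPE indicates that measured concentrations tend to fall below predictions (the model overpredicts), and a positive MDPE indicates the opposite.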

High-BMI individuals seem to exhibit greater interindividual variability, probably due to the considerable variability in body composition in the group and a paucity of information about body composition in the covariates available. Propofol PKs are likely to be different in individuals with high body mass as a result of obesity (excess adipose tissue) as opposed to very muscular individuals. Nonetheless, most studies of propofol PK treat obesity as a body size issue rather than a body composition issue.37 Our model also does this. It seems that BMI does not provide sufficient information about body composition to improve the model. For predictive performance in individuals with BMI >30 to approach that in individuals with BMI <30, a different approach is needed, either in the model structure or in the covariates collected.

Application to TCI

One of the potential applications of our model is intraoperative TCI. Figure 11 illustrates the predicted cumulative dosing required to achieve a 3 μg/mL plasma concentration of propofol in representative (patient) individuals. Cumulative dosages are similar to those required by the 5 best performing PK models from the relevant subgroups from Table 5, and thus, our model is likely to perform well clinically for TCI applications. Clinicians familiar with TCI with some other models in adult, elderly, and high-BMI individuals may require slightly higher target concentrations due to low bias in our model and positive bias in some other models. Differences in bias between models can be seen in Table 5 and Figure 8.

Figure 11: Predicted cumulative dosage required to achieve 3 μg/mL plasma concentration of propofol in representatives of the subgroups considered.

Limitations of the Study

Patients in the age range of about 12 to 20 years are underrepresented in the data, so model performance in this range may be poorer than expected. All the young children in the study come from a single study, and our results may be biased in that range by the anesthetic technique used. Concomitant medications, especially opioids, are likely responsible for the differences observed between patients and healthy volunteers, but the existing documentation is insufficient to define the model more specifically. Also, the model does not describe front-end kinetics and, as expected, performs poorly in the first few minutes of drug administration. Interoccasion variability was not addressed in model development. We used PE-based calculations to guide model building, as they are widely used in this field for performance evaluations. However, shortcomings of these error calculations have been noted.19 Similarly, the thresholds for “good” and “poor” prediction error used in the predictive performance metric are arbitrary. If a different performance metric is used, then a different model might be considered optimal. We did not perform prospective evaluation of model performance in the current study, and this should be done to fully evaluate the model.


In conclusion, we developed a multicompartmental PK model for propofol with acceptable performance accuracy that might be useful in a wide range of patients, differing in age, sex, and weight. More specifically, our model is applicable to pediatric, adult, and elderly patients, both lean and obese. Caution is always required when using this or any model in patients at the physiological extremes. Prospective validation of our model in clinical practice will be required to prove its safety and applicability.


Name: Douglas J. Eleveld, PhD.

Contribution: This author helped design and conduct the study, analyze the data, and write the manuscript.

Attestation: Douglas J. Eleveld has seen the original study data, reviewed the analysis of the data, approved the final manuscript, and is the author responsible for archiving the study files.

Name: Johannes H. Proost, PhD.

Contribution: This author helped design and conduct the study, analyze the data, and write the manuscript.

Attestation: Johannes H. Proost has seen the original study data, reviewed the analysis of the data, and approved the final manuscript.

Name: Luis I. Cortínez, MD.

Contribution: This author helped design the study, analyze the data, and write the manuscript.

Attestation: Luis I. Cortínez has seen the original study data, reviewed the analysis of the data, and approved the final manuscript.

Name: Anthony R. Absalom, MD.

Contribution: This author helped design the study, analyze the data, and write the manuscript.

Attestation: Anthony R. Absalom has seen the original study data, reviewed the analysis of the data, and approved the final manuscript.

Name: Michel M. R. F. Struys, MD.

Contribution: This author helped design the study, analyze the data, and write the manuscript.

Attestation: Michel M. R. F. Struys has seen the original study data, reviewed the analysis of the data, and approved the final manuscript.

This manuscript was handled by: Steven L. Shafer, MD.


This work would not have been possible without those behind the Open TCI Initiative Web site, especially Drs Steven L. Shafer, Charles F. Minto, and Thomas W. Schnider, and those who contributed the numerous datasets. Dr. Martin White and Dr. John (Iain) Glen also contributed datasets outside the Open TCI database. The approach of open sharing of data without conditions by all contributors made it possible for us to focus our efforts on testing model structures and hypotheses and to build on the considerable work of those who collected the datasets. We extend deep appreciation to all the researchers, clinicians, patients, healthy volunteers, and others who directly and indirectly contributed.


1. Struys MM, Sahinovic M, Lichtenbelt BJ, Vereecke HE, Absalom AR. Optimizing intravenous drug administration by applying pharmacokinetic/pharmacodynamic concepts. Br J Anaesth. 2011;107:38–47
2. Coppens MJ, Eleveld DJ, Proost JH, Marks LA, Van Bocxlaer JF, Vereecke H, Absalom AR, Struys MM. An evaluation of using population pharmacokinetic models to estimate pharmacodynamic parameters for propofol and bispectral index in children. Anesthesiology. 2011;115:83–93
3. Marsh B, White M, Morton N, Kenny GN. Pharmacokinetic model driven infusion of propofol in children. Br J Anaesth. 1991;67:41–8
4. Vuyk J, Schnider T, Engbers F. Population pharmacokinetics of propofol for target-controlled infusion (TCI) in the elderly. Anesthesiology. 2000;93:1557–60
5. Absalom AR, Mani V, De Smet T, Struys MM. Pharmacokinetic models for propofol–defining and illuminating the devil in the detail. Br J Anaesth. 2009;103:26–37
6. Han PY, Kirkpatrick CM, Green B. Informative study designs to identify true parameter-covariate relationships. J Pharmacokinet Pharmacodyn. 2009;36:147–63
7. Schüttler J, Ihmsen H. Population pharmacokinetics of propofol: a multicenter study. Anesthesiology. 2000;92:727–38
8. Wang C, Peeters MY, Allegaert K, Blussé van Oud-Alblas HJ, Krekels EH, Tibboel D, Danhof M, Knibbe CA. A bodyweight-dependent allometric exponent for scaling clearance across the human life-span. Pharm Res. 2012;29:1570–81
9. Cortínez LI, Anderson BJ, Penna A, Olivares L, Muñoz HR, Holford NH, Struys MM, Sepulveda P. Influence of obesity on propofol pharmacokinetics: derivation of a pharmacokinetic model. Br J Anaesth. 2010;105:448–56
10. van Kralingen S, Diepstraten J, Peeters MY, Deneer VH, van Ramshorst B, Wiezer RJ, van Dongen EP, Danhof M, Knibbe CA. Population pharmacokinetics and pharmacodynamics of propofol in morbidly obese patients. Clin Pharmacokinet. 2011;50:739–50
11. Knibbe CA, Zuideveld KP, Aarts LP, Kuks PF, Danhof M. Allometric relationships between the pharmacokinetics of propofol in rats, children and adults. Br J Clin Pharmacol. 2005;59:705–11
12. Anderson BJ, Holford NH. Mechanism-based concepts of size and maturity in pharmacokinetics. Annu Rev Pharmacol Toxicol. 2008;48:303–32
13. Coetzee JF. Allometric or lean body mass scaling of propofol pharmacokinetics: towards simplifying parameter sets for target-controlled infusions. Clin Pharmacokinet. 2012;51:137–45
14. Sepúlveda P, Cortínez LI, Sáez C, Penna A, Solari S, Guerra I, Absalom AR. Performance evaluation of paediatric propofol pharmacokinetic models in healthy young children. Br J Anaesth. 2011;107:593–600
15. Servin F, Cockshott ID, Farinotti R, Haberer JP, Winckler C, Desmonts JM. Pharmacokinetics of propofol infusions in patients with cirrhosis. Br J Anaesth. 1990;65:177–83
16. Swinhoe CF, Peacock JE, Glen JB, Reilly CS. Evaluation of the predictive performance of a ‘Diprifusor’ TCI system. Anaesthesia. 1998;53(Suppl 1):61–7
17. White M, Kenny GN, Schraag S. Use of target controlled infusion to derive age and gender covariates for propofol clearance. Clin Pharmacokinet. 2008;47:119–27
18. Karlsson MO, Jonsson EN, Wiltse CG, Wade JR. Assumption testing in population pharmacokinetic models: illustrated with an analysis of moxonidine data from congestive heart failure patients. J Pharmacokinet Biopharm. 1998;26:207–46
19. Masui K, Upton RN, Doufas AG, Coetzee JF, Kazama T, Mortier EP, Struys MM. The performance of compartmental and physiologically based recirculatory pharmacokinetic models for propofol: a comparison using bolus, continuous, and target-controlled infusion data. Anesth Analg. 2010;111:368–79
20. Struys MM, Coppens MJ, De Neve N, Mortier EP, Doufas AG, Van Bocxlaer JF, Shafer SL. Influence of administration rate on propofol plasma-effect site equilibration. Anesthesiology. 2007;107:386–96
21. Masui K, Kira M, Kazama T, Hagihira S, Mortier EP, Struys MM. Early phase pharmacokinetics but not pharmacodynamics are influenced by propofol infusion rate. Anesthesiology. 2009;111:805–17
22. Varvel JR, Donoho DL, Shafer SL. Measuring the predictive performance of computer-controlled infusion pumps. J Pharmacokinet Biopharm. 1992;20:63–94
23. Anderson BJ, Holford NH. Mechanism-based concepts of size and maturity in pharmacokinetics. Annu Rev Pharmacol Toxicol. 2008;48:303–32
24. Anderson BJ, Allegaert K, Holford NH. Population clinical pharmacology of children: modelling covariate effects. Eur J Pediatr. 2006;165:819–29
25. Peeters MY, Allegaert K, Blussé van Oud-Alblas HJ, Cella M, Tibboel D, Danhof M, Knibbe CA. Prediction of propofol clearance in children from an allometric model developed in rats, children and adults versus a 0.75 fixed-exponent allometric model. Clin Pharmacokinet. 2010;49:269–75
26. Mertens MJ, Olofsen E, Burm AG, Bovill JG, Vuyk J. Mixed-effects modeling of the influence of alfentanil on propofol pharmacokinetics. Anesthesiology. 2004;100:795–805
27. Vuyk J, Lichtenbelt BJ, Olofsen E, van Kleef JW, Dahan A. Mixed-effects modeling of the influence of midazolam on propofol pharmacokinetics. Anesth Analg. 2009;108:1522–30
28. Cockshott ID, Briggs LP, Douglas EJ, White M. Pharmacokinetics of propofol in female patients. Studies using single bolus injections. Br J Anaesth. 1987;59:1103–10
29. Pavlin DJ, Coda B, Shen DD, Tschanz J, Nguyen Q, Schaffer R, Donaldson G, Jacobson RC, Chapman CR. Effects of combining propofol and alfentanil on ventilation, analgesia, sedation, and emesis in human volunteers. Anesthesiology. 1996;84:23–37
30. Absalom AR, Mani V, De Smet T, Struys MM. Pharmacokinetic models for propofol–defining and illuminating the devil in the detail. Br J Anaesth. 2009;103:26–37
31. Upton RN, Ludbrook G. A physiologically based, recirculatory model of the kinetics and dynamics of propofol in man. Anesthesiology. 2005;103:344–52
32. Hoymork SC, Raeder J, Grimsmo B, Steen PA. Bispectral index, serum drug concentrations and emergence associated with individually adjusted target-controlled infusions of remifentanil and propofol for laparoscopic surgery. Br J Anaesth. 2003;91:773–80
33. Vuyk J, Oostwouder CJ, Vletter AA, Burm AG, Bovill JG. Gender differences in the pharmacokinetics of propofol in elderly patients during and after continuous infusion. Br J Anaesth. 2001;86:183–8
34. Morgan DJ, Campbell GA, Crankshaw DP. Pharmacokinetics of propofol when given by intravenous infusion. Br J Clin Pharmacol. 1990;30:144–8
35. Schüttler J, Kloos S, Schwilden H, Stoeckel H. Total intravenous anaesthesia with propofol and alfentanil by computer-assisted infusion. Anaesthesia. 1988;43(Suppl):2–7
36. Glass PS, Shafer S, Reves JG. Intravenous drug delivery systems. In: Miller RD, ed. Miller’s Anesthesia. Philadelphia, PA: Elsevier (Churchill Livingstone), 2005:439–80
37. Eleveld DJ, Proost JH, Absalom AR, Struys MM. Obesity and allometric scaling of pharmacokinetics. Clin Pharmacokinet. 2011;50:751–3
38. Bailey JM, Mora CT, Shafer SL. Pharmacokinetics of propofol in adult patients undergoing coronary revascularization. The Multicenter Study of Perioperative Ischemia Research Group. Anesthesiology. 1996;84:1288–97
39. Doufas AG, Bakhshandeh M, Bjorksten AR, Shafer SL, Sessler DI. Induction speed is not a determinant of propofol pharmacodynamics. Anesthesiology. 2004;101:1112–21
40. Doufas AG, Morioka N, Mahgoub AN, Bjorksten AR, Shafer SL, Sessler DI. Automated responsiveness monitor to titrate propofol sedation. Anesth Analg. 2009;109:778–86
41. Coetzee JF, Glen JB, Wium CA, Boshoff L. Pharmacokinetic model selection for target controlled infusions of propofol. Assessment of three parameter sets. Anesthesiology. 1995;82:1328–45
42. Gepts E, Camu F, Cockshott ID, Douglas EJ. Disposition of propofol administered as constant rate intravenous infusions in humans. Anesth Analg. 1987;66:1256–63
43. Kataria BK, Ved SA, Nicodemus HF, Hoy GR, Lea D, Dubois MY, Mandema JW, Shafer SL. The pharmacokinetics of propofol in children using three different data analysis approaches. Anesthesiology. 1994;80:104–22
44. Schnider TW, Minto CF, Gambus PL, Andresen C, Goodale DB, Shafer SL, Youngs EJ. The influence of method of administration and covariates on the pharmacokinetics of propofol in adult volunteers. Anesthesiology. 1998;88:1170–82
45. Servin F, Farinotti R, Haberer JP, Desmonts JM. Propofol infusion for maintenance of anesthesia in morbidly obese patients receiving nitrous oxide. A clinical and pharmacokinetic study. Anesthesiology. 1993;78:657–65
46. Servin FS, Bougeois B, Gomeni R, Mentré F, Farinotti R, Desmonts JM. Pharmacokinetics of propofol administered by target-controlled infusion to alcoholic patients. Anesthesiology. 2003;99:576–85
47. Struys MM, Coppens MJ, De Neve N, Mortier EP, Doufas AG, Van Bocxlaer JF, Shafer SL. Influence of administration rate on propofol plasma-effect site equilibration. Anesthesiology. 2007;107:386–96
48. Servin F, Cockshott ID, Farinotti R, Haberer JP, Winckler C, Desmonts JM. Pharmacokinetics of propofol infusions in patients with cirrhosis. Br J Anaesth. 1990;65:177–83
49. Swinhoe CF, Peacock JE, Glen JB, Reilly CS. Evaluation of the predictive performance of a ‘Diprifusor’ TCI system. Anaesthesia. 1998;53(Suppl 1):61–7
50. White M, Kenny GN, Schraag S. Use of target controlled infusion to derive age and gender covariates for propofol clearance. Clin Pharmacokinet. 2008;47:119–27
51. Wietasch JK, Scholz M, Zinserling J, Kiefer N, Frenkel C, Knüfermann P, Brauer U, Hoeft A. The performance of a target-controlled infusion of propofol in combination with remifentanil: a clinical investigation with 2 propofol formulations. Anesth Analg. 2006;102:430–7
52. Absalom A, Amutike D, Lal A, White M, Kenny GN. Accuracy of the ‘Paedfusor’ in children undergoing cardiac surgery or catheterization. Br J Anaesth. 2003;91:507–13
53. Short TG, Aun CS, Tan P, Wong J, Tam YH, Oh TE. A prospective evaluation of pharmacokinetic model controlled infusion of propofol in paediatric patients. Br J Anaesth. 1994;72:302–6
54. Rigby-Jones AE, Nolan JA, Priston MJ, Wright PM, Sneyd JR, Wolf AR. Pharmacokinetics of propofol infusions in critically ill neonates, infants, and children in an intensive care unit. Anesthesiology. 2002;97:1393–400
55. Rigby-Jones A, Priston M, Wolf A, Sneyd J. Paediatric propofol pharmacokinetics: A multicentre study. Paediatr Anaesth. 2007;17:610
56. Shangguan WN, Lian Q, Aarons L, Matthews I, Wang Z, Chen X, Freemantle N, Smith FG. Pharmacokinetics of a single bolus of propofol in Chinese children of different ages. Anesthesiology. 2006;104:27–32
57. Tackley RM, Lewis GT, Prys-Roberts C, Boaden RW, Dixon J, Harvey JT. Computer controlled infusion of propofol. Br J Anaesth. 1989;62:46–53


© 2014 International Anesthesia Research Society