Anesthetic management has improved over recent decades. However, emergence agitation (EA) is still a common complication in children after general anesthesia. Children emerging from anesthesia often do not recognize their environment and may be disoriented, extremely agitated, and violent. This behavior is called EA. In the early 1960s, Eckenhoff et al1 reported that patients anesthetized with cyclopropane or ketamine had a high incidence of EA. As halothane replaced cyclopropane or ketamine as an anesthetic for children, the incidence of EA decreased. Recently, sevoflurane and desflurane have become common inhaled anesthetics, and the issue of EA in pediatric patients has reemerged.2 EA is a major source of anxiety for medical caregivers because it increases the risk of self-injury and of accidental removal of surgical dressings, intravenous catheters, and drains, and it causes discomfort for parents. Thus, many studies on preventing EA have been conducted.3–6
The frequency of EA in children after general anesthesia has been reported to be 10%–50%.3,7–9 A variety of factors, including the patient,10–12 the surgical procedure,1,10,13,14 and the anesthesia technique,7,8,15,16 have been suggested to play a role in the development of EA. Although many patients have several concurrent risk factors, the combined impact of these factors on the development of EA has not been elucidated, and a preventive strategy therefore cannot be implemented. We considered that an EA risk stratification model could help us to devise preventive strategies for EA in order to improve the quality of recovery from anesthesia in children.
The goal of this 2-phased study was (1) to develop a predictive model (EA risk scale) for the incidence of EA in children receiving sevoflurane anesthesia by performing a retrospective analysis of data from our previous study (phase 1; development phase) and (2) to confirm the validity of the EA risk scale in a prospective cohort study (phase 2; validation phase).
The study was approved by the Institutional Review Board of Kanagawa Children’s Medical Center (Yokohama, Japan, approval numbers 1405-04 and 1405-05). The study protocol was registered with the University Hospital Medical Information Network Clinical Trial Registry (registration number UMIN000023190, principal investigator M. Hino, date of registration July 15, 2016). For phase 1 (secondary use of patient data from our previous trial17), both written informed consent for the trial17 and agreement to the secondary use of data had been simultaneously obtained from guardians of all patients. For phase 2 (prospective cohort study), written informed consent was obtained from guardians of all patients. This manuscript adheres to the applicable EQUATOR guidelines.
Development Phase (Phase 1)
The purpose of phase 1 was to develop an EA risk scale using the data from our previous randomized controlled trial.17 A total of 120 children (1.5–8 years of age), who were scheduled to undergo general anesthesia, participated in the study. Patients included in the study were classified as American Society of Anesthesiologists physical classification I or II. Patients with mental retardation or those who were taking psychotropic drugs were excluded. Operative procedures included the following: inguinal hernia repair, adenoidectomy and/or tonsillectomy, unilateral strabismus surgery, cryptorchidism repair, tympanostomy tube insertion, and other minor surgeries. Patients received no premedication. Behavior at the time of induction was evaluated using a 4-point scale.18 All patients were induced with sevoflurane and nitrous oxide and maintained on sevoflurane. For pain management, patients received intravenous fentanyl, suppository acetaminophen, and nerve block, as appropriate. See Hijikata et al17 for details. Upon arrival at the postanesthesia care unit, an anesthesiologist or a trained nurse assessed each patient for EA using the Pediatric Anesthesia Emergence Delirium (PAED) scale.19
Pediatric Anesthesia Behavior (PAB) scores20 and PAED scores, which are not usually recorded in daily practice, could be easily obtained through the secondary use of the randomized controlled trial data. Thus, multivariate logistic regression analysis could be performed using these parameters considered to be highly associated with the incidence of EA. The incidence of EA was defined as a PAED score over 12.19 Logistic regression analysis was used to predict the incidence of EA using the following 10 predictors: age, height, weight, sex, PAB score, operative procedure, anesthesia time, method of stabilizing airway (intubation or laryngeal mask airway), presence or absence of nerve block, and the total dose of fentanyl. The presence or absence of meridian stimulation was included in the logistic model as a modulator. The 4-point behavior scale18 was converted to the PAB score (3-point scale) by merging scale 2 and 3 of the 4-point scale for each individual. Because previous studies have identified tonsillectomy and strabismus surgery as risk factors for the development of EA,1,3,10,21 operative procedures were divided into 3 groups: tonsillectomy, strabismus surgeries, and other surgeries. Akaike information criterion (AIC)22 stepwise selection was used to determine the optimal combination of the predictors for predicting EA. The β-coefficient (ie, logarithm of the odds ratio) of each predictor was calculated and the EA risk score was determined by summing the value of the predictors.
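For readers less familiar with these methods, the general form of the model and of the selection criterion can be sketched as follows. The notation is illustrative only and is not taken from the original analysis: $p$ denotes the probability of EA (PAED score over 12), $x_i$ the candidate predictors, $\beta_i$ the corresponding log odds ratios, $m$ the number of estimated parameters, and $\hat{L}$ the maximized likelihood of the fitted model.

```latex
\[
\operatorname{logit}(p) \;=\; \ln\frac{p}{1-p} \;=\; \beta_0 + \sum_{i=1}^{k} \beta_i x_i ,
\qquad
\mathrm{AIC} \;=\; 2m - 2\ln\hat{L}
\]
```

Under this formulation, stepwise selection adds or removes one predictor at a time and retains the change that yields the lowest AIC.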
A receiver operating characteristic (ROC) curve and the area under the curve (c-index)23 were used to assess the discrimination of the EA risk scale. The calibration was assessed using the Hosmer-Lemeshow goodness-of-fit test.
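As an illustration only, the c-index can be computed directly from its definition as the probability that a randomly chosen patient with EA has a higher risk score than a randomly chosen patient without EA. The Python sketch below is not the analysis code used in this study (the analyses were run in R); the function name `c_index` and the example data are hypothetical.

```python
def c_index(scores, outcomes):
    """Concordance index (area under the ROC curve).

    scores   : risk scores, one per patient
    outcomes : 1 if the patient developed EA, 0 otherwise
    Tied case/non-case pairs count as half-concordant.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical data: 7 of the 9 case/non-case pairs are concordant.
example_scores = [5, 9, 12, 14, 11, 18]
example_outcomes = [0, 1, 0, 1, 0, 1]
print(round(c_index(example_scores, example_outcomes), 3))
```

This pairwise definition is quadratic in the number of patients, which is acceptable for cohorts of the size described here.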
Validation Phase (Phase 2)
The purpose of phase 2 was to estimate the validity of the EA risk scale using a prospective observational design. From October 2015 to February 2016, 100 patients were enrolled in the study. The number of patients was determined by sample size calculation.
Inclusion and exclusion criteria and the anesthesia protocol were previously described.17 The incidence of EA was defined as a PAED score over 12. To assess the performance of the EA risk scale for EA prediction, an ROC curve was generated and the c-index calculated. The best cutoff point was determined, and the sensitivity and specificity at the point were calculated. The gray zone, where a prediction of EA was inconclusive, was also determined.24
All parametric data are expressed as mean ± standard deviation, and nonparametric data as median (interquartile range). Normality of data was tested using the Kolmogorov-Smirnov test. Categorical data are expressed as numbers and percentages.
In phase 1, univariable comparisons for 10 candidate predictors between patients with and without EA were performed. Continuous parametric data, nonparametric data, and categorical data were analyzed using unpaired t-tests, the Mann-Whitney U test, and Fisher exact test, respectively. All parameters were evaluated for multicollinearity before logistic regression analysis. Multicollinearity was examined by the Pearson correlation or Spearman correlation test. To avoid multicollinearity, if a correlation was detected (r > 0.7), only a single variable was included in the logistic regression model. For calculation of the correlation coefficient, operative procedures were grouped as higher risk of developing EA (tonsillectomy and strabismus surgery)1,3,10,21 and not higher risk. The optimal combination of predictors for predicting EA was determined from the final regression model. A forward/backward stepwise selection procedure was used to develop the final regression model. The model with the lowest value of AIC was selected at each step. To develop an easy-to-use scale, we converted the coefficient of each predictor in the final model to an integer. For this purpose, the coefficients were divided by the minimal coefficient and were rounded.
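The conversion of coefficients to integer points can be sketched as follows. This is a minimal Python illustration, not the code used in the study. It assumes that the "minimal coefficient" means the coefficient with the smallest absolute value and that signs are preserved; the function name and the coefficient values are hypothetical and are not those reported in Table 2.

```python
def to_integer_points(coefficients):
    """Divide each coefficient by the smallest absolute coefficient and round.

    coefficients : dict mapping predictor name -> beta-coefficient
    Assumption: "minimal" is taken by absolute value and signs are kept.
    Note: Python's round() rounds exact halves to the nearest even integer;
    the source does not state how ties were handled.
    """
    smallest = min(abs(b) for b in coefficients.values())
    if smallest == 0:
        raise ValueError("cannot scale by a zero coefficient")
    return {name: round(b / smallest) for name, b in coefficients.items()}


# Hypothetical coefficients chosen for illustration only.
hypothetical = {"predictor_a": 0.25, "predictor_b": 0.5, "predictor_c": 1.75}
print(to_integer_points(hypothetical))
```

Scaling by the smallest coefficient gives that predictor a weight of 1 and expresses every other predictor as a multiple of it, which keeps the scale easy to sum at the bedside.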
In phase 2, we used the methods of Obuchowski and McClish25 and Obuchowski et al26 for sample size calculation. Assuming a 25% incidence of EA, an α-error of .05, and a power of 0.9,25,26 a sample size of 75 patients was required. The lower limit of the 95% confidence interval (CI) of the c-index (ie, 0.74) obtained in phase 1 was used to calculate the sample size. To allow for a 25% dropout rate, we recruited 100 patients.
The 95% CI of the c-index was calculated using DeLong et al’s27 method. The predictive performance of the EA risk scale was considered statistically significant when the 95% CI of the c-index excluded the value of 0.5. The predictive performance of the EA risk scale was considered good or excellent, when the c-index was greater than 0.7 or 0.8, respectively. The best cutoff point was determined by the Youden index.
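The Youden index is J = sensitivity + specificity − 1, and the best cutoff is the one that maximizes J. The Python sketch below illustrates the idea under stated assumptions: a patient is classified as predicted EA when the score is greater than or equal to the cutoff (the source does not specify this convention), and the function name and data are hypothetical.

```python
def youden_best_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    Assumption: score >= cutoff is a positive prediction.
    If several cutoffs tie, the lowest one is returned.
    """
    n_pos = sum(1 for y in outcomes if y == 1)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")

    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1  # Youden index at this cutoff
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical data for illustration only.
cutoff, sens, spec = youden_best_cutoff([5, 9, 12, 14, 11, 18], [0, 1, 0, 1, 0, 1])
print(cutoff, round(sens, 2), round(spec, 2))
```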
The gray zone was defined as the range between the 2 cutoff points at which sensitivity and specificity each reached 90%.
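One way to locate the 2 boundary points is sketched below in Python. This is an illustration under assumptions, not the procedure used in the study: a score greater than or equal to the cutoff counts as a positive prediction, the lower bound is taken as the highest cutoff that still gives at least 90% sensitivity, and the upper bound as the lowest cutoff that gives at least 90% specificity. The function name and data are hypothetical.

```python
def gray_zone(scores, outcomes, target=0.90):
    """Return (lower, upper) cutoffs bounding the inconclusive range.

    lower : highest cutoff with sensitivity >= target
    upper : lowest cutoff with specificity >= target
    Returns None if either target is never reached.
    Assumption: score >= cutoff is a positive prediction.
    """
    n_pos = sum(1 for y in outcomes if y == 1)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")

    def sensitivity(c):
        return sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= c) / n_pos

    def specificity(c):
        return sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < c) / n_neg

    cutoffs = sorted(set(scores))
    sens_ok = [c for c in cutoffs if sensitivity(c) >= target]
    spec_ok = [c for c in cutoffs if specificity(c) >= target]
    if not sens_ok or not spec_ok:
        return None
    return max(sens_ok), min(spec_ok)


# Hypothetical data: 10 patients without EA and 10 with EA.
no_ea = [2, 3, 4, 5, 6, 7, 8, 10, 12, 14]
ea = [8, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(gray_zone(no_ea + ea, [0] * 10 + [1] * 10))
```

If the lower bound is not below the upper bound, a single cutoff achieves both targets and no inconclusive range exists.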
All statistical analyses were performed using R version 3.3.0 (R foundation for Statistical Computing, Vienna, Austria).
Development Phase (Phase 1)
The incidence of EA was 34.2% of all patients. The results of the univariable comparisons for 10 candidate predictors between patients with and without EA are shown in Table 1. Among the candidate predictors for predicting the incidence of EA, age, height, and weight were highly correlated. Age was selected as an independent variable to represent all 3 characteristics, considering multicollinearity. Similarly, operative procedure, presence or absence of nerve block, and the total dose of fentanyl were highly correlated. Operative procedure was selected as an independent variable.
In logistic regression analysis, the remaining 6 predictors (age, PAB score, operative procedure, anesthesia time, the method of stabilizing airway, sex) and one modulator (presence or absence of meridian stimulation) were used as independent variables, and the incidence of EA was used as a dependent variable. Using AIC stepwise selection, the 4 predictors (age, PAB score, anesthesia time, and operative procedure) were determined as the optimal combination for predicting EA (Table 2).
β-Coefficients were calculated for each selected predictor, and each predictor was assigned an integer score (Table 2).
These data were used to develop the EA risk scale (Table 3). The score of the age domain is calculated as 9 minus the patient’s age in years. For the operative procedure domain, the score for strabismus surgery and tonsillectomy is 7, and the score for other operative procedures is 0. For the PAB domain, the score for patient screaming or shouting at the time of induction is 4; tearful and/or withdrawn, but compliant with induction is 2; and calm and controlled is 0. For the anesthesia time domain, the score for anesthesia over 2 hours is 4; from 1 to 2 hours is 2; and less than 1 hour is 0. The EA risk score is obtained by summing these scores, and EA risk scores ranged from 1 to 23 points.
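The scoring rules described above can be expressed as a short calculation. The Python sketch below is illustrative only: the function name, argument names, and behavior labels are hypothetical, age is assumed to be counted in completed years (consistent with the stated range of 1 to 23 points), and an anesthesia time of exactly 1 or 2 hours is assumed to fall in the "from 1 to 2 hours" category.

```python
# Points for behavior at induction (PAB domain), using hypothetical labels.
PAB_POINTS = {
    "calm": 0,       # calm and controlled
    "tearful": 2,    # tearful and/or withdrawn, but compliant with induction
    "screaming": 4,  # screaming or shouting
}


def ea_risk_score(age_years, high_risk_surgery, pab_behavior, anesthesia_minutes):
    """Sum the 4 domains of the EA risk scale.

    age_years          : age; truncated to completed years (assumption)
    high_risk_surgery  : True for strabismus surgery or tonsillectomy
    pab_behavior       : one of the keys of PAB_POINTS
    anesthesia_minutes : anesthesia time in minutes
    """
    age_points = 9 - int(age_years)
    surgery_points = 7 if high_risk_surgery else 0
    pab_points = PAB_POINTS[pab_behavior]
    if anesthesia_minutes > 120:
        time_points = 4
    elif anesthesia_minutes >= 60:
        time_points = 2
    else:
        time_points = 0
    return age_points + surgery_points + pab_points + time_points


# Example: a 3-year-old, tonsillectomy, tearful at induction, 90 minutes.
print(ea_risk_score(3, True, "tearful", 90))  # 6 + 7 + 2 + 2 = 17
```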
The predictive ability of the EA risk scale was examined by generating an ROC curve (Figure; green line). The c-index was 0.84 (95% CI, 0.74–0.94). The Hosmer-Lemeshow goodness-of-fit test was nonsignificant (P = .97), indicating that the model exhibited a good fit.
Validation Phase (Phase 2)
One hundred patients were enrolled in this prospective observational study. All patients completed all procedures. Demographic characteristics of the patients and operative procedures in the validation phase are shown in Tables 4 and 5, respectively.
The incidence of EA was 39%. Figure shows the ROC curve assessing the performance of the EA risk scale for EA prediction in phase 1 (development cohort; green) and phase 2 (validation cohort; red). The c-index for the validation phase was 0.81 (95% CI, 0.72–0.89). The best cutoff point was 11 (sensitivity = 87%, and specificity = 61%). The gray zone of the validation phase ranged from 10 to 13 points, and included 38% of patients.
The c-index for predicting EA using the EA risk scale was similar in both phases, and the 95% CI overlapped (0.74–0.94 in the development phase and 0.72–0.89 in the validation phase, Figure).
Our study developed and validated the EA risk scale consisting of 4 domains: age, PAB score, operative procedure, and anesthesia time. In the validation cohort, the EA risk scale showed an excellent predictive performance (c-index > 0.8).
The EA risk scale was developed by weighting 4 risk factors in the final regression model. Although previous studies reported EA risk factors,3,7,13,28 no tool has been developed for predicting the risk of EA in patients who have several risk factors. For example, although the PAB scores and operative procedures have been reported as EA risk factors, the risk of EA between patients who had the same PAB score, but had different surgery, cannot be estimated. Our study identified independent risk factors for EA and developed the EA risk scale with weighted predictors, providing the tool to assess EA risk quantitatively for each patient with several risk factors.
The EA risk scale showed high accuracy in 2 respects. First, its c-index exceeded 0.8. Second, over 60% of all the patients were outside of the gray zone. For the patients outside of the gray zone, the sensitivity or specificity of predicting EA exceeded 90%. The EA risk scale can be used to devise strategies for preventing EA. For example, a prophylactic agent such as fentanyl or propofol could be administered at the end of surgery to prevent EA in high-risk patients. Although these prophylactic agents may increase the risk of prolonged recovery or respiratory depression, the benefit of preventing EA might balance the risk of side effects. The best cutoff point was determined to maximize total predictive accuracy (ie, sensitivity plus specificity). However, the lower point of the gray zone could be used as a cutoff point to minimize the number of false-negative results, which may be more suitable for clinical settings. In addition, the EA risk scale could contribute to future clinical studies. The EA risk scale can be used as a stratification factor for cohort studies or as an inclusion or exclusion criterion for clinical trials.
Consistent with the previous studies, age, preoperative behavior, and operative procedure were included in the EA risk scale.3,10,11,28 Although the modified Yale Preoperative Anxiety Scale is used to evaluate children’s preoperative anxiety in trial settings, it would not be feasible in busy operating room settings because it requires evaluation of 5 items at 4 different preoperative time points. Therefore, the PAB scale was used to assess the patients’ preoperative behavior because it can be easily used in daily clinical practice.20
Anesthesia time has not previously been reported as a risk factor for EA. In regard to strabismus surgery, EA risk was reported to be higher in complicated procedures than in simpler procedures.21 In our study, anesthesia time was selected as an independent predictor, possibly because it reflects the difficulty and complexity of the procedures.
The EA risk scale, or the prediction model, did not include a factor that directly predicted postoperative pain. Because pain is known to be a causative factor of EA, factors predicting postoperative pain should be included in the model. The candidate predictors of postoperative pain in our data were operative procedure, presence or absence of a nerve block, and total dose of fentanyl. However, these 3 factors were highly correlated with each other, and thus, presence or absence of a nerve block and dose of fentanyl were excluded from the initial model to avoid multicollinearity. Consequently, only operative procedure was included in our model. If other appropriate factors predicting postoperative pain are available, these variables should be included for the development of a better predictive model.
This study has several strengths. First, because data were collected prospectively in both phases, PAED scores and PAB scores are complete data sets. Second, the EA risk scale was validated on a different external cohort. Validation of the predictive scale using an external cohort is necessary before using the scale in clinical practice.29–31
This study has several limitations. First, this was a single-center study. Although the validity of the EA risk scale was confirmed in a separate cohort, only the “internal validity” of the scale was confirmed, because the cohorts used in the development and validation phases were from the same hospital. The “external validity,” or the validity of the EA risk scale for patients in another hospital, should be confirmed. Moreover, the score for anesthesia time might be different for each hospital because operative time varies between hospitals, which is one of the reasons for the need to examine external validity. Second, all patients were anesthetized with sevoflurane; therefore, the EA risk scale cannot be applied to patients anesthetized with other anesthesia techniques such as total intravenous anesthesia with propofol. However, the risk of EA with propofol is low compared with sevoflurane anesthesia.3,7,28,32 In addition, attention should be paid to our inclusion criteria, ie, patients with American Society of Anesthesiologists physical classification I or II, patients without mental retardation, patients undergoing minor surgery, and patients without premedication. The EA risk scale can be applied in these specific populations, and thus, further studies are needed to confirm the results in other populations. Third, the cause of EA could not be determined. The negative behavior of children emerging from anesthesia, such as disorientation, tantrums, or agitation, is referred to as EA. The major causes of EA are postoperative delirium (ie, acute confusional state) and pain, but the PAED scale we used could not distinguish between them. Therefore, the EA risk scale can be used only for predicting agitation; it cannot predict delirium. Fourth, the sample size of our study was small. In phase 1, a total of 41 events were observed among the 120 patients, whereas the initial logistic regression model included 7 variables. Thus, events per variable were 5.9 (41/7) in the initial logistic regression model. A simulation study33 revealed that stepwise selection might induce a substantial bias when events per variable are low (<10–20). Therefore, the coefficients of the model, which were used to determine the score of each predictor, could be biased. Our results should be confirmed by research using larger sample sizes. Fifth, our study included no time component of EA. Because we used data obtained from our previous study, we did not have information regarding the duration of agitation. Thus, this variable could not be analyzed. Severity and duration of agitation may be important when deciding whether to treat EA. Therefore, further studies to predict duration of agitation are needed.
In summary, we developed and validated the EA risk scale. This scale can predict EA after general anesthesia in children with over 90% accuracy in approximately 60% of the patients. The EA risk scale could be used to predict EA in children and to adopt a preventive strategy for those at high risk. This score-based preventive approach should be studied prospectively to assess the safety and efficacy of such a strategy.
Name: Maai Hino, MD.
Contribution: This author helped conduct the study and write the manuscript.
Name: Takahiro Mihara, MD, PhD.
Contribution: This author helped design and conduct the study, analyze the data, and write the manuscript.
Name: Saeko Miyazaki, MD.
Contribution: This author helped conduct the study.
Name: Toshiyuki Hijikata, MD.
Contribution: This author helped conduct the study and write the manuscript.
Name: Takaaki Miwa, MD.
Contribution: This author helped write the manuscript.
Name: Takahisa Goto, MD.
Contribution: This author helped write the manuscript.
Name: Koui Ka, MD.
Contribution: This author helped write the manuscript.
This manuscript was handled by: James A. DiNardo, MD, FAAP.
1. Eckenhoff JE, Kneale DH, Dripps RD. The incidence and etiology of postanesthetic excitement. A clinical survey. Anesthesiology. 1961;22:667–673.
2. Holzki J, Kretz FJ. Changing aspects of sevoflurane in paediatric anaesthesia: 1975-99. Paediatr Anaesth. 1999;9:283–286.
3. Vlajkovic GP, Sindjelic RP. Emergence delirium in children: many questions, few answers. Anesth Analg. 2007;104:84–91.
4. Dahmani S, Stany I, Brasher C, et al. Pharmacological prevention of sevoflurane- and desflurane-related emergence agitation in children: a meta-analysis of published studies. Br J Anaesth. 2010;104:216–223.
5. Zhang C, Li J, Zhao D, Wang Y. Prophylactic midazolam and clonidine for emergence from agitation in children after emergence from sevoflurane anesthesia: a meta-analysis. Clin Ther. 2013;35:1622–1631.
6. Kanaya A, Kuratani N, Satoh D, Kurosawa S. Lower incidence of emergence agitation in children after propofol anesthesia compared with sevoflurane: a meta-analysis of randomized controlled trials. J Anesth. 2014;28:4–11.
7. Uezono S, Goto T, Terui K, et al. Emergence agitation after sevoflurane versus propofol in pediatric patients. Anesth Analg. 2000;91:563–566.
8. Kuratani N, Oi Y. Greater incidence of emergence agitation in children after sevoflurane anesthesia as compared with halothane: a meta-analysis of randomized controlled trials. Anesthesiology. 2008;109:225–232.
9. Cravero J, Surgenor S, Whalen K. Emergence agitation in paediatric patients after sevoflurane anaesthesia and no surgery: a comparison with halothane. Paediatr Anaesth. 2000;10:419–424.
10. Voepel-Lewis T, Malviya S, Tait AR. A prospective cohort study of emergence agitation in the pediatric postanesthesia care unit. Anesth Analg. 2003;96:1625–1630.
11. Kain ZN, Caldwell-Andrews AA, Maranets I, et al. Preoperative anxiety and emergence delirium and postoperative maladaptive behaviors. Anesth Analg. 2004;99:1648–1654.
12. Saringcarinkul A, Manchupong S, Punjasawadwong Y. Incidence and risk factors of emergence agitation in pediatric patients after general anesthesia. J Med Assoc Thai. 2008;91:1226–1231.
13. Lynch EP, Lazor MA, Gellis JE, Orav J, Goldman L, Marcantonio ER. The impact of postoperative pain on the development of postoperative delirium. Anesth Analg. 1998;86:781–785.
14. Przybylo HJ, Martini DR, Mazurek AJ, Bracey E, Johnsen L, Coté CJ. Assessing behaviour in children emerging from anaesthesia: can we apply psychiatric diagnostic techniques? Paediatr Anaesth. 2003;13:609–616.
15. Cohen IT, Finkel JC, Hannallah RS, Hummer KA, Patel KM. Rapid emergence does not explain agitation following sevoflurane anaesthesia in infants and children: a comparison with propofol. Paediatr Anaesth. 2003;13:63–67.
16. Aouad MT, Kanazi GE, Siddik-Sayyid SM, Gerges FJ, Rizk LB, Baraka AS. Preoperative caudal block prevents emergence agitation in children following sevoflurane anesthesia. Acta Anaesthesiol Scand. 2005;49:300–304.
17. Hijikata T, Mihara T, Nakamura N, Miwa T, Ka K, Goto T. Electrical stimulation of the heart 7 acupuncture site for preventing emergence agitation in children: A randomised controlled trial. Eur J Anaesthesiol. 2016;33:535–542.
18. Köner O, Türe H, Mercan A, Menda F, Sözübir S. Effects of hydroxyzine-midazolam premedication on sevoflurane-induced paediatric emergence agitation: a prospective randomised clinical trial. Eur J Anaesthesiol. 2011;28:640–645.
19. Bajwa SA, Costi D, Cyna AM. A comparison of emergence delirium scales following general anesthesia in children. Paediatr Anaesth. 2010;20:704–711.
20. Beringer RM, Greenwood R, Kilpatrick N. Development and validation of the Pediatric Anesthesia Behavior score – an objective measure of behavior during induction of anesthesia. Paediatr Anaesth. 2014;24:196–200.
21. Joo J, Lee S, Lee Y. Emergence delirium is related to the invasiveness of strabismus surgery in preschool-age children. J Int Med Res. 2014;42:1311–1322.
22. Akaike H. A new look at the statistical model identification. Autom Control IEEE Trans. 1974;19:716–723.
23. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361–387.
24. Coste J, Pouchot J. A grey zone for quantitative diagnostic and screening tests. Int J Epidemiol. 2003;32:304–313.
25. Obuchowski NA, McClish DK. Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices. Stat Med. 1997;16:1529–1542.
26. Obuchowski NA, Lieber ML, Wians FH Jr. ROC curves in clinical chemistry: uses, misuses, and possible solutions. Clin Chem. 2004;50:1118–1125.
27. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
28. Dahmani S, Mantz J, Veyckemans F. Case scenario: severe emergence agitation after myringotomy in a 3-yr-old child. Anesthesiology. 2012;117:399–406.
29. Altman DG, Royston P. What do we mean by validating a prognostic model? Stat Med. 2000;19:453–473.
30. Collins GS, de Groot JA, Dutton S, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol. 2014;14:40.
31. Poole D, Carlisle JB. Mirror, mirror on the wall…predictions in anaesthesia and critical care. Anaesthesia. 2016;71:1104–1109.
32. Chandler JR, Myers D, Mehta D, et al. Emergence delirium in children: a randomized trial to compare total intravenous anesthesia with propofol and remifentanil to inhalational sevoflurane anesthesia. Paediatr Anaesth. 2013;23:309–315.
33. Steyerberg EW, Eijkemans MJ, Habbema JD. Stepwise selection in small data sets: a simulation study of bias in logistic regression analysis. J Clin Epidemiol. 1999;52:935–942.
© 2017 International Anesthesia Research Society