June 2014 - Volume 120 - Issue 6
doi: 10.1097/ALN.0000000000000248
Perioperative Medicine: Clinical Science

Comparative Analysis of Outcome Measures Used in Examining Neurodevelopmental Effects of Early Childhood Anesthesia Exposure

Ing, Caleb H. M.D., M.S.; DiMaggio, Charles J. Ph.D., M.P.H., P.A.-C.; Malacova, Eva Ph.D.; Whitehouse, Andrew J. Ph.D.; Hegarty, Mary K. M.B.B.S., F.A.N.Z.C.A.; Feng, Tianshu M.S.; Brady, Joanne E. M.S.; von Ungern-Sternberg, Britta S. M.D., Ph.D.; Davidson, Andrew J. M.D.; Wall, Melanie M. Ph.D.; Wood, Alastair J. J. M.D.; Li, Guohua M.D., Dr.P.H.; Sun, Lena S. M.D.


Introduction: Immature animals exposed to anesthesia display apoptotic neurodegeneration and neurobehavioral deficits. The safety of anesthetic agents in children has been evaluated using a variety of neurodevelopmental outcome measures with varied results.
Methods: The authors used data from the Western Australian Pregnancy Cohort (Raine) Study to examine the association between exposure to anesthesia in children younger than 3 yr of age and three types of outcomes at age of 10 yr: neuropsychological testing, International Classification of Diseases, 9th Revision, Clinical Modification–coded clinical disorders, and academic achievement. The authors’ primary analysis was restricted to children with data for all outcomes and covariates from the total cohort of 2,868 children born from 1989 to 1992. The authors used a modified multivariable Poisson regression model to determine the adjusted association of anesthesia exposure with outcomes.
Results: Of 781 children studied, 112 had anesthesia exposure. The incidence of deficit ranged from 5.1 to 7.8% in neuropsychological tests, 14.6 to 29.5% in International Classification of Diseases, 9th Revision, Clinical Modification–coded outcomes, and 4.2 to 11.8% in academic achievement tests. Compared with unexposed peers, exposed children had an increased risk of deficit in neuropsychological language assessments (Clinical Evaluation of Language Fundamentals Total Score: adjusted risk ratio, 2.47; 95% CI, 1.41 to 4.33, Clinical Evaluation of Language Fundamentals Receptive Language Score: adjusted risk ratio, 2.23; 95% CI, 1.19 to 4.18, and Clinical Evaluation of Language Fundamentals Expressive Language Score: adjusted risk ratio, 2.00; 95% CI, 1.08 to 3.68) and International Classification of Diseases, 9th Revision, Clinical Modification–coded language and cognitive disorders (adjusted risk ratio, 1.57; 95% CI, 1.18 to 2.10), but not academic achievement scores.
Conclusions: When assessing cognition in children with early exposure to anesthesia, the results may depend on the outcome measure used. Neuropsychological and International Classification of Diseases, 9th Revision, Clinical Modification–coded clinical outcomes showed an increased risk of deficit in exposed children compared with that in unexposed children, whereas academic achievement scores did not. This may explain some of the variation in the literature and underscores the importance of the outcome measures when interpreting studies of cognitive function.

What We Already Know about This Topic

* The results of retrospective analyses of the impact of anesthesia during childhood on later cognitive function are variable, with some studies indicating deficits associated with anesthesia whereas others show no association
* A variety of outcome measures have been employed for cognitive analysis

What This Article Tells Us That Is New

* Of the three outcome measures used, neuropsychological testing and International Classification of Diseases, 9th Revision, Clinical Modification–coded clinical outcomes found deficits associated with anesthesia exposure in children while academic achievement tests did not
* The variation of the results in published studies assessing the association between anesthetic exposure and cognitive deficits may be dependent upon the outcome measure used
In animal models, exposure of developing brains to N-methyl-D-aspartate antagonists (such as nitrous oxide and ketamine) and γ-amino butyric acid agonists (such as benzodiazepines, propofol, and volatile anesthetics) leads to dose-dependent neuroapoptosis and neurodegenerative changes.1–7 In addition, long-term neurocognitive changes in learning, memory, motor activity, attention, and behavior are observed during adulthood after early postnatal anesthesia exposure.3,8–10
Clinical studies of neurodevelopmental outcome after childhood anesthesia exposure have reported mixed results.11–20 Given the lack of an obvious phenotype or a definitive standard to measure cognitive deficit in this population, the outcome measures used in these clinical studies are varied, including International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9) codes, academic achievement test scores, teacher ratings, learning disability measures, and direct neuropsychological testing. The published results based on these outcomes have found either an increased risk of cognitive impairment associated with anesthetic exposure or no association. The use of different outcome measures makes the interpretation of these studies difficult and may have contributed to the divergent findings. No study has specifically compared the ability of different types of outcome measures to assess a single group of children exposed to anesthesia.
Using a prospective birth cohort from the Western Australian Pregnancy Cohort (Raine) Study, we recently reported an association between exposure to anesthesia during the first 3 yr of life and deficits in language and abstract reasoning at age of 10 yr, identified by direct neuropsychological testing.21 The Raine cohort provides the unique opportunity to compare three frequently used neurodevelopmental outcome measures: academic performance, ICD-9–coded diagnoses, and test scores from direct neuropsychological testing. The purpose of this study is to further extend our work with the Raine cohort to (1) evaluate the ability of different outcome measures to determine cognitive differences in children exposed to anesthesia and to (2) determine the risk of impairment estimated by all outcome measures after accounting for demographic, perinatal, and medical illness covariates.

Materials and Methods

The study protocol was approved by the Institutional Review Board at Columbia University (New York, New York) as exempt from written or informed consent.
Data Source
We obtained data from the Western Australian Pregnancy Cohort (Raine) Study, an established birth cohort consisting of 2,868 children born from 1989 to 1992, originally created to evaluate the long-term effects of prenatal ultrasound. The Raine Study enrolled 2,900 pregnant women at 16 to 20 weeks gestation from the major tertiary maternity hospital and nearby private practice medical centers in Perth, Western Australia. Mothers were selected for enrollment if they had sufficient proficiency in English, expected to deliver at the hospital, and intended to remain in Western Australia for follow-up.22 The Raine Study collected detailed demographic and medical data prenatally and at birth from medical records and parental self-report. After birth, all children were assessed at 1, 2, 3, 5, 8, 10, 14, 17, and 20 yr of age. Parents were asked to keep detailed diaries of their child’s medical history. During follow-up visits, parents filled out questionnaires describing illnesses and medical problems, which were coded into ICD-9 codes by a research nurse (appendix). Coding was performed under the supervision of a physician as soon as possible after the visit. The most commonly encountered codes were printed in a coding guide, and an ICD-9 reference manual was used to code all medical conditions not found in the coding guide. The research nurse addressed any ambiguities by directly contacting the parents for more specific details regarding the medical conditions. During the analysis, ICD-9 codes found to include additional trailing zeros were truncated. Rarely, the fifth digit of a code could not be mapped to known ICD-9 codes and the code was truncated to a four-digit code. As there was no direct access to medical records after the perinatal period, including surgical and anesthetic records, the ICD-9 codes were used to identify surgical procedures and medical diagnoses.
We classified any child who had a surgical or diagnostic procedure requiring anesthesia before the age of 3 yr as “exposed,” and the rest as “unexposed.” Children who missed all three scheduled follow-up visits from 1 to 3 yr old were considered “missing” and excluded from further analysis as data on exposure to anesthesia were not available for them. Demographic information for missing children was previously evaluated.21 To ensure exposure to anesthesia, we reviewed the types of procedures, all of which were performed after leaving the maternity hospital. Children who were found to have diagnostic procedures not requiring anesthesia were placed in the unexposed group.
Directly Administered Neuropsychological Tests
According to Raine Study protocol, direct neuropsychological testing was performed at specific follow-up visits with the most extensive testing occurring at the 10-yr follow-up visit. A total of six tests were performed at age of 10 yr, including assessments of language, cognitive function, motor skills, and behavior. Only neuropsychological tests previously demonstrated to be associated with impairment after anesthesia exposure were included in the analyses.21 These were the Raven’s Colored Progressive Matrices, which specifically measured global cognitive performance, nonverbal intelligence, and visuospatial functions, and the Clinical Evaluation of Language Fundamentals (CELF), a language test that assesses higher-order semantic, grammatical, and verbal memory abilities.23,24 In addition to an overall total score (CELF-T), the CELF test generated two subscores with the receptive language score (CELF-R) measuring listening comprehension and the expressive language score (CELF-E) tracking speaking ability. As in our previous analysis, impairment was defined as a score more than 1.5 SDs worse than the mean of the entire cohort.25 These score cutoffs were found to be similar to those normed for American children with the exception of slightly lower CELF-E scores in our Australian cohort. A cutoff of 1.5 SD was chosen to apply a consistent scale for all assessments, which in previously published works have had clinical impairment defined at various levels including 1, 1.5, or 2 SD from the mean.25–29
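The 1.5-SD impairment rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the example scores, the assumption that lower scores are worse, and the use of the population SD are all assumptions made for the example.

```python
from statistics import mean, pstdev


def flag_impaired(scores, cutoff_sd=1.5):
    """Flag scores more than `cutoff_sd` SDs below the cohort mean.

    Assumes lower scores are worse and uses the population SD of the
    supplied scores; the source does not state which SD estimator was used.
    """
    cohort_mean = mean(scores)
    cohort_sd = pstdev(scores)
    threshold = cohort_mean - cutoff_sd * cohort_sd
    return [score < threshold for score in scores]


# Hypothetical test scores, NOT data from the study
scores = [100, 102, 98, 101, 99, 103, 97, 100, 60, 100]
flags = flag_impaired(scores)
print(flags)
```

Applying one cutoff to every test, as the text describes, puts all assessments on the same scale even though earlier publications defined impairment at different SD levels.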
ICD-9–Coded Mental, Behavioral, and Neurodevelopmental Disorders
International Classification of Diseases, 9th Revision, Clinical Modification–coded diagnoses were based on clinician-diagnosed disorders reported by parents during follow-up visits. Mental, behavioral, and neurodevelopmental disorders measured by ICD-9 codes were divided into behavioral disorder (ICD9-B), language and cognitive disorders (ICD9-L/C), or either behavioral or language and cognitive disorders (ICD9-B/L/C). The diagnoses for behavioral disorder included autism, psychological disorders, and attention deficit hyperactivity disorder. The diagnoses for language and cognitive disorder included reading, language, and arithmetical disorders, developmental delay, and mental retardation. A disorder was classified as the presence of an ICD-9–coded clinical disorder from any follow-up visit up to and including the 10-yr follow-up. Individual records were reviewed and children with a diagnosis of a developmental or behavioral disorder preceding anesthesia exposure were excluded from analysis.
Academic Achievement Scores
Academic achievement assessment was based on Western Australian Literacy and Numeracy Assessment standardized test scores. The Western Australian Literacy and Numeracy Assessment is a statewide test used to compare children’s performance in literacy and numeracy between Australian States and Territories and is comparable with other Australian statewide assessment programs. Children are tested in grades 3, 5, and 7 with approximately 75% of Western Australian children assessed.30 Approximately 23% of children in Western Australia went to private, nongovernment schools and as a result were not tested. These children may come from a higher socioeconomic status than the children who were tested. The remaining untested children may have been exempt from the test due to intellectual impairment, lack of competency in English, or absence during the testing period. Test results were obtained from the Western Australian Department of Education and Training for children whose parents agreed to sign a separate consent for the release of the scores. All scores included in this study were from children tested in the fifth grade approximately at age of 10 yr. The Western Australian Literacy and Numeracy Assessment is composed of four separate Western Australian Monitoring Standards in Education (WAMSE) tests, including assessments in numeracy (NWAMSE), reading (RWAMSE), spelling (SWAMSE), and writing (WWAMSE). Failure to meet the minimum achievement standard for each test was defined as a test score below the nationally agreed-upon benchmark score, developed jointly by the Commonwealth, State, and Territory Ministers for Education.
Comorbid Illness
The level of comorbid illness was assessed and quantified using the Johns Hopkins Adjusted Clinical Groups Case-Mix System (Baltimore, Maryland), which is a system developed by the Johns Hopkins Bloomberg School of Public Health to describe the health status of a population that can be applied to predict past and future healthcare utilization and costs.31 ICD-9 codes were used to calculate Resource Utilization Band (RUB) scores based on the expected levels of resources used by each child. To assess the level of illness up to the time of cognitive assessment at age of 10 yr, ICD-9 codes from all follow-up visits up to and including age 10 yr were used to calculate the RUB score. ICD-9 codes for mental, behavioral, and neurodevelopmental disorders were excluded from the calculation of RUB scores as they are used as outcome variables. For each child, resource utilization was coded as follows: 0, No diagnoses; 1, Healthy; 2, Low; 3, Moderate; 4, High; and 5, Very High. Children with no diagnoses and verified to have presented for follow-up were collapsed into the healthy category and children in the high and very high utilization groups were also combined. Coding of the RUB score was performed with the Johns Hopkins Adjusted Clinical Groups version 10.0.1 (The Johns Hopkins University, Baltimore, MD).
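The collapsing of Resource Utilization Band categories described above can be sketched as a simple recoding. The function and variable names are hypothetical, and the sketch assumes that any child coded 0 has already been verified as having presented for follow-up, as the text requires.

```python
# RUB codes as listed in the text
RUB_LABELS = {
    0: "No diagnoses",
    1: "Healthy",
    2: "Low",
    3: "Moderate",
    4: "High",
    5: "Very High",
}


def collapse_rub(rub):
    """Recode a raw RUB score (0-5) into the four analysis categories (1-4).

    Assumes children coded 0 were verified to have attended follow-up.
    """
    if rub not in RUB_LABELS:
        raise ValueError(f"unknown RUB score: {rub!r}")
    if rub == 0:
        return 1  # no diagnoses -> healthy
    if rub == 5:
        return 4  # very high -> combined high utilization group
    return rub


collapsed = [collapse_rub(rub) for rub in range(6)]
print(collapsed)
```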
Statistical Analysis
To directly compare the different outcome measures in a single group of children, a complete case analysis was performed. In this analysis, all children in whom any of the outcomes or covariates was missing were excluded. The outcomes included all neuropsychological, ICD-9–coded, and academic achievement assessments, whereas the covariates included sex, low birth weight (<2,500 g), race, income, and maternal education. This restricted cohort included 781 children with complete data. Chi-square tests were used to assess for an increased likelihood of deficit in exposed compared with unexposed children. Risk ratios and 95% CIs were calculated to determine the strength of the association of deficit with exposure. A modified multivariable Poisson regression model with robust variance was used to adjust for socioeconomic and baseline perinatal health status covariates and generate an adjusted risk ratio (aRR). Dichotomized outcomes were subsequently evaluated with tetrachoric correlation to determine whether tests were concordant for measuring neurocognitive deficits. Summary outcomes (CELF-T and ICD9-B/L/C) were excluded from correlation analysis, as high correlation with their component outcomes was expected. High positive correlation was defined as 0.7 or greater, moderate correlation as less than 0.7 and 0.4 or greater, and poor correlation as less than 0.4.
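As a rough illustration of the unadjusted step, a risk ratio and its 95% CI can be computed from a two-by-two table using the usual log-scale Wald interval. This sketch does not reproduce the modified Poisson regression used for the adjusted estimates, the function name is hypothetical, and the counts are invented for the example rather than taken from the study.

```python
import math


def risk_ratio_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Unadjusted risk ratio with a log-scale Wald confidence interval."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, NOT data from the study
rr, lower, upper = risk_ratio_ci(20, 100, 10, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that contains 1, as in this invented example, would not indicate a statistically significant association at the 0.05 level.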
To reinforce our primary findings, we confirmed our results and also tested for differences between associations of exposure with each of the outcomes using a repeated-measures approach.32–34 Although it is common to perform separate regressions for multiple outcomes, in doing so, the potential correlation between the outcomes is ignored and thus may lead to biased inference (i.e., incorrect standard error estimation). The repeated-measures approach accounts for correlation between outcomes, thus providing proper inference, and also allows for the testing of differences between the estimated coefficients (i.e., associations between anesthetic exposure and each outcome). The use of this modeling can directly test whether any of the cognitive outcomes are more (or less) strongly associated with anesthetic exposure. A multivariate outcome generalized estimating equation Poisson regression was used to calculate risk ratios between outcomes. All cognitive outcomes were coded as repeated measurements within each child. This approach takes into consideration the likely correlation between the different outcomes, while still allowing for the evaluation of outcome-specific effects (i.e., estimated regression coefficients—log-relative risks—for each outcome). By forming the contrast of the difference in the regression coefficients, a direct comparison of the strength of the association between exposure and each of the outcomes was made.32 A P value less than 0.05 indicated that the ability of one outcome to measure an association with exposure in this cohort was significantly different from another.
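One common way to form such a contrast is a Wald statistic on the difference between two outcome-specific log-relative risks. The notation below is an assumption for illustration; the source does not give the exact test statistic. Here $\hat{\beta}_j$ and $\hat{\beta}_k$ denote the estimated coefficients for exposure on outcomes $j$ and $k$, with variances and covariance taken from the robust covariance matrix of the joint model.

```latex
z = \frac{\hat{\beta}_j - \hat{\beta}_k}
         {\sqrt{\widehat{\operatorname{Var}}(\hat{\beta}_j)
              + \widehat{\operatorname{Var}}(\hat{\beta}_k)
              - 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_j, \hat{\beta}_k)}},
\qquad
\frac{\mathrm{RR}_j}{\mathrm{RR}_k} = \exp\!\left(\hat{\beta}_j - \hat{\beta}_k\right)
```

The covariance term is what separate regressions cannot supply, which is why the outcomes need to be fitted in a single model before the coefficients can be compared.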
However, in complete case analysis, two problems can arise. If the children with missing values are systematically different from the completely observed children, the complete case analysis may be biased. In addition, because we are including a large variety of covariates and outcomes, the number of complete cases is relatively small.35 To address these issues, the robustness of our findings was assessed by repeating our analysis in the full study cohort of 2,606 children using all available data on individual outcome measures. Children with missing covariates were dropped from the multivariable analysis. This differed from our previous article, where missing covariates were coded as a separate level of each categorical variable. The results from the neuropsychological outcome measures in the analysis of the full cohort have been previously published and are used as a basis of comparison with ICD-9–coded disorder and achievement test outcomes. However, we further expanded on our previous results by incorporating a comorbidity-adjustment covariate in our regression model using data not available for our previous article. As a sensitivity analysis, all academic achievement test scores were also assessed as continuous variables using t tests to determine the existence of mean score differences between exposed and unexposed children. A P value less than 0.05 was considered significant. All statistical analyses were performed using SAS version 9.3 (SAS Institute Inc., Cary, NC).

Results

Table 1
Table 2
The Raine cohort consists of 2,868 children, of whom 260 had no history of follow-up from ages 1 to 3 yr and were classified as “missing.” All ICD-9–coded mental, behavioral, and neurodevelopmental disorders in the cohort were identified (table 1). Of all recorded disorders, 60% were language and cognitive in nature and 40% were behavioral. Situations in which a physician identified a behavioral disorder that did not specifically fit a diagnostic condition at the time of assessment were coded as ICD-9 code 306.9: Unspecified Psychophysiological Malfunction. Two children were found to have an ICD-9–coded diagnosis for behavioral or developmental delay before anesthetic exposure and were also excluded. Of the remaining 2,606 children, a subset of 781 had complete covariate and outcomes data at age 10 yr, and 1,825 children had at least one covariate or outcome missing. Of the children with complete outcome and covariate data, 112 children were found to have had procedures requiring anesthesia before their third birthday and were classified as “exposed,” whereas 669 did not have a history of exposure to anesthesia and were classified as “unexposed.” We noted that exposed children were similar to unexposed children except for a higher proportion of boys in the exposed group (table 2).
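The cohort flow reported above can be checked with simple arithmetic. The variable names are illustrative; every count comes from the paragraph above.

```python
# Counts as reported in the text
total_cohort = 2868
missing_early_follow_up = 260   # no follow-up from ages 1 to 3 yr
prior_diagnosis_excluded = 2    # diagnosis preceded anesthetic exposure

full_cohort = total_cohort - missing_early_follow_up - prior_diagnosis_excluded

complete_cases = 781
incomplete_cases = 1825
exposed = 112
unexposed = 669

print(full_cohort)                        # children eligible for analysis
print(complete_cases + incomplete_cases)  # should equal full_cohort
print(exposed + unexposed)                # should equal complete_cases
```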
Back to Top | Article Outline
Association between Anesthesia Exposure and Deficit in Outcomes
Table 3
Fig. 1
In assessing the individual outcomes in the restricted cohort of 781 children, the incidence of deficit in each outcome ranged from 5.1 to 7.8% in the neuropsychological tests, 14.6 to 29.5% in the ICD-9–coded clinical outcomes, and 4.2 to 11.8% in the academic achievement tests. The incidence of ICD-9–coded language and cognitive disorder and deficit measured by all CELF language assessments was significantly higher in the exposed children (table 3). After adjustment for demographic and perinatal covariates, children exposed to anesthesia had an increased risk of language deficit measured by neuropsychological testing compared with their unexposed peers in all CELF assessments (CELF-T: aRR, 2.47; 95% CI, 1.41 to 4.33, CELF-R: aRR, 2.23; 95% CI, 1.19 to 4.18, and CELF-E: aRR, 2.00; 95% CI, 1.08 to 3.68), as well as increased risk of ICD-9–coded Behavior, Language, and Cognitive disorder (ICD9-B/L/C: aRR, 1.35; 95% CI, 1.05 to 1.75) and ICD-9–coded Language and Cognitive disorder (ICD9-L/C: aRR, 1.57; 95% CI, 1.18 to 2.10) (fig. 1). In contrast, the risk of not achieving national benchmarks in academic achievement for each WAMSE test was not significantly greater in children exposed to anesthesia. To evaluate consolidated academic achievement data from all four individual WAMSE tests, the association between anesthesia exposure and the presence of deficit in any of the individual WAMSE tests was also assessed. The risk of not achieving national benchmarks in academic achievement in any WAMSE test was also not significantly greater in children exposed to anesthesia (aRR, 1.27; 95% CI, 0.93 to 1.73).
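The significance pattern in these estimates can be read directly from the intervals: an adjusted risk ratio differs significantly from 1 at the 0.05 level when its 95% CI excludes 1. The short check below uses the values reported above; the dictionary keys and function name are illustrative.

```python
# (aRR, lower 95% limit, upper 95% limit) as reported in the text
reported = {
    "CELF-T": (2.47, 1.41, 4.33),
    "CELF-R": (2.23, 1.19, 4.18),
    "CELF-E": (2.00, 1.08, 3.68),
    "ICD9-B/L/C": (1.35, 1.05, 1.75),
    "ICD9-L/C": (1.57, 1.18, 2.10),
    "Any WAMSE": (1.27, 0.93, 1.73),
}


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


significant = {
    name: ci_excludes_one(lower, upper)
    for name, (_, lower, upper) in reported.items()
}
for name, flag in significant.items():
    print(f"{name}: {'significant' if flag else 'not significant'}")
```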
Correlation between Outcomes
Table 4
To determine whether tests were identifying deficit in the same children, a tetrachoric correlation analysis was performed (table 4). High correlation (≥0.7) was found between CELF-R and CELF-E, NWAMSE and RWAMSE, and WWAMSE and SWAMSE. The majority of the other outcomes showed moderate correlation (<0.7 and ≥0.4). However, ICD-9–coded behavioral disorders showed poor correlation (<0.4) with all other outcomes except for ICD-9–coded language and cognitive disorders. A subsequent analysis was performed to assess the correlation between ICD-9–coded behavioral disorders and the presence of behavior and emotional problems assessed by the Child Behavior Checklist, a survey questionnaire evaluated in our previous work.21 We found that the presence of an ICD-9–coded behavioral disorder was moderately correlated with problems identified by all Child Behavior Checklist scores (0.54, 0.55, and 0.65 for tetrachoric correlation with internalizing, externalizing, and total behavior scores, respectively).36
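The correlation bands used here can be written as a small classifier. The function name is hypothetical; the thresholds and the three Child Behavior Checklist correlations are those stated in the text.

```python
def classify_correlation(r):
    """Band a tetrachoric correlation using the thresholds given in the text."""
    if r >= 0.7:
        return "high"
    if r >= 0.4:
        return "moderate"
    return "poor"


# Correlations of ICD-9-coded behavioral disorder with Child Behavior
# Checklist scores, as reported in the text
cbcl_correlations = {
    "internalizing": 0.54,
    "externalizing": 0.55,
    "total": 0.65,
}
bands = {scale: classify_correlation(r) for scale, r in cbcl_correlations.items()}
print(bands)
```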
Repeated-measures Analysis and Direct Comparison of Outcomes
Table 5
Table 6
As a confirmation of our primary objective, using a multivariate generalized estimating equation Poisson regression and fitting all outcomes in a single regression model, we determined that the significant associations between anesthesia exposure and deficit in each outcome did not change from what we found in the individual regression models. The differences in deficit between exposed and unexposed children remained in CELF-T (P < 0.01), CELF-R (P = 0.01), CELF-E (P = 0.03), ICD9-B/L/C (P = 0.02), and ICD9-L/C (P < 0.01), even after taking into account bias from correlated outcomes (table 5). By fitting all outcomes into the same model and directly comparing the regression coefficients, we determined that the ability of CELF to determine a cognitive difference in exposed children was significantly different from RWAMSE (vs. CELF-T: P = 0.03; vs. CELF-R: P = 0.04) and SWAMSE (vs. CELF-T: P = 0.01; vs. CELF-R: P = 0.04), but not NWAMSE and WWAMSE (table 6). We also found no statistically significant differences between the abilities of ICD-9–coded clinical outcomes and neuropsychological test scores or ICD-9–coded clinical outcomes and academic achievement.
Determining Risk of Deficit in Full Cohort with Comorbidity Adjustment
Fig. 2
To test the robustness of the findings in our restricted cohort of children in the complete case analysis, we also assessed the full cohort of children. In this analysis, 197 children with missing covariate data were dropped from the multivariable regression model, leaving 305 children exposed to anesthesia and 2,104 unexposed children. After adjusting for demographic and neonatal covariates, a significant association was found between exposure and all measures of ICD-9–coded disorders (ICD9-B/L/C: aRR, 1.57; 95% CI, 1.35 to 1.83, ICD9-B: aRR, 1.46; 95% CI, 1.14 to 1.87, and ICD9-L/C: aRR, 1.75; 95% CI, 1.46 to 2.10) (fig. 2). No significant differences were found in academic achievement scores. Significant differences in neuropsychological test scores were found and had been reported in our previous article. To determine whether a time-course effect exists in the ICD-9–coded clinical diagnoses, we assessed the presence of a resolution of diagnoses over time by also evaluating ICD-9–coded disorders exclusively at the age of 10 yr follow-up. We found that at age 10 yr, anesthesia exposure was associated with an increased risk of disorder in ICD9-B/L/C (aRR, 1.42; 95% CI, 1.13 to 1.79) and ICD9-L/C (aRR, 1.66; 95% CI, 1.25 to 2.21), but not ICD9-B (aRR, 1.25; 95% CI, 0.91 to 1.73). As a sensitivity analysis, all academic achievement scores were further assessed as continuous variables using t tests to determine the existence of mean score differences between exposed and unexposed children. No significant differences were found in the mean scores of any of the achievement scores based on anesthesia exposure. Assessment of comorbidity using the Johns Hopkins Adjusted Clinical Groups system identified 192 (8.0%) children in the healthy group, 657 (27.3%) in the low-resource utilization group, 1,435 (59.6%) in the moderate-utilization group, and 125 (5.2%) in the high utilization group. 
After taking into account all covariates and RUB, significant associations were still found in language (CELF-T: aRR, 1.86; 95% CI, 1.19 to 2.91 and CELF-R: aRR, 1.69; 95% CI, 1.01 to 2.83), reasoning (Colored Progressive Matrices: aRR, 1.79; 95% CI, 1.17 to 2.73), and ICD-9–coded disorders (ICD9-B/L/C: aRR, 1.25; 95% CI, 1.07 to 1.47 and ICD9-L/C: aRR, 1.41; 95% CI, 1.17 to 1.70).
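The full-cohort sample size and the Resource Utilization Band percentages reported above can be reproduced from the stated counts. The variable names are illustrative; all figures come from the text.

```python
# Counts as reported in the text
full_cohort = 2606
dropped_missing_covariates = 197
exposed = 305
unexposed = 2104

analysed = exposed + unexposed

rub_counts = {
    "healthy": 192,
    "low": 657,
    "moderate": 1435,
    "high": 125,
}
# Percentage of analysed children in each utilization group, one decimal place
rub_percent = {
    group: round(100 * count / analysed, 1)
    for group, count in rub_counts.items()
}

print(analysed)
print(rub_percent)
```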

Discussion

In this study, we found that when assessing cognition in children exposed to anesthesia at an early age, the results may depend on the outcome measures used. Specifically, neuropsychological testing and ICD-9–coded clinical outcomes were able to measure deficits at age 10 yr in children exposed to anesthesia before age 3 yr, whereas no differences could be identified using any academic achievement scores. Direct comparison of all outcomes found that the ability of some neuropsychological tests to determine a cognitive difference in exposed children was significantly better than that in academic achievement scores. We further assessed these outcomes in our complete cohort and calculated an adjusted risk of deficit after anesthesia exposure for each outcome after taking into account comorbid disease and found similar results.
The lack of an identified difference in children exposed to anesthesia using academic achievement testing may be due to sparing of the neurocognitive domains measured by the achievement tests, or to these tests being less sensitive than neuropsychological testing or clinical measures. However, the moderate positive correlation between the majority of outcomes other than behavior suggests that these measures are evaluating similar neurocognitive domains and that the difficulty in detecting deficit lies in the lower sensitivity of the achievement tests.
To evaluate the robustness of our results and quantify the increased risk of impairment after anesthesia exposure, we evaluated all available scores for each outcome measure. After adjusting for demographic, perinatal, and comorbid illness covariates, an increased risk of deficit was found in language and abstract reasoning measured by neuropsychological testing as well as ICD-9–coded clinical outcome. The aRRs of deficit were higher in neuropsychological testing ranging from 1.69 to 1.86 compared with ICD-9–coded clinical neurodevelopmental disorders ranging from 1.25 to 1.41. These ranges of deficit are consistent with those reported by other investigators.12,15,16,18,19
To our knowledge, this is the first study to compare the ability of neuropsychological testing, ICD-9–coded disorders, and academic achievement in a single population to assess differences in children exposed and unexposed to anesthesia. These results indicate that ICD-9–coded neurodevelopmental disorders may be an effective measure of differences in exposed and unexposed children. This is consistent with results from the study by DiMaggio et al.,15,16 who also found an increased risk of ICD-9–coded behavioral and developmental delay in children exposed to anesthesia before age 3 yr in a Medicaid dataset. A difference in the presence of clinically diagnosed disorders was also reported by Sprung et al.,19 who found an increased incidence of attention deficit hyperactivity disorder after multiple anesthetic exposures. Although we found differences in ICD-9–coded behavioral as well as language and cognitive deficit after adjusting for demographic covariates, differences in clinical behavior disorders were no longer significant at age 10 yr. The lack of a behavioral effect at age 10 yr is consistent with that reported in our previous work using the Child Behavior Checklist as an outcome. In this cohort, behavioral abnormalities are also generally poorly correlated with other cognitive measures.
Our results suggest that academic achievement scores may lack the sensitivity to distinguish between exposed and unexposed children, but this may only be applicable to the WAMSE tests. Other achievement tests are likely to vary in sensitivity and may measure different cognitive domains than the ones used in this analysis. Academic achievement scores have been used in the past with mixed results. Block et al.20 found that children with no neurologic risk factors and exposed to anesthesia during infancy had similar mean scores on the Iowa Tests of Basic Skills and Educational Development at ages 7 to 10 yr, but a higher risk of scoring below the fifth percentile compared with that for the unexposed children. Bartels et al.,13 however, did not demonstrate a difference in Dutch elementary academic achievement test scores assessed at age 12 yr using a twin cohort. Hansen et al.17 did find an increased proportion of exposed children with test score nonattainment but were unable to demonstrate a difference in mean scores at ages 15 to 16 yr in a Danish standardized, nationwide test of academic achievement. This higher nonattainment rate may be suggestive of a developmentally disadvantaged group of children. Wilder et al.12 found an increased risk of learning disability (calculated with a formula using Wechsler IQ scales and Woodcock Johnson achievement tests) in children with multiple anesthetic exposures. Using the same cohort of children, Flick et al.18 found that children with multiple exposures to anesthesia had lower total cognitive scores on the California Achievement Test, but the difference was no longer significant after adjustment for covariates.
There are several limitations in our study. Among them are the observational nature of the study, differences in demographics between the exposed and unexposed children, the lack of detailed anesthetic information and medical records, and the attrition of the cohort over time. Although parents were required to keep detailed information about their children, and a trained research nurse performed all ICD-9 coding under the supervision of a physician, the reliance on parental report to communicate clinician-diagnosed disorders could result in miscommunication and therefore misspecification. As a consequence, the ICD-9–coded clinical outcomes used may differ from clinical diagnoses based on ICD-9 codes reported in administrative databases or medical record review and should be interpreted with caution. Part of the association of neurocognitive deficit with anesthesia may also be due to innate differences between children requiring surgery and diagnostic procedures and those not requiring these procedures. A limitation of our previous work was the inability to account for comorbid disease, which we attempted to address in our current analysis by using RUB as a covariate in our regression model. After taking into account RUB in addition to all other covariates, a significant association remained between anesthetic exposure and CELF-T, CELF-R, Colored Progressive Matrices, ICD9-B/L/C, and ICD9-L/C, showing a 25 to 86% increase in the risk of deficit compared with unexposed children. Modeling comorbid illness, however, is complex, and it is possible that RUB does not appropriately account for unmeasured differences in children requiring medical procedures. The lack of access to medical records limited our ability to review anesthetic exposure, including the duration of anesthesia and specific drugs used, as well as intraoperative adverse events.
However, during the study period, standard monitors recommended by the American Society of Anesthesiologists, including pulse oximetry and capnography, were in use at the regional children’s hospital during all surgical procedures. Because halothane was the most prevalent volatile anesthetic during the study period, it may have been the agent used in the majority of patients. Although halothane is no longer clinically available, it has been found to cause neurotoxic effects similar to those of other volatile anesthetics in the animal model.37,38 In addition, although anesthesia exposure occurred in all children requiring procedures, most also had surgery, and the effect of the inflammatory response to surgical stimulation is still unknown.
The most notable strength of this cohort is the variety of outcomes available in a single group of children. Although the inability of some of these measures to determine a difference may be due to the measurement of unaffected cognitive domains, a difference in test sensitivity may also be responsible. Neurodevelopmental studies of lead, pesticides, and other potential neurotoxins have similarly found that appropriate assessment tools are critical in documenting the effects of exposure.39,40 In neurotoxicology studies, sensitive outcomes are particularly important as an effect size of 0.2 SDs can be a clinically significant effect with important public health implications.39
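To illustrate why a shift as small as 0.2 SDs can matter at the population level, the following sketch computes how the proportion of children scoring below a fixed deficit cutoff changes when the mean of a score distribution moves down by 0.2 SDs. The normal distribution, the fifth-percentile cutoff, and the function name are illustrative assumptions and are not taken from the study data.

```python
from statistics import NormalDist


def fraction_below_cutoff(shift_sd, cutoff_percentile=0.05):
    """Share of a shifted population scoring below a cutoff fixed in the unshifted one.

    Assumes standardized, normally distributed scores (illustrative only).
    """
    reference = NormalDist(mu=0.0, sigma=1.0)
    cutoff = reference.inv_cdf(cutoff_percentile)  # z-score of the cutoff
    shifted = NormalDist(mu=-shift_sd, sigma=1.0)  # mean lowered by shift_sd
    return shifted.cdf(cutoff)


baseline = fraction_below_cutoff(0.0)  # 0.05 by construction
shifted = fraction_below_cutoff(0.2)   # roughly 0.07 under these assumptions
print(f"baseline: {baseline:.3f}, after 0.2 SD shift: {shifted:.3f}")
print(f"relative increase: {shifted / baseline - 1:.0%}")
```

Under these assumptions, a mean shift that would be hard to notice in an individual child raises the share of children below the cutoff from 5% to roughly 7%, which is one way a small effect size can translate into a meaningful change in the number of children classified as having a deficit.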
The geographically isolated nature of Western Australia is likely to result in less migration than in other parts of the world, but as in any cohort study, we experienced loss to follow-up. In our overall cohort, children who were exposed to anesthesia and subsequently failed to follow up at ages 1, 2, or 3 yr could be misclassified as unexposed children. This would likely bias the result toward the null, or a lack of a difference, resulting in an underestimation of the true difference between exposed and unexposed children. Approximately 23% of children in Western Australia attended private schools and as a result lacked academic achievement scores. However, the untested children are unlikely to influence our primary objective, which was to compare the three types of cognitive measures in children with testing in all outcomes.
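The direction of this misclassification bias can be shown with a short calculation. The sketch below is only an illustration: the group sizes, risks, and misclassified fraction are hypothetical values chosen for the example, not estimates from the cohort, and the function name is introduced here for convenience. It assumes that misclassification is unrelated to the outcome.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        misclassified_fraction):
    """Expected risk ratio when some exposed children are recorded as unexposed.

    All inputs are hypothetical; outcome risk is assumed not to depend on
    whether a child was misclassified.
    """
    moved = n_exposed * misclassified_fraction  # exposed counted as unexposed
    # Children still classified as exposed keep the true exposed risk.
    observed_exposed_risk = risk_exposed
    # The "unexposed" group is now a mix of truly unexposed and moved children.
    cases = n_unexposed * risk_unexposed + moved * risk_exposed
    observed_unexposed_risk = cases / (n_unexposed + moved)
    return observed_exposed_risk / observed_unexposed_risk


true_rr = 0.10 / 0.05
biased_rr = observed_risk_ratio(
    n_exposed=300, n_unexposed=700,
    risk_exposed=0.10, risk_unexposed=0.05,
    misclassified_fraction=0.30,
)
print(f"true RR: {true_rr:.2f}, observed RR: {biased_rr:.2f}")
```

Because the misclassified children carry the higher exposed risk into the unexposed group, the apparent risk in that group rises and the observed risk ratio falls below the true value while remaining above 1, which is the bias toward the null described above.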
Although neuropsychological tests seem to be the most sensitive measures, it is important to also consider other available outcomes. Prospective, directly assessed neuropsychological testing is time consuming and expensive to perform, and some datasets have clinical diagnoses and academic achievement scores readily available. In addition, although some neuropsychological tests may show a measurable difference, these differences may not be clinically or academically meaningful.
When assessing cognition in children with early exposure to anesthesia, results may depend on the type of outcome measure used. Although the published literature has presented contradictory conclusions, this may be due to the variability in the outcome measures used. Our results show that cognitive differences between children exposed and unexposed to anesthesia before age 3 yr were found at age 10 yr using neuropsychological and ICD-9–coded clinical outcomes, whereas achievement tests showed no difference. This is likely due to a lower test sensitivity in the academic achievement tests used, but important differences may exist between the achievement tests and the analysis in this work compared with previous studies.12,13,17,18,20 These results underscore the importance of the outcome measures when interpreting and designing studies assessing anesthetic exposure and cognitive function.


The authors thank the Raine study investigators and staff responsible for the collection of the data presented in this article. Sincere thanks are extended to all study families, as this research could not have been conducted without their participation. The authors also thank Peter D. Sly, M.B.B.S., F.R.A.C.P., M.D., D.Sc., Queensland Children’s Medical Research Institute (Brisbane, Australia), and Jenny Mountain, M.Clin.Epi., Raine Study Team, Telethon Institute for Child Health Research (Perth, Australia), for their help in acquiring the data for the article; Arthur Roh, M.S., Columbia University (New York, New York), for his help with the article figures; and Cynthia Salorio, Ph.D., Kennedy Krieger Institute (Baltimore, Maryland), for her help in the interpretation of neuropsychological test scores.
This work was supported in part by a grant from SmartTots (San Francisco, California). The Western Australian Pregnancy Cohort Study is funded by project and program grants from the National Health and Medical Research Council of Australia (Canberra, Australia). Core management funding is provided by the Raine Medical Research Foundation, the Telethon Institute for Child Health Research, the University of Western Australia (UWA), the UWA Faculty of Medicine, Dentistry, and Health Sciences, the Women and Infants Research Foundation, and Curtin University. All institutions providing core management funding reside in Perth, Australia. Dr. von Ungern-Sternberg is partly funded by the Princess Margaret Hospital Foundation and Woolworths Australia (Perth, Australia).

Competing Interests

The authors declare no competing interests.


1. Cattano D, Young C, Straiko MM, Olney JW. Subanesthetic doses of propofol induce neuroapoptosis in the infant mouse brain. Anesth Analg. 2008;106:1712–4

2. Young C, Jevtovic-Todorovic V, Qin YQ, Tenkova T, Wang H, Labruyere J, Olney JW. Potential of ketamine and midazolam, individually or in combination, to induce apoptotic neurodegeneration in the infant mouse brain. Br J Pharmacol. 2005;146:189–97

3. Jevtovic-Todorovic V, Hartman RE, Izumi Y, Benshoff ND, Dikranian K, Zorumski CF, Olney JW, Wozniak DF. Early exposure to common anesthetic agents causes widespread neurodegeneration in the developing rat brain and persistent learning deficits. J Neurosci. 2003;23:876–82

4. Slikker W Jr, Zou X, Hotchkiss CE, Divine RL, Sadovova N, Twaddle NC, Doerge DR, Scallet AC, Patterson TA, Hanig JP, Paule MG, Wang C. Ketamine-induced neuronal cell death in the perinatal rhesus monkey. Toxicol Sci. 2007;98:145–58

5. Zou X, Liu F, Zhang X, Patterson TA, Callicott R, Liu S, Hanig JP, Paule MG, Slikker W Jr, Wang C. Inhalation anesthetic-induced neuronal damage in the developing rhesus monkey. Neurotoxicol Teratol. 2011;33:592–7

6. Loepke AW, Soriano SG. An assessment of the effects of general anesthetics on developing brain structure and neurocognitive function. Anesth Analg. 2008;106:1681–707

7. Olney JW, Wozniak DF, Jevtovic-Todorovic V, Farber NB, Bittigau P, Ikonomidou C. Drug-induced apoptotic neurodegeneration in the developing brain. Brain Pathol. 2002;12:488–98

8. Satomoto M, Satoh Y, Terui K, Miyao H, Takishima K, Ito M, Imaki J. Neonatal exposure to sevoflurane induces abnormal social behaviors and deficits in fear conditioning in mice. ANESTHESIOLOGY. 2009;110:628–37

9. Bercker S, Bert B, Bittigau P, Felderhoff-Müser U, Bührer C, Ikonomidou C, Weise M, Kaisers UX, Kerner T. Neurodegeneration in newborn rats following propofol and sevoflurane anesthesia. Neurotox Res. 2009;16:140–7

10. Paule MG, Li M, Allen RR, Liu F, Zou X, Hotchkiss C, Hanig JP, Patterson TA, Slikker W Jr, Wang C. Ketamine anesthesia during the first week of life can cause long-lasting cognitive deficits in rhesus monkeys. Neurotoxicol Teratol. 2011;33:220–30

11. Sprung J, Flick RP, Wilder RT, Katusic SK, Pike TL, Dingli M, Gleich SJ, Schroeder DR, Barbaresi WJ, Hanson AC, Warner DO. Anesthesia for cesarean delivery and learning disabilities in a population-based birth cohort. ANESTHESIOLOGY. 2009;111:302–10

12. Wilder RT, Flick RP, Sprung J, Katusic SK, Barbaresi WJ, Mickelson C, Gleich SJ, Schroeder DR, Weaver AL, Warner DO. Early exposure to anesthesia and learning disabilities in a population-based birth cohort. ANESTHESIOLOGY. 2009;110:796–804

13. Bartels M, Althoff RR, Boomsma DI. Anesthesia and cognitive performance in children: No evidence for a causal relationship. Twin Res Hum Genet. 2009;12:246–53

14. Kalkman CJ, Peelen L, Moons KG, Veenhuizen M, Bruens M, Sinnema G, de Jong TP. Behavior and development in children and age at the time of first anesthetic exposure. ANESTHESIOLOGY. 2009;110:805–12

15. DiMaggio C, Sun LS, Kakavouli A, Byrne MW, Li G. A retrospective cohort study of the association of anesthesia and hernia repair surgery with behavioral and developmental disorders in young children. J Neurosurg Anesthesiol. 2009;21:286–91

16. DiMaggio C, Sun LS, Li G. Early childhood exposure to anesthesia and risk of developmental and behavioral disorders in a sibling birth cohort. Anesth Analg. 2011;113:1143–51

17. Hansen TG, Pedersen JK, Henneberg SW, Pedersen DA, Murray JC, Morton NS, Christensen K. Academic performance in adolescence after inguinal hernia repair in infancy: A nationwide cohort study. ANESTHESIOLOGY. 2011;114:1076–85

18. Flick RP, Katusic SK, Colligan RC, Wilder RT, Voigt RG, Olson MD, Sprung J, Weaver AL, Schroeder DR, Warner DO. Cognitive and behavioral outcomes after early exposure to anesthesia and surgery. Pediatrics. 2011;128:e1053–61

19. Sprung J, Flick RP, Katusic SK, Colligan RC, Barbaresi WJ, Bojanić K, Welch TL, Olson MD, Hanson AC, Schroeder DR, Wilder RT, Warner DO. Attention-deficit/hyperactivity disorder after early exposure to procedures requiring general anesthesia. Mayo Clin Proc. 2012;87:120–9

20. Block RI, Thomas JJ, Bayman EO, Choi JY, Kimble KK, Todd MM. Are anesthesia and surgery during infancy associated with altered academic performance during childhood? ANESTHESIOLOGY. 2012;117:494–503

21. Ing C, DiMaggio C, Whitehouse A, Hegarty MK, Brady J, von Ungern-Sternberg BS, Davidson A, Wood AJ, Li G, Sun LS. Long-term differences in language and cognitive function after childhood exposure to anesthesia. Pediatrics. 2012;130:e476–85

22. Newnham JP, Evans SF, Michael CA, Stanley FJ, Landau LI. Effects of frequent ultrasound during pregnancy: A randomised controlled trial. Lancet. 1993;342:887–91

23. Raven J, Court J, Raven J. Manual for Raven’s Progressive Matrices and Vocabulary Scales. Section 2: Coloured Progressive Matrices. Oxford, Oxford Psychologists Press, 1990

24. Semel E, Wiig E, Secord W. Clinical Evaluation of Language Fundamentals, 3rd edition. San Antonio, Psychological Corporation, Harcourt Brace, 1995

25. Nadebaum C, Anderson VA, Vajda F, Reutens DC, Barton S, Wood AG. Language skills of school-aged children prenatally exposed to antiepileptic drugs. Neurology. 2011;76:719–26

26. Brantner S, Piek JP, Smith LM. Evaluation of the validity of the MAND in assessing motor impairment in young children. Rehabil Psychol. 2009;54:413–21

27. Luu TM, Ment LR, Schneider KC, Katz KH, Allan WC, Vohr BR. Lasting effects of preterm birth and neonatal brain hemorrhage at 12 years of age. Pediatrics. 2009;123:1037–44

28. Portaccio E, Goretti B, Lori S, Zipoli V, Centorrino S, Ghezzi A, Patti F, Bianchi V, Comi G, Trojano M, Amato MP; Multiple Sclerosis Study Group of the Italian Neurological Society. The brief neuropsychological battery for children: A screening tool for cognitive impairment in childhood and juvenile multiple sclerosis. Mult Scler. 2009;15:620–6

29. Pueyo R, Junqué C, Vendrell P, Narberhaus A, Segarra D. Raven’s Coloured Progressive Matrices as a measure of cognitive functioning in Cerebral Palsy. J Intellect Disabil Res. 2008;52(Pt 5):437–45

30. Oddy WH, Li J, Whitehouse AJ, Zubrick SR, Malacova E. Breastfeeding duration and academic achievement at 10 years. Pediatrics. 2011;127:e137–45

31. The Johns Hopkins ACG Case Mix System: Version 10.0 Release Notes. PC (DOS/WIN/NT) and Unix Version 10.0—December 2011. Baltimore, Johns Hopkins Bloomberg School of Public Health, 2010

32. Fitzmaurice GM, Laird NM, Zahner GE, Daskalakis C. Bivariate logistic regression analysis of childhood psychopathology ratings using multiple informants. Am J Epidemiol. 1995;142:1194–203

33. Das A, Poole WK, Bada HS. A repeated measures approach for simultaneous modeling of multiple neurobehavioral outcomes in newborns exposed to cocaine in utero. Am J Epidemiol. 2004;159:891–9

34. Teixeira-Pinto A, Mauri L. Statistical analysis of noncommensurate multiple outcomes. Circ Cardiovasc Qual Outcomes. 2011;4:650–6

35. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge; New York, Cambridge University Press, 2007

36. Achenbach T. Manual for the Child Behavior Checklist/4–18 and 1991 Profile. Burlington, University of Vermont Department of Psychiatry, 1991

37. Uemura E, Bowman RE. Effects of halothane on cerebral synaptic density. Exp Neurol. 1980;69:135–42

38. Uemura E, Levin ED, Bowman RE. Effects of halothane on synaptogenesis and learning behavior in rats. Exp Neurol. 1985;89:520–9

39. Bellinger DC. What is an adverse effect? A possible resolution of clinical and epidemiological perspectives on neurobehavioral toxicity. Environ Res. 2004;95:394–405

40. Bellinger DC, Stiles KM, Needleman HL. Low-level lead exposure, intelligence and academic achievement: A long-term follow-up study. Pediatrics. 1992;90:855–61


© 2014 American Society of Anesthesiologists, Inc.
