Cognitive Recovery by Decade in Healthy 40- to 80-Year-Old Volunteers After Anesthesia Without Surgery : Anesthesia & Analgesia

Original Research Articles: Original Clinical Research Report

Cognitive Recovery by Decade in Healthy 40- to 80-Year-Old Volunteers After Anesthesia Without Surgery

Baxter, Mark G. PhD*,†; Mincer, Joshua S. MD, PhD; Brallier, Jess W. MD; Schwartz, Arthur MD; Ahn, Helen MD; Nir, Tommer MD, PhD; McCormick, Patrick J. MD, MEng; Ismail, Mohammed BS; Sewell, Margaret PhD§; Allore, Heather G. PhD∥,¶; Ramsey, Christine M. PhD#,**; Sano, Mary PhD§,††; Deiner, Stacie G. MD†,‡‡,§§

doi: 10.1213/ANE.0000000000005824



  • Question: Is time to recovery of baseline cognitive function after exposure to general anesthesia, without surgery, associated with age group?
  • Findings: Recovery of cognitive function to baseline was rapid and did not differ by age decade.
  • Meaning: General anesthesia alone is not associated with time to cognitive recovery in healthy adults of any age decade.

Postoperative neurocognitive disorders are the most common complications after surgery for older adults.1 The 2 major types are postoperative delirium (POD), seen in 10% to 60%, and postoperative cognitive dysfunction (POCD), experienced by 15% to 30%.2–6 The former is an acute attentional deficit, and the latter is a decline in cognitive ability relative to presurgery levels. The etiology for both is unknown. Earlier literature postulated that anesthetic agents might be neurotoxic, including the acceleration of the biochemistry of Alzheimer’s disease by common anesthetic agents.7 However, postoperative neurocognitive disorders are not related to the type of anesthetic drug or even the use of general versus regional anesthesia,8,9 suggesting that some aspect of surgery or illness, rather than general anesthesia, is the critical risk factor. However, it is difficult to identify the specific role of anesthesia in postoperative neurocognitive disorders because previous studies have not separated the need for surgery and the surgical procedure, which may be risk factors in their own right, from the anesthetic agent.

Information regarding cognitive recovery is important for a population of patients at risk for postoperative cognitive alteration who are frequently sent home on the day of surgery with instructions for self-care. The controversy over delayed recovery of cognitive function after anesthesia may impact the willingness of patients to undertake surgery, preventing them from accessing appropriate life-enhancing therapies. From a research and intervention perspective, distinguishing the effects of anesthesia from those of surgery can provide a basis for interpreting studies of postoperative cognition in surgical populations. However, there is limited information regarding cognition after anesthesia alone.

To address the specific role of anesthesia in the absence of surgery on cognition in older adults, who have been identified at higher risk of postoperative neurocognitive disorders, we assessed cognitive recovery after general anesthesia in healthy adult volunteers by decade from 40 to 80 years of age.10 The primary hypothesis is that the time to recovery does not increase across age decades from 40 to 80 years, based on the evidence that postoperative neurocognitive disorders are related to surgical stress and not anesthesia. Specifically, we examined whether there was an association between age group and time to recovery after adjusting for available confounding variables (gender, race, and education). We conducted this study in healthy, nonsurgical participants with normal cognition at baseline to focus on changes after anesthesia without surgery and without other confounds of frailty or preexisting cognitive impairment.


This study was approved by the institutional review board (IRB) of the Icahn School of Medicine at Mount Sinai (telephone 212-824-8200) and registered at ClinicalTrials.gov (NCT02275026, date of registration October 27, 2014, principal investigator first Jeffrey Silverstein, MD, succeeded by Joshua Mincer, MD, PhD) before beginning participant recruitment. Full details of the study protocol are published, including the statistical analysis plan.10 Participants were recruited between February 2015 and April 2019 through local contacts and IRB-approved advertisements in local media and online. Potential participants were prescreened by telephone by both research staff and a study anesthesiologist. Informed written consent was obtained from participants at the first in-person visit. Specific inclusion criteria were adults 40 to 80 years old, American Society of Anesthesiologists (ASA) Physical Status 1 (no medical comorbidities) or 2 (1 or more medical comorbidities that do not impact the patient’s function), and no underlying cognitive dysfunction as determined from baseline cognitive testing before general anesthesia. Exclusion criteria included contraindication to magnetic resonance imaging (MRI) scanning (implanted metal, presence of tattoos, and claustrophobia), current smoking, use of illicit drugs, excessive use of alcohol, or other diseases that could affect response to anesthesia or alter brain physiology.10 Participants were excluded after recruitment and consent if the scan before anesthesia revealed any of the following: cerebral microvascular disease, any mass, evidence of old infarct (even without clinical signs), atrophy, and/or ventriculomegaly greater than expected for age in the neuroradiologist’s judgment. Age-appropriate changes, such as mild cortical atrophy, were not grounds for exclusion. 
Participants were also excluded if baseline neuropsychological testing suggested poor or abnormal baseline cognitive function in the judgment of the study neuropsychologist (M.S.).

Participants underwent a battery of cognitive tests before and at specific time points (detailed below) after exposure to 2 hours of general anesthesia. This duration of anesthesia was selected as it represents the duration of a wide range of surgical procedures. The primary outcome was recovery to baseline on the Postoperative Quality of Recovery Scale (PQRS) cognitive subtest, used for rapid assessment of short-term recovery.11 The PQRS cognitive subtest consists of 5 items: name, place, and date of birth; forward digit span testing; backward digit span testing; recalling a list of words; and generating words beginning with a specific letter. This test is scored as a binary “recovered/not recovered” outcome based on whether postanesthesia scores return to the preanesthesia level within a certain tolerance range.12 This test performs equivalently in psychometric terms when given in person or over the phone. Secondary outcomes were measures from in-depth neuropsychological testing that covered the domains of executive function and attention, episodic memory, language, processing speed, and working memory. Instruments for this testing were the National Institutes of Health (NIH) Toolbox Cognitive Battery13,14 and paper-and-pencil neuropsychological tests from the Alzheimer’s Disease Research Center Uniform Dataset Battery: Trail Making Test (parts A and B), Logical Memory (immediate and delayed recall), and Category Fluency15 as well as the California Verbal Learning Test (CVLT). For secondary measures, we used raw test scores on paper-and-pencil tests and fully adjusted T-scores (mean 50, standard deviation 10, adjusted for gender, age, level of education, and race/ethnicity) computed by the test software for each NIH Toolbox Cognitive Battery test and averaged to create a composite score. 
We also analyzed the 7 subtests of the NIH Toolbox Cognitive Battery (dimensional change card sort, flanker inhibitory control and attention, picture sequence memory, list sorting working memory, oral reading recognition, picture vocabulary, and pattern comparison processing speed). Details of specific timing of test administration follow.

Testing Protocol and Anesthesia Exposure

Within a week before anesthesia, participants underwent baseline cognitive testing (PQRS, NIH Toolbox Cognitive Battery, and paper-and-pencil neuropsychological tests) and preanesthesia evaluation by an anesthesiologist. On the day of anesthesia, they first underwent a series of MRI scans while awake, including an MRI anatomical preanesthesia scan that was reviewed by a neuroradiologist for evidence of intracranial pathology as well as task-based and resting-state functional MRI (fMRI) scans.

After the completion of preanesthesia scanning, a 22-gauge intravenous (IV) line was placed. Following application of standard ASA monitors and preoxygenation, anesthesia was induced in the MRI suite with IV propofol at a weight- and age-adjusted dose, after which a laryngeal mask (LM) was placed. Anesthesia was maintained with inhaled sevoflurane at an age-adjusted depth of 1 minimum alveolar concentration (MAC). A bispectral index (BIS) level of 40 to 60 was obtained after LM placement to aid in the assessment of anesthetic depth while the participant was positioned and secured for MRI scanning, during which time inhaled sevoflurane equilibrated. After equilibration, the participant was returned to the MRI bore. Multiple MRI scan sequences were performed over a ~2-hour period while general anesthesia was maintained. Ventilation was maintained to achieve a target end-tidal CO2 of 30 to 35 mm Hg. Mean arterial blood pressure was maintained within 20% of baseline with bolus administration of ephedrine (5 mg IV or 25 mg intramuscular [IM]) or phenylephrine (100 μg IV) as needed. No narcotics, benzodiazepines, or muscle relaxants were administered, so that any differences between age decades in recovery of cognitive function could be attributed to the general anesthetic drugs specifically. Ondansetron (4 mg IV) was given before emergence for antiemetic prophylaxis. The LM was removed at the end of the scan protocol in the MRI suite when the participant awakened.

Once the participant emerged (generally within 15 minutes), PQRS was performed. The participant was then returned to the MRI bore for scan acquisition, and approximately 1 hour after emergence from anesthesia, PQRS was repeated. Participants were brought to the postanesthesia care recovery unit where they were monitored until discharge. Each participant performed follow-up cognitive testing and MRI scanning at 1 day and 7 days later, as well as additional in-person cognitive testing at 30 days, including the PQRS and all secondary cognitive measures. Additionally, the PQRS was administered via telephone at 3 days after anesthesia. Thus, PQRS data were available at 6 time points after anesthesia (15 minutes, 60 minutes, 1 day, 3 days, 7 days, and 30 days) for assessment of recovery to baseline within 30 days of anesthesia exposure, and secondary measures (NIH Toolbox Cognitive Battery and paper-and-pencil tests) were given at 1 day, 7 days, and 30 days after anesthesia, in addition to the baseline testing. Although the focus of the study was on cognitive recovery over the 30 days after anesthesia, the PQRS was administered at 6 and 12 months postanesthesia as well. This article reports the full analyses of the cognitive primary and secondary end points for this study. Reports of neuroimaging findings will be published separately. We have already reported a pattern of anticorrelated resting-state fMRI activity in the early postanesthesia recovery period that did not vary by age decade.16

Statistical Analysis

Descriptive statistics by age decade were calculated. All analyses were adjusted for covariates: gender (male/female), level of education (<16 years/16 years or more), and race (White/non-White). We treated age decade as a categorical variable (40–49-year referent decade, 50–59 years, 60–69 years, and 70–80 years) for cognitive outcomes to allow for possible nonlinear (quadratic and cubic) associations with age decade. This decision was taken before data collection, although our power analysis conducted before beginning the study used a straightforward linear effect (see the “Sample Size Calculation” section). Discrete-time logit regression, adjusted for covariates, was used to test whether each age decade differed in the time to recovery relative to the youngest age decade (40–49) on the primary outcome (PQRS cognitive scale). Discrete-time logit regression can be applied when time is measured at a discrete (not continuous) time scale; thus, it accommodates multiple persons having the same apparent time the event occurs.10,17,18 All participants with baseline PQRS data were included. For analyses of the time course of the secondary cognitive outcome measures, we used linear mixed models (LMMs) with a spatial power covariance structure of repeated observations within participant, and person-specific random intercepts. These models assume that the data are missing at random; thus, participants with missing observations contribute to the model estimation. LMMs for secondary outcomes included the same covariates, baseline, and an interaction term for age decade by days postanesthesia. This interaction term allowed for comparison to the referent decade at each postanesthesia observation. Least-squares mean score differences of each age decade to the referent decade were estimated for each day postanesthesia. This allows for interpretation on the scale of each outcome. 
As a sensitivity analysis, we estimated discrete-time logit regression models, as for the primary outcome of the PQRS cognitive scale, on whether participants returned to their baseline score on each secondary measure. By treating age as categories based on decades, we were able to have nonlinear effects without having to add quadratic or cubic terms to the statistical models. Analyses were performed in SAS 9.4 (SAS Institute) with 2-sided tests.
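The data layout behind a discrete-time logit model can be sketched as follows. This is a minimal illustration in Python, not the authors' SAS code: the function names and the toy records are hypothetical, and the regression fit itself (a logistic model of the event indicator on time point, age decade, and covariates) is omitted. Each participant contributes one row per time point while still "at risk" of recovering, and the event indicator is 1 only at the time point where recovery is first observed.

```python
from collections import defaultdict

# Study time points in order; the list index is the discrete period.
TIME_POINTS = ["15 min", "60 min", "1 d", "3 d", "7 d", "30 d"]


def person_period_rows(participant_id, decade, recovered_at):
    """Expand one participant into one row per time point at risk.

    recovered_at is the index into TIME_POINTS of first recovery,
    or None if the participant had not recovered by 30 d (censored).
    """
    last = recovered_at if recovered_at is not None else len(TIME_POINTS) - 1
    rows = []
    for period in range(last + 1):
        rows.append({
            "id": participant_id,
            "decade": decade,
            "period": period,
            # event = 1 only in the period where recovery first occurs
            "event": int(recovered_at is not None and period == recovered_at),
        })
    return rows


def discrete_hazards(rows):
    """Per-period hazard: recoveries divided by participants still at risk."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for row in rows:
        at_risk[row["period"]] += 1
        events[row["period"]] += row["event"]
    return {p: events[p] / at_risk[p] for p in sorted(at_risk)}


# Hypothetical toy records, NOT study data: (id, decade, period of recovery).
toy = [("a", "40-49", 0), ("b", "40-49", 2), ("c", "70-80", 2), ("d", "70-80", None)]
rows = [r for pid, dec, t in toy for r in person_period_rows(pid, dec, t)]
hazards = discrete_hazards(rows)
```

Because several participants can share the same period in this layout, tied recovery times need no special handling, which is the property the paragraph above relies on.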

Sample Size Calculation

Sample size and power for primary and secondary outcomes were calculated with PASS.12,19 Returning to baseline PQRS over 30 days required 72 participants to detect a hazard ratio of 1.03 per year for age (β = 0.033), assuming a standard deviation for age of 11.2 years, 80% power, and a type I error of 0.05, and adjusting for other characteristics expected to have a generalized R2 of 0.2. This can be interpreted as approximately 3% slower recovery per year relative to 40-year-old participants. For secondary outcomes, although there were fewer time points, the effect-size calculation is similar. We powered this study based on 7 secondary cognitive outcome measures (6 paper-and-pencil tests and the NIH Toolbox Cognitive Battery composite score). Assuming all 72 participants would return to baseline over 30 days, a Bonferroni-adjusted type I error rate of 0.05/7 = 0.0071, and other covariates having a generalized R2 of 0.2 with age, a hazard ratio of 1.05 (β = 0.05) could be detected with 80% power and a familywise type I error of 0.05 (after Bonferroni adjustment 0.0071 for each outcome). Pilot data used for study planning had 3 of 4 participants return to baseline at 15 minutes after anesthesia and all 4 by 7 days. Based on these calculations, planned enrollment (based on completion of the anesthesia session) was 19 participants per age decade (40–49, 50–59, 60–69, and 70–80 years), as 18 per age decade achieved adequate power and an additional participant per age decade was included in the event of a dropout.
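The arithmetic quoted in this section can be checked with a few lines of Python. This is only a sketch of the conversions stated in the text (coefficient to hazard ratio, Bonferroni split, participants per decade); it does not reproduce the PASS power calculation, and the variable names are illustrative.

```python
import math

# Hazard ratio per year of age is the exponentiated regression coefficient.
beta_primary = 0.033
hr_primary = math.exp(beta_primary)      # about 1.03, as stated for the primary outcome

beta_secondary = 0.05
hr_secondary = math.exp(beta_secondary)  # about 1.05, as stated for secondary outcomes

# Bonferroni adjustment: familywise error split evenly over 7 secondary outcomes.
alpha_family = 0.05
n_secondary_outcomes = 7
alpha_per_outcome = alpha_family / n_secondary_outcomes

# 72 participants spread over 4 age decades, plus 1 extra per decade for dropout.
per_decade_for_power = 72 // 4
planned_per_decade = per_decade_for_power + 1
```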


We met enrollment targets, based on completion of the anesthesia session, in all decades except the 60- to 69-year-old participants. Recruitment lagged in this age decade because of difficulties in identifying prospective participants who met our inclusion criteria and were able to commit the time required to complete the study. Following consultation with the study Data and Safety Monitoring Board and the study biostatistician (H.A.) in April 2019, we closed enrollment for the study in May 2019 with 13 of 19 planned 60- to 69-year-old participants based on the fact that enrollment for the other 3 decades was already complete and recruitment of the remaining planned participants in the 60- to 69-year-old decade would have been highly unlikely to have changed the primary outcome. A CONSORT (Consolidated Standards of Reporting Trials) diagram for the entire study is presented in Figure 1, and participant demographics and baseline characteristics are presented in Table 1.

Table 1. - Demographics of Study Participants
Characteristics Age decade (y) number of participants (%) P value
40–49 50–59 60–69 70–80
Total enrolled 20 19 13 19
 Male 9 (45.0) 14 (73.7) 7 (53.8) 10 (52.6) .32
 Female 11 (55.0) 5 (26.3) 6 (46.2) 9 (47.4)
 Hispanic or Latino 5 (25.0) 3 (15.8) 1 (7.7) 1 (5.3) .30
 Not Hispanic or Latino 15 (75.0) 16 (84.2) 12 (92.3) 18 (94.7)
 Black or African American 12 (60.0) 11 (57.9) 3 (23.1) 5 (26.3) .014a
 White 6 (30.0) 7 (36.8) 9 (69.2) 14 (73.7)
 Other 2 (10.0) 1 (5.3) 1 (7.7) 0
 High school (12 y) 3 (15.0) 3 (15.8) 0 2 (10.5) .007b
 Some college (13–15 y) 4 (20.0) 11 (57.9) 2 (15.4) 5 (26.3)
 College (16 y) 8 (40.0) 2 (10.5) 8 (61.5) 6 (31.6)
 More than college (>16 y) 5 (25.0) 3 (15.8) 3 (23.1) 6 (31.6)
Age (y)
 Mean 45.5 54.9 64.9 73.4 <.005
 Median 45.57 54.95 64.1 72.6
 Standard deviation 2.7 2.9 3.3 3.4
 Minimum 40.1 50.8 60.0 70.0
 Maximum 49.9 59.8 69.5 80.7
American Society of Anesthesiologists status
 1 16 (80.0) 17 (89.5) 12 (92.3) 12 (63.2) .13
 2 4 (20.0) 2 (10.5) 1 (7.7) 7 (36.8)
Premorbid medical conditions
 0 16 (80.0) 18 (94.7) 9 (69.2) 13 (68.4) .18c
 1 3 (15.0) 1 (5.3) 3 (23.1) 4 (21.1)
 2 0 0 1 (7.7) 2 (10.5)
 >3 1 (5.0) 0 0 0
Number of medications on initial assessment
 0 18 (90.0) 18 (94.7) 9 (69.2) 15 (78.9) .19d
 1 1 (5.0) 0 2 (15.4) 1 (5.3)
 2 1 (5.0) 1 (5.3) 1 (7.7) 3 (15.8)
 >3 0 0 1 (7.7) 0
aWhite versus non-White.
b<16 y vs ≥16 y.
c0 vs >0.
d0 vs >0.

Figure 1.:
CONSORT diagram. CONSORT indicates Consolidated Standards of Reporting Trials.

The paper-and-pencil and NIH Toolbox cognitive test composite scores showed similar preanesthesia baseline performance across age decades, confirming that even our oldest participants were cognitively healthy before anesthesia. The only test in which older participants performed significantly worse compared to 40- to 49-year-old participants at baseline was Trails B. On average, the oldest decade took longer to complete this test, although not longer than national norms for their age.

Primary Outcome

Table 2. - Cumulative Number and Proportion of Participants Recovered on Postoperative Quality of Recovery Cognitive Scale at Each Study Time Point by Age Decade
Number/total number (%)
Age decade 15 min 60 min 1 d 3 d 7 d 30 d
40–49 3/19 (15.8) 8/19 (42.1) 17/19 (89.5) 18/19 (94.7) 18/19 (94.7) 18/19 (94.7)
50–59 5/19 (26.3) 11/19 (57.9) 18/19 (94.7) 19/19 (100) 19/19 (100) 19/19 (100)
60–69 1/13 (7.7) 8/13 (61.5) 12/13 (92.3) 12/13 (92.3) 12/13 (92.3) 12/13 (92.3)
70–80 3/18 (16.7) 9/18 (50.0) 16/18 (88.9) 17/18 (94.4) 17/18 (94.4) 18/18 (100)
Postoperative Quality of Recovery Cognitive Scale could not be determined for 2 participants (1 in the 40–49-y age decade and 1 in the 70–80-y age decade), bringing the total sample to N = 69.
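As a quick check on how the per-decade counts in Table 2 combine into pooled recovery proportions, the cumulative counts can be summed across decades. The sketch below copies the counts from Table 2; the helper name `pooled_percent` is illustrative.

```python
# Cumulative recovered counts from Table 2: decade -> (n, recovered by each time point).
TIME_POINTS = ["15 min", "60 min", "1 d", "3 d", "7 d", "30 d"]
TABLE2 = {
    "40-49": (19, [3, 8, 17, 18, 18, 18]),
    "50-59": (19, [5, 11, 18, 19, 19, 19]),
    "60-69": (13, [1, 8, 12, 12, 12, 12]),
    "70-80": (18, [3, 9, 16, 17, 17, 18]),
}


def pooled_percent(time_label):
    """Percentage of all participants recovered by the given time point."""
    idx = TIME_POINTS.index(time_label)
    recovered = sum(counts[idx] for _, counts in TABLE2.values())
    total = sum(n for n, _ in TABLE2.values())
    return 100 * recovered / total


day1 = pooled_percent("1 d")    # 63 of 69 participants
day30 = pooled_percent("30 d")  # 67 of 69 participants
```

Rounded to whole percentages, these are the pooled 1-day and 30-day recovery figures for the N = 69 participants with usable PQRS data.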

Figure 2.:
Performance on the primary outcome (time to recover to baseline on the PQRS) and selected secondary cognitive measures adjusted for gender, education, and race and within-participant correlations over time. Adjustment for multiple comparisons was made with the Bonferroni procedure to preserve an overall 2-sided type 1 error rate at 0.05. The least-squares means and error bars indicating the associated 95% confidence intervals (1.96 times the standard error of the least-squares means) are shown. Higher scores reflect a better performance on CVLT and NIH Toolbox, and lower (faster) scores reflect a better performance on trails B. All age decades by time interactions were nonsignificant. CVLT indicates California Verbal Learning Test; NIH, National Institutes of Health; PQRS, Postoperative Quality of Recovery Scale.

Two participants had missing or incomplete baseline PQRS data: 1 (40–49-year-old decade) did not receive the PQRS at baseline because of a research coordinator error and 1 (70–80-year-old decade) refused 1 of the cognitive questions during the baseline and day 1 tests, so time to recover on PQRS could not be determined for these participants. There was no overall association of age decade with time to recovery to baseline on the PQRS cognitive scale (N = 69 cases with complete data): Wald χ2(3 df) = 1.83, P = .609. None of the age decades significantly differed relative to the 40- to 49-year-old participants; parameter estimates (hazard ratios) for each of the other decades compared to 40- to 49-year olds were: 50 to 59 years (hazard ratio, 1.41; 95% confidence interval [CI], 0.50–4.03; P = .517), 60 to 69 years (hazard ratio, 1.03; 95% CI, 0.35–3.00; P = .963), and 70 to 80 years (hazard ratio, 0.69; 95% CI, 0.25–1.88; P = .470). The vast majority (91%) of participants returned to baseline cognitive performance within 1 day of anesthesia (Table 2; Figure 2). The rapid rate of recovery to baseline accounts for the wide CIs on these measures; a design with more measurement points within the first day would improve resolution, but such differences are unlikely to be clinically meaningful. Two participants did not recover to baseline by day 30. One of these (60–69-year-old decade) returned within the tolerance limit of baseline performance for each of the 5 PQRS cognitive test items at least once during postanesthesia testing, but never all 5 during the same test administration, the criterion for recovery to baseline. The other (40–49-year-old decade) scored high on 1 cognitive test item (word generation) during the baseline test and did not return to this level in postanesthesia testing. 
Performance on all the other neuropsychological tests for these 2 participants was equal to or better than age norms and within a standard deviation of their baseline performance, so there was no evidence of POCD in these 2 participants. Both of these participants had returned to baseline on the PQRS at the 6-month PQRS follow-up test.
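The reported hazard ratios, confidence intervals, and P values are linked through the usual Wald approximation on the log scale. The sketch below, which assumes a symmetric normal-theory interval on log(HR) and is not taken from the authors' analysis code, recovers an approximate P value from a published hazard ratio and its 95% CI. Because the published values are rounded, the result only approximates the reported P value.

```python
import math
from statistics import NormalDist


def wald_p_from_ci(hr, lower, upper, level=0.95):
    """Approximate 2-sided Wald P value from a hazard ratio and its CI.

    Assumes the interval is exp(log(hr) +/- z_crit * SE), so the standard
    error can be read back from the width of the interval on the log scale.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hazard ratios and 95% CIs for the PQRS as reported in the text above.
p_50_59 = wald_p_from_ci(1.41, 0.50, 4.03)
p_60_69 = wald_p_from_ci(1.03, 0.35, 3.00)
p_70_80 = wald_p_from_ci(0.69, 0.25, 1.88)
```

The three values land close to the reported P = .517, .963, and .470, which is a useful consistency check when reading the tables.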

Secondary Outcomes

Table 3. - Summary of Results for Time Point by Age Decade Analyses (Linear Mixed Models) of Scores Summary on Each Secondary Outcome As Well As the Subtests of the NIH Toolbox Cognitive Battery
Test scores Age decade × time interaction Time point Least-squares-mean (standard error)
40–49 y (n = 20) 50–59 y (n = 19) 60–69 y (n = 13) 70–80 y (n = 19)
Secondary outcomes
 NIH Toolbox compositea F(9, 197) = 0.53, P = .85 Baseline 48.1 (1.5) 53.3 (1.5) 50.1 (1.9) 54 (1.5)
1 d 52.2 (1.5) 57.2 (1.5) 53.2 (1.9) 57.4 (1.5)
7 d 54.8 (1.5) 59.4 (1.5) 54.6 (1.9) 59.3 (1.5)
30 d 54.5 (1.5) 59.5 (1.5) 54.4 (1.9) 59.6 (1.5)
 Logical memory immediate F(9, 201) = 0.81, P = .61 Baseline 14.6 (0.8) 14.4 (0.8) 12.9 (1) 14.2 (0.8)
1 d 17.9 (0.8) 17.4 (0.8) 15.8 (1) 17.4 (0.8)
7 d 19.8 (0.8) 19.5 (0.8) 18 (1) 18.2 (0.8)
30 d 19.8 (0.8) 19.8 (0.8) 17.4 (1) 19.4 (0.8)
 Category fluency—animals F(9, 201) = 0.53, P = .85 Baseline 23.1 (1.6) 22.8 (1.6) 19.2 (2) 19.9 (1.6)
1 d 23.6 (1.6) 24.2 (1.6) 21 (2) 21.4 (1.6)
7 d 24.5 (1.6) 23.3 (1.6) 20.7 (2) 21 (1.6)
30 d 24 (1.6) 24.9 (1.6) 20.5 (2) 21.9 (1.6)
 Trails A time F(9, 201) = 0.47, P = .90 Baseline 30.1 (2.5) 28.6 (2.6) 28 (3.2) 37.9 (2.6)
1 d 25.1 (2.5) 26.5 (2.6) 25.9 (3.2) 35.3 (2.6)
7 d 25.6 (2.5) 25.6 (2.6) 23.8 (3.2) 33.2 (2.6)
30 d 22.5 (2.5) 23.9 (2.6) 24.8 (3.2) 32.1 (2.6)
 Trails B time F(9, 201) = 1.11, P = .36 Baseline 65.7 (8.5) 68.1 (8.8) 86 (10.7) 112.5 (8.7)
1 d 69.8 (8.5) 49.1 (8.8) 71.7 (10.7) 101.6 (8.7)
7 d 56.7 (8.5) 54.2 (8.8) 75.8 (10.7) 104.8 (8.7)
30 d 54.6 (8.5) 50.5 (8.8) 81 (10.7) 87.7 (8.7)
 Logical memory delayed F(9, 201) = 1.31, P = .24 Baseline 14.3 (0.8) 13.1 (0.9) 12.1 (1) 13.7 (0.8)
1 d 17.3 (0.8) 17.2 (0.9) 14.6 (1) 17 (0.8)
7 d 19.2 (0.8) 18.9 (0.9) 16.9 (1) 18.1 (0.8)
30 d 18.8 (0.8) 19.3 (0.9) 18.1 (1) 19 (0.8)
 CVLT delayed recall F(9, 201) = 1.46, P = .16 Baseline 10.7 (0.6) 10.7 (0.6) 9.2 (0.8) 9.4 (0.6)
1 d 13.4 (0.6) 13.5 (0.6) 12.2 (0.8) 11.5 (0.6)
7 d 14.5 (0.6) 14.7 (0.6) 13.2 (0.8) 11.7 (0.6)
30 d 14.4 (0.6) 15.1 (0.6) 12.9 (0.8) 12.1 (0.6)
NIH Toolbox subtests
 Dimensional change card sorta F(9, 197) = 0.62, P = .78 Baseline 40.9 (2.5) 46.1 (2.6) 54.1 (3.2) 50.3 (2.6)
1 d 47.1 (2.5) 47.8 (2.6) 54.6 (3.2) 53.1 (2.6)
7 d 48 (2.5) 50.1 (2.7) 57.2 (3.2) 55 (2.6)
30 d 47.3 (2.5) 52.1 (2.6) 58.9 (3.2) 54.3 (2.6)
 Flanker inhibitory control and attentiona F(9, 197) = 0.66, P = .74 Baseline 39.3 (2.7) 43.2 (2.9) 42 (3.5) 55.6 (2.8)
1 d 39.7 (2.7) 43.7 (2.9) 41.7 (3.5) 59.7 (2.8)
7 d 42.8 (2.7) 46.3 (2.9) 44.3 (3.5) 59.3 (2.8)
30 d 42.7 (2.7) 47.2 (2.9) 45.3 (3.5) 61.1 (2.8)
 Picture sequence memorya F(9, 197) = 3.09, P = .0017 Baseline 50 (2.4) 51.8 (2.5) 45.3 (3) 44.9 (2.4)
1 d 59.7 (2.4) 63.5 (2.5) 54.6 (3) 46.7 (2.4)
7 d 66 (2.4) 66.2 (2.5) 55.1 (3.1) 49.1 (2.4)
30 d 65.8 (2.4) 63.4 (2.5) 51.9 (3.1) 52 (2.4)
 List sorting working memorya F(9, 197) = 0.66, P = .75 Baseline 50.1 (2.2) 54.3 (2.3) 52.7 (2.8) 50 (2.2)
1 d 52.6 (2.2) 56.8 (2.3) 54.7 (2.8) 53.7 (2.2)
7 d 55.2 (2.2) 58.8 (2.3) 53.3 (2.8) 54.9 (2.2)
30 d 54.4 (2.2) 58.9 (2.3) 54.9 (2.8) 52.3 (2.2)
 Oral reading recognitionb F(9, 196) = 1.04, P = .41 Baseline 57.9 (2.4) 62.7 (2.5) 51.5 (3.1) 58.9 (2.5)
1 d 58.4 (2.4) 62.3 (2.5) 52.5 (3.1) 61 (2.5)
7 d 57.1 (2.4) 63 (2.5) 52.8 (3.1) 61.2 (2.5)
30 d 58.4 (2.4) 62.9 (2.5) 51.6 (3.1) 63.9 (2.5)
 Picture vocabularyc F(9, 192) = 1.59, P = .12 Baseline 54.1 (3.3) 61.5 (3.4) 50.6 (4.2) 64.9 (3.4)
1 d 53.8 (3.3) 64 (3.5) 52.9 (4.2) 65.9 (3.4)
7 d 56.1 (3.3) 63.6 (3.5) 52.9 (4.2) 66 (3.4)
30 d 54.2 (3.3) 66.9 (3.4) 51.3 (4.2) 64.9 (3.4)
 Pattern comparison processing speedd F(9, 196) = 0.54, P = .85 Baseline 44.4 (3.1) 53.4 (3.2) 54.2 (3.9) 53.1 (3.1)
1 d 54.1 (3.1) 62.3 (3.2) 61.2 (3.9) 62.8 (3.1)
7 d 58.5 (3.1) 67.9 (3.2) 66 (3.9) 70.2 (3.1)
30 d 58.5 (3.1) 64.9 (3.2) 66.7 (3.9) 70 (3.1)
Least-squares means are covariate-adjusted (gender, race, and education). P values shown are not adjusted for multiple comparisons.
Abbreviations: CVLT, California Verbal Learning Test; NIH, National Institutes of Health.
aData are missing for all NIH Toolbox tests for 2 participants at 1 occasion and for 1 participant at 2 occasions. N was 18 for 50–59 y at 1 and 7 d and 12 for 60–69 y at 7 and 30 d.
bData are missing for an additional participant in the 70–80-y group on the 30-d assessment. N was 18 for 50–59 y at 1 and 7 d, 12 for 60–69 y at 7 and 30 d, and 18 for 70–80 y at 30 d.
cData are missing for 5 additional participants at 1 time point for each participant. N was 18 for 50–59 y at 1 d, N was 17 for 50–59 y at 7 d, N was 12 for 60–69 y at 7 and 30 d, N was 17 for 70–80 y at 1 d, N was 18 for 70–80 y at 7 d, and N was 18 for 70–80 y at 30 d.
dData are missing for an additional participant in the 50–59-y group on the 30-d assessment. N was 18 for 50–59 y at 1, 7, and 30 d, and 12 for 60–69 y at 7 and 30 d.

Table 4. - Summary of Results for Time-to-Recovery Analyses (Discrete-Time Logit Regression) for the Primary Outcome, Secondary Outcomes, and Subtests of the NIH Toolbox Cognitive Battery
Time to recovery Overall effect of age decade HR versus 40–49-y old (N = 20) reference decade (95% CI)
50–59 y (n = 19) P 60–69 y (n = 13) P 70–80 y (n = 19) P
Primary outcome
 PQRSa χ2(3) = 1.83, P = .61 1.41 (0.50–4.03) .52 1.03 (0.35–3.00) .96 0.69 (0.25–1.88) .47
Secondary outcomes
 NIH Toolbox compositeb χ2(3) = 0.92, P = .82 0.40 (0.05–3.03) .37 0.47 (0.06–4.07) .50 0.44 (0.06–3.37) .43
 Logical memory immediate χ2(3) = 0, P = 1.00 1.00 (0.51–1.95) 1.00 1.00 (0.48–2.08) 1.00 1.00 (0.51–1.95) 1.00
 Category fluency—animals χ2(3) = 0.57, P = .90 1.10 (0.55–2.20) .78 1.29 (0.61–2.74) .50 1.25 (0.63–2.49) .53
 Trails A time χ2(3) = 0.05, P = .997 1.01 (0.52–1.98) .97 1.07 (0.51–2.23) .86 1.06 (0.53–2.11) .87
 Trails B time χ2(3) = 1.85, P = .60 1.59 (0.79–3.20) .20 1.37 (0.64–2.95) .42 1.18 (0.60–2.33) .64
 Logical memory delayed χ2(3) = 0, P = 1.00 1.00 (0.51–1.95) 1.00 1.00 (0.48–2.08) 1.00 1.00 (0.51–1.95) 1.00
 CVLT delayed recall χ2(3) = 0, P = 1.00 1.00 (0.51–1.95) 1.00 1.00 (0.48–2.08) 1.00 1.00 (0.51–1.95) 1.00
NIH Toolbox subtestsb
 Dimensional change card sort χ2(3) = 2.17, P = .54 0.67 (0.20–2.20) .50 1.02 (0.26–3.99) .98 1.89 (0.49–7.32) .35
 Flanker inhibitory control and attention χ2(3) = 0.54, P = .91 1.23 (0.38–4.02) .73 0.98 (0.23–4.19) .97 1.55 (0.43–5.53) .50
 Picture sequence memory χ2(3) = 6.62, P = .085 0.27 (0.02–3.03) .29 0.27 (0.02–3.55) .32 0.08 (0.01–0.76) .03
 List sorting working memory χ2(3) = 0.33, P = .95 1.08 (0.28–4.20) .91 0.78 (0.19–3.19) .73 1.23 (0.30–5.12) .78
 Oral reading recognition χ2(3) = 3.33, P = .34 1.51 (0.53–4.28) .44 3.44 (0.89–13.29) .07 1.70 (0.57–5.04) .34
 Picture vocabulary χ2(3) = 1.27, P = .74 1.63 (0.46–5.79) .45 0.83 (0.22–3.16) .78 1.57 (0.43–5.80) .49
 Pattern comparison processing speed χ2(3) = 1.05, P = .79 2.51 (0.33–18.93) .37 1.00 (0.12–7.99) .998 1.87 (0.25–14.16) .55
P values shown are not adjusted for multiple comparisons.
Abbreviations: CI, confidence interval; CVLT, California Verbal Learning Test; HR, hazard ratio; NIH, National Institutes of Health; PQRS, Postoperative Quality of Recovery Scale.
aN = 19 for 40–49 y and N = 18 for 70–80 y (due to missing data at baseline assessment).
bAs noted in Table 3, some participants were missing data for individual NIH Toolbox assessments at some time points. Times to recovery for these participants were determined based on available data.

For the NIH Toolbox composite, there was no evidence that age decade was associated with postanesthesia test performance as a function of time, as would be the case if older participants tended to decline (or improve more slowly) in postanesthesia testing as compared to 40- to 49-year-old participants. The interactions between age decade and time point on the NIH Toolbox Cognitive composite score (mean across all 7 tests) and the paper-and-pencil tests were not significant (P values > .16; Table 3). There were also no significant associations of age decade with the proportion of participants returning to their baseline score on any of these measures (P values > .6; Table 4). Time to recovery on the PQRS (the primary outcome) and covariate-adjusted least-squares means for 3 selected cognitive measures (CVLT A delayed recall, Trail Making Test B time to complete, and NIH Toolbox Cognitive Battery composite score) representing different cognitive domains and test modalities are illustrated in Figure 2, showing similar improvement in performance across time on all 3 secondary measures for all the age decades. We also analyzed each of the 7 subtests of the NIH Toolbox Cognitive Battery, which showed no associations of age decade with time to return to performance equal to or greater than their baseline score on any of the subtest measures. For scores on 6 of 7 subtests, interactions between age decade and time were nonsignificant. For the picture sequence memory test, the interaction was significant (P = .0017). This test requires participants to place a sequence of images in order based on a verbal narrative. It includes 3 forms (different narratives and pictures), but our participants saw the same pictures on each of the 4 test occasions. As such, the age by time interaction on this test may reflect chance, a real residual effect of anesthesia, or age differences in strategy on this specific test. 
Detailed statistical results for the analyses of primary and secondary outcomes are presented in Table 3 (LMMs for secondary outcomes) and Table 4 (time to recovery analyses for primary and secondary outcomes).


In this study, we found no association between age decade and time to cognitive recovery within 30 days from anesthesia in healthy volunteers without dementia. In general, recovery was rapid, with 91% of participants recovered to baseline on the primary endpoint measure within 1 day of anesthesia and 97% recovered on this measure within 30 days. There were no differences between older participants (50–59-, 60–69-, or 70–80-year old) relative to 40- to 49-year-old participants. There was no indication of differences among age decades on postanesthesia cognitive function on any of the computerized or paper-and-pencil cognitive tests that were our secondary measures, with participants maintaining or improving their performance on these tests postanesthesia with no differences in postanesthesia cognitive performance as a function of age (with the exception of 1 of 7 subtests of the NIH Toolbox Cognitive Battery). Strengths of this study include our ability to model changes in recovery and cognitive performance after anesthesia without the bias of comorbid condition or of surgical intervention, both of which are likely to play a role in cognitive recovery. An additional strength is the use of comprehensive pre- and postanesthesia cognitive testing, including a brief assessment used in the perianesthesia setting, well-validated paper-and-pencil neuropsychological tests, and a computerized test battery. Moreover, the participants in our study included 2 categories of individuals in the age range most vulnerable to POD and POCD.2,20 We were also able, in a secondary study from this sample, to determine that general anesthesia did not affect plasma biomarkers of neural injury or Alzheimer’s disease,21 providing further evidence that advanced age per se is not associated with neurobiological impairment after general anesthesia.

There are also important limitations of our study. The use of healthy, nonsurgical participants with normal cognition at baseline may limit the applicability to clinical populations, who may have significant comorbidities and/or poor preoperative cognition that are risk factors for postoperative neurocognitive disorders.22 This also limited our ability to include ASA status and the number of medications as covariates in analyses because of the small sample size. The anesthetic administered was dose adjusted for age, depth of anesthesia was monitored, and exposure was limited to 2 hours. In general practice, many elderly patients are overdosed and not monitored for anesthetic depth, and surgeries can last for many hours. Thus, the effect of overdose of anesthetics in elderly patients remains unclear, and it is unlikely for ethical reasons that such a study will be conducted. We only examined sevoflurane general anesthesia, so our results may not generalize to other anesthetic agents. However, available evidence suggests either that the choice of volatile versus IV anesthesia does not affect the incidence of postoperative neurocognitive disorders or that risk is increased after volatile anesthesia.23–26 Our design used repeated testing such that practice effects may have obscured subtle differences. Indeed, participants generally improved on tests with time, suggestive of practice effects. We lacked a comparison group that did not receive anesthesia; however, all participants underwent testing on the same schedule, which would have allowed us to separate interactions of age decade with anesthesia from practice effects on the neuropsychological tests. Although this is an important consideration, clear age-related increases in incidence of POD and POCD2,20 make age differences by decade in time to recovery the primary question of interest, which we were able to investigate without a nonanesthesia comparison group. Finally, the interpretation of a lack of difference between age decades can be challenging.
The need to recruit healthy participants and expose them to 2 hours of general anesthesia for research purposes was balanced against the sample size required for statistical power to detect a reasonable association of age with the rate of cognitive recovery after anesthesia. Although the PQRS cognitive scale was very sensitive to the time postanesthesia in all age decades, recovery did not vary across the age decades. We cannot rule out a more subtle effect of age that we were not adequately powered to detect with our sample size, which included a total of 71 participants, of whom 32 were aged 60 or over. However, after day 1, only 2 persons aged 40 to 49, 1 person aged 50 to 59, 1 person aged 60 to 69, and 2 persons aged 70 to 80 had yet to recover. Thus, rather than enrolling more participants, we would need a design with more measurements within the first day to refine the estimate of time to recovery, but such small intervals may not reflect clinically important differences.
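
As an illustrative check of the arithmetic behind the counts quoted above, the following minimal Python sketch tallies the participants yet to recover after day 1 and expresses the remainder as a fraction of the sample. It assumes that all 71 participants contribute to the denominator, which the text does not state explicitly; the variable and function names are hypothetical and are not part of the study's analysis code.

```python
# Participants yet to recover after day 1, by age decade (counts quoted in the text).
not_recovered_after_day1 = {"40-49": 2, "50-59": 1, "60-69": 1, "70-80": 2}

# Total sample size quoted in the text; using it as the denominator is an assumption.
total_participants = 71


def recovered_fraction(not_recovered, total):
    """Return the fraction of the sample that has recovered, given unrecovered counts."""
    remaining = sum(not_recovered.values())
    return (total - remaining) / total


remaining = sum(not_recovered_after_day1.values())
fraction = recovered_fraction(not_recovered_after_day1, total_participants)
print(f"{remaining} of {total_participants} yet to recover; "
      f"{fraction:.1%} recovered by day 1")
```

Under this assumption, 6 of 71 participants had yet to recover after day 1, so roughly 9 in 10 had recovered, in line with the proportion reported for the primary endpoint. With so few unrecovered participants spread across the four decades, additional enrollment would add little information about between-decade differences after day 1.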

We found no association between age category and time to cognitive recovery after general anesthesia, although the strength of this inference is limited by our sample size. Given that surgery under general anesthesia elevates biomarkers of neural injury,27 surgical stress and inflammation may be the primary culprits, although it is a logical possibility that anesthesia might exacerbate these responses even if it does not cause them on its own21 or that anesthetic depth during surgery may moderate neurophysiology in meaningful ways that affect postoperative cognition.28–30 Finally, we cannot exclude the possibility that there are vulnerable subgroups based on preoperative comorbidity or the presence of geriatric syndromes such as frailty or cognitive impairment. Future studies may seek to focus on these patients to determine whether there is an association of anesthetic technique with cognitive recovery in patients with preexisting geriatric syndromes.


The authors acknowledge Dr Jeffrey Silverstein, MD (Icahn School of Medicine at Mount Sinai, original primary investigator, deceased). Many thanks to Rachelle Jacoby, Kirklyn Escondo, Jong Kim, Matthew Hartnett, Carolyn Fan, and Jim Leader (all of the Department of Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai) for their assistance with this study.


Name: Mark G. Baxter, PhD.

Contribution: This author helped conduct the study, analyze the data, and write the manuscript.

Conflicts of Interest: M. G. Baxter has been a consultant for Unity Biotechnology.

Name: Joshua S. Mincer, MD, PhD.

Contribution: This author helped conduct the study and write the manuscript.

Conflicts of Interest: None.

Name: Jess W. Brallier, MD.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Arthur Schwartz, MD.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Helen Ahn, MD.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Tommer Nir, MD, PhD.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Patrick J. McCormick, MD, MEng.

Contribution: This author helped conduct the study.

Conflicts of Interest: P. J. McCormick’s spouse holds equity in Johnson & Johnson.

Name: Mohammed Ismail, BS.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Margaret Sewell, PhD.

Contribution: This author helped conduct the study.

Conflicts of Interest: None.

Name: Heather G. Allore, PhD.

Contribution: This author helped analyze the data and write the manuscript.

Conflicts of Interest: None.

Name: Christine M. Ramsey, PhD.

Contribution: This author helped analyze the data.

Conflicts of Interest: None.

Name: Mary Sano, PhD.

Contribution: This author helped conduct the study and write the manuscript.

Conflicts of Interest: M. Sano has been a consultant for VTV Therapeutics, Hoffman-LaRoche, Biogen, CogRx, Bracket, Eisai, and Eli Lilly and Company, a member of the DSMB for AZTherapies and for NIA “ASPREE,” and an adjudicator for Trial Endpoint for Takeda Pharmaceutical.

Name: Stacie G. Deiner, MD.

Contribution: This author helped conduct the study and write the manuscript.

Conflicts of Interest: S. G. Deiner has served as a consultant for Merck and Covidien, received product support from Covidien and CASMED (processed electroencephalogram and oximetry monitors and sensors), and served as an expert witness for legal affairs.

This manuscript was handled by: Robert Whittington, MD.


    1. Evered LA, Silbert BS. Postoperative cognitive dysfunction and noncardiac surgery. Anesth Analg. 2018;127:496–505.
    2. Inouye SK, Westendorp RG, Saczynski JS. Delirium in elderly people. Lancet. 2014;383:911–922.
    3. Witlox J, Eurelings LS, de Jonghe JF, Kalisvaart KJ, Eikelenboom P, van Gool WA. Delirium in elderly patients and the risk of postdischarge mortality, institutionalization, and dementia: a meta-analysis. JAMA. 2010;304:443–451.
    4. Daiello LA, Racine AM, Yun Gou R, et al.; SAGES Study Group*. Postoperative delirium and postoperative cognitive dysfunction: overlap and divergence. Anesthesiology. 2019;131:477–491.
    5. Rudolph JL, Marcantonio ER, Culley DJ, et al. Delirium is associated with early postoperative cognitive dysfunction. Anaesthesia. 2008;63:941–947.
    6. Rudolph JL, Jones RN, Levkoff SE, et al. Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery. Circulation. 2009;119:229–236.
    7. Tang JX, Eckenhoff MF. Anesthetic effects in Alzheimer transgenic mouse models. Prog Neuropsychopharmacol Biol Psychiatry. 2013;47:167–171.
    8. Rasmussen LS, Johnson T, Kuipers HM, et al.; ISPOCD2 (International Study of Postoperative Cognitive Dysfunction) Investigators. Does anaesthesia cause postoperative cognitive dysfunction? A randomised study of regional versus general anaesthesia in 438 elderly patients. Acta Anaesthesiol Scand. 2003;47:260–266.
    9. Evered L, Scott DA, Silbert B, Maruff P. Postoperative cognitive dysfunction is independent of type of surgery and anesthetic. Anesth Analg. 2011;112:1179–1185.
    10. Mincer JS, Baxter MG, McCormick PJ, et al. Delineating the trajectory of cognitive recovery from general anesthesia in older adults: design and rationale of the TORIE (Trajectory of Recovery in the Elderly) project. Anesth Analg. 2018;126:1675–1683.
    11. Royse CF, Newman S, Chung F, et al. Development and feasibility of a scale to assess postoperative recovery: the post-operative quality recovery scale. Anesthesiology. 2010;113:892–905.
    12. Royse CF, Newman S, Williams Z, Wilkinson DJ. A human volunteer study to identify variability in performance in the cognitive domain of the Postoperative Quality of Recovery Scale. Anesthesiology. 2013;119:576–581.
    13. Heaton RK, Akshoomoff N, Tulsky D, et al. Reliability and validity of composite scores from the NIH Toolbox Cognition Battery in adults. J Int Neuropsychol Soc. 2014;20:588–598.
    14. Weintraub S, Dikmen SS, Heaton RK, et al. Cognition assessment using the NIH Toolbox. Neurology. 2013;80:S54–S64.
    15. Weintraub S, Salmon D, Mercaldo N, et al. The Alzheimer’s Disease Centers’ Uniform Data Set (UDS): the neuropsychologic test battery. Alzheimer Dis Assoc Disord. 2009;23:91–101.
    16. Nir T, Jacob Y, Huang KH, et al. Resting-state functional connectivity in early postanaesthesia recovery is characterised by globally reduced anticorrelations. Br J Anaesth. 2020;125:529–538.
    17. Allison PD. Discrete-time methods for the analysis of event histories. Sociol Methodol. 1982;13:61.
    18. Singer JD, Willett JB. It’s about time: using discrete-time survival analysis to study duration and the timing of events. J Educ Stat. 1993;18:155–195.
    19. PASS 12 Power Analysis and Sample Size Software. NCSS, LLC; 2012.
    20. Rasmussen LS, Larsen K, Houx P, Skovgaard LT, Hanning CD, Moller JT; ISPOCD group; The International Study of Postoperative Cognitive Dysfunction. The assessment of postoperative cognitive function. Acta Anaesthesiol Scand. 2001;45:275–289.
    21. Deiner S, Baxter MG, Mincer JS, et al. Human plasma biomarker responses to inhalational general anaesthesia without surgery. Br J Anaesth. 2020;125:282–290.
    22. Devore EE, Fong TG, Marcantonio ER, et al. Prediction of long-term cognitive decline following postoperative delirium in older adults. J Gerontol A Biol Sci Med Sci. 2017;72:1697–1702.
    23. Zhang Y, Shan GJ, Zhang YX, et al.; First Study of Perioperative Organ Protection (SPOP1) investigators. Propofol compared with sevoflurane general anaesthesia is associated with decreased delayed neurocognitive recovery in older adults. Br J Anaesth. 2018;121:595–604.
    24. Miller D, Lewis SR, Pritchard MW, et al. Intravenous versus inhalational maintenance of anaesthesia for postoperative cognitive outcomes in elderly people undergoing non-cardiac surgery. Cochrane Database Syst Rev. 2018;2018:CD012317.
    25. Qiao Y, Feng H, Zhao T, Yan H, Zhang H, Zhao X. Postoperative cognitive dysfunction after inhalational anesthesia in elderly patients undergoing major surgery: the influence of anesthetic technique, cerebral injury and systemic inflammation. BMC Anesthesiol. 2015;15:154.
    26. Hussain M, Berger M, Eckenhoff RG, Seitz DP. General anesthetic and the risk of dementia in elderly patients: current insights. Clin Interv Aging. 2014;9:1619–1628.
    27. Evered L, Silbert B, Scott DA, Zetterberg H, Blennow K. Association of changes in plasma neurofilament light and tau levels with anesthesia and surgery: results from the CAPACITY and ARCADIAN Studies. JAMA Neurol. 2018;75:542–547.
    28. Deiner S, Luo X, Silverstein JH, Sano M. Can intraoperative processed EEG predict postoperative cognitive dysfunction in the elderly? Clin Ther. 2015;37:2700–2705.
    29. Hesse S, Kreuzer M, Hight D, et al. Association of electroencephalogram trajectories during emergence from anaesthesia with delirium in the postanaesthesia care unit: an early sign of postoperative complications. Br J Anaesth. 2019;122:622–634.
    30. Soehle M, Dittmann A, Ellerkmann RK, Baumgarten G, Putensen C, Guenther U. Intraoperative burst suppression is associated with postoperative delirium following cardiac surgery: a prospective, observational study. BMC Anesthesiol. 2015;15:61.
    Copyright © 2021 International Anesthesia Research Society