The elderly constitute 12% of the population,1 but undergo one-third of the surgical procedures in the United States. Postoperative delirium (POD) is an acute change in cognitive status characterized by fluctuating consciousness and inattention occurring after surgery.2 It may affect up to 60% of certain elderly populations,3 and is associated with increased mortality, morbidity, and resource utilization.4 Because most operations are elective, there may be an opportunity to address modifiable risk factors before surgery or pursue perioperative management strategies that could decrease the risk of POD. Many studies of POD have used dementia screening tools to assess cognitive status.5 In these, the most important patient characteristic that predicts POD is preoperative cognitive dysfunction as measured with these instruments.6,7 However, most older patients presenting to surgery do not have obvious cognitive impairment, and less work has been done to explore how the more subtle degrees of cognitive impairment present in many elderly patients influence the risk of POD.8–10
In addition, the effect of POD on cognitive and functional outcomes in patients undergoing elective (as opposed to urgent) surgery is not known. Delirium in hospitalized medical patients is associated with long-term cognitive and functional decline.11–15 However, unlike many medical illnesses, surgery is episodic in nature. Thus, POD may be self-limiting and without adverse long-term cognitive and functional implications. Alternatively, it may be an indicator of limited cognitive and functional reserve and portend future decline.
The aims of this prospective cohort study in patients undergoing elective hip or knee arthroplasty were to: (1) determine whether sensitive neurocognitive tests in those without clinically apparent cognitive impairment predict POD, and (2) determine whether POD predicts decline in cognitive and functional outcomes 3 months postoperatively. We hypothesized that (1) even subtle decreases in preoperative cognitive function predict the occurrence of POD, and (2) the occurrence of POD is an independent predictor of diminished cognitive and functional status at 3 months after surgery.
After receiving IRB approval, patients ≥65 years old presenting to Mayo Clinic Rochester for elective hip or knee arthroplasty were screened for participation. Exclusion criteria included nonfluency in English, inability to give informed consent, and preoperative delirium. In addition, because one of the primary aims was to assess whether scores on sensitive neurocognitive tests in those without clinically apparent cognitive impairment predict POD, patients with baseline Mini-Mental State Examination (MMSE)16 scores ≤23 were excluded from further study after preoperative assessment.
At least 1 day before surgery, enrolled patients who had given informed written consent underwent assessment for cognitive and functional status, depression, alcohol abuse, and delirium. Neurocognitive tests were chosen to evaluate a range of domains. They included the following: (1) MMSE,16 a measure of global cognitive function. Scores may range from 0 to 30, with ≤23 indicating impairment. (2) The American National Adult Reading Test,17 a test of verbal intelligence. Scores for the American National Adult Reading Test are scaled with a mean of 100 and a standard deviation of 15. (3) The Auditory Verbal Learning Test (AVLT),18 a test of verbal learning and memory. Raw data from the 3 portions of the test (Learning Efficiency, Percent Retention, and Delayed Recall) were converted to age- and IQ-adjusted scaled scores (mean 100, SD 15) using data from Mayo's Older Americans Normative Studies.19,20 (4) The Controlled Oral Word Association Test (COWAT)21 measured verbal fluency. Again, raw scores were converted to age- and IQ-adjusted scaled scores (mean 10, SD 3) using data from Mayo's Older Americans Normative Studies.22 (5) Executive function was evaluated using the Stroop Color-Word Test (SCWT).23 Using data from Mayo's Older Americans Normative Studies, raw scores were converted to age- and IQ-adjusted scaled scores (mean 50, SD 10).22
The 20-item version of the Center for Epidemiological Studies Depression Scale24 was used to screen for depression. Scores can range from 0 to 60. Scores ≥16 indicate clinically significant psychological distress. Alcohol abuse was assessed using the CAGE questionnaire.25 Affirmative answers to ≥2 of the 4 items indicate alcohol abuse. The Specific Activity Scale26 assessed physical function. In this scale, individuals able to generate ≥7 metabolic equivalent tasks without symptoms are considered Class I; Classes II, III, and IV include individuals able to generate 5 to 7, 2 to 5, and <2 metabolic equivalent tasks, respectively. Functional status was evaluated using the activities of daily living (ADL) scale27 and the instrumental activities of daily living (IADL) scale.28 Delirium was assessed using the Confusion Assessment Method (CAM).29 The neurocognitive tests were administered in the same order for each patient by trained psychometricians and were scored by a neuropsychologist (MRT). The evaluation took approximately 55 minutes. All tests have been validated previously.
Anesthesia and analgesia management was at the discretion of the clinicians caring for the patients.
Postoperative Assessments in Hospital
The primary end point of POD was assessed by a trained research nurse who applied the CAM diagnostic algorithm twice per day, starting on the first postoperative day and continuing until hospital discharge. A priori, we defined POD as a positive CAM on or before postoperative day 4. Patients who first developed a positive CAM after postoperative day 4 were not considered to have POD; instead, we assumed that delirium in these patients was related to a postoperative event and not to the operation per se. These patients were not eligible to be controls in the follow-up portion of the study (see below). In addition, we collected information on ASA physical status, perianesthetic management, and patient demographics.
Assessments After Hospital Discharge
A case-control design was used to determine whether POD predicts declines in cognitive and functional outcomes at 3 months postoperatively. For each case of POD, a control patient was selected based on matching of age, gender, and procedure. When there were multiple possible control patients, the one whose initial MMSE score was closest to that of the patient with POD was chosen. Each case and control was invited to return for repeated neurocognitive and functional evaluation 3 months after surgery. If a control patient declined to return for testing, another was not selected so as not to bias the evaluation according to willingness to return.
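The control-selection rule described above can be illustrated with a minimal sketch. It is not the study's actual procedure: the `Patient` record, the `pick_control` function, and in particular the `age_tolerance` window are assumptions introduced here, because the text does not state how closely age had to match. The sketch also does not model the exclusion of patients with late-onset delirium from the control pool.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout, for illustration only.
    pid: str
    age: int
    gender: str
    procedure: str
    mmse: int
    pod: bool


def pick_control(
    case: Patient,
    candidates: Sequence[Patient],
    age_tolerance: int = 2,  # assumed window; not specified in the text
) -> Optional[Patient]:
    """Return the matched control whose baseline MMSE is closest to the case's."""
    eligible = [
        c
        for c in candidates
        if not c.pod
        and c.gender == case.gender
        and c.procedure == case.procedure
        and abs(c.age - case.age) <= age_tolerance
    ]
    if not eligible:
        return None
    # Tie-break among matches: smallest absolute MMSE difference.
    return min(eligible, key=lambda c: abs(c.mmse - case.mmse))


# Made-up patients, not study data.
case = Patient("case", 74, "F", "knee", 27, True)
pool = [
    Patient("a", 75, "F", "knee", 30, False),
    Patient("b", 73, "F", "knee", 28, False),
    Patient("c", 74, "M", "knee", 27, False),
    Patient("d", 74, "F", "hip", 27, False),
]
chosen = pick_control(case, pool)
```

In this made-up pool, patients "a" and "b" both match on age, gender, and procedure, and the MMSE tie-break decides between them.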
The relationship between individual preoperative characteristics and development of POD was tested using logistic regression. To identify a set of characteristics independently associated with the development of POD, a multiple logistic regression analysis was performed. For this analysis, all variables included in Tables 1 and 2 were entered in the first step and a backward elimination algorithm was used to eliminate nonsignificant variables. To assess the replication stability of this approach, bootstrap resampling was also used to identify the covariates most associated with the development of POD.30 For this procedure, 1000 bootstrap samples were constructed, and for each sample, a multiple logistic regression analysis was performed using a stepwise variable selection algorithm to identify the significant variables. This approach assumes that the prognostically important variables should be included in the final model for most bootstrap samples. The percentage of final bootstrap models that included a given variable was evaluated as a criterion for the prognostic importance of the variable.30
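The inclusion-frequency idea behind the bootstrap procedure can be sketched as follows. This is only an outline under stated assumptions: `bootstrap_inclusion` and `toy_selector` are hypothetical names, and the selector is a trivial stand-in for the stepwise multiple logistic regression described above, which is not implemented here.

```python
import random
from collections import Counter
from typing import Callable, Dict, Iterable, List, Sequence


def bootstrap_inclusion(
    rows: Sequence[dict],
    select_variables: Callable[[List[dict]], Iterable[str]],
    n_boot: int = 1000,
    seed: int = 0,
) -> Dict[str, float]:
    """Fraction of bootstrap samples whose final model includes each variable."""
    rng = random.Random(seed)
    n = len(rows)
    counts: Counter = Counter()
    for _ in range(n_boot):
        # Resample patients with replacement, keeping the original sample size.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        # set() guards against a selector reporting the same variable twice.
        counts.update(set(select_variables(sample)))
    return {name: count / n_boot for name, count in counts.items()}


def toy_selector(sample: List[dict]) -> List[str]:
    """Hypothetical stand-in for model selection: always keep 'adl'; keep 'age'
    only if POD patients are on average more than 2 years older."""
    pod_ages = [r["age"] for r in sample if r["pod"]]
    other_ages = [r["age"] for r in sample if not r["pod"]]
    chosen = ["adl"]
    if pod_ages and other_ages:
        if sum(pod_ages) / len(pod_ages) > sum(other_ages) / len(other_ages) + 2:
            chosen.append("age")
    return chosen


# Made-up rows for illustration only; not study data.
rows = [{"age": 65 + i % 20, "pod": i % 10 == 0} for i in range(100)]
inclusion = bootstrap_inclusion(rows, toy_selector, n_boot=200)
```

A variable that appears in most resampled models has a frequency near 1 and would be read as a stable predictor. A variable with a low frequency is more likely to have entered a single stepwise model by chance.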
The neurocognitive and functional outcomes of patients who developed POD and controls who did not were compared at assessments conducted preoperatively and 3 months after surgery using the 2-sample t test. For each group, the preoperative to postoperative change was compared to zero using the 1-sample t test. Paired comparisons between matched cases and controls were made using the paired t test. In addition to analyzing the changes in neurocognitive test scores as continuous variables, we also used changes in these scores to identify individuals who experienced clinically relevant postoperative cognitive decline (POCD). According to a standard definition,31 patients were considered to have experienced POCD if they declined by >1 SD on at least 2 of the 5 tests. The percentage of patients who experienced POCD was compared between those with POD versus controls using the Fisher exact test.
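The dichotomous POCD rule (a decline of more than 1 SD on at least 2 tests) can be expressed as a short sketch. The function name `has_pocd`, the test labels, and the example scores are hypothetical. Using the normative scaled-score SDs (15, 3, and 10) as the yardstick is an assumption, since the text does not say which SD defined a 1-SD decline.

```python
from typing import Mapping


def has_pocd(
    baseline: Mapping[str, float],
    followup: Mapping[str, float],
    sd: Mapping[str, float],
    min_tests: int = 2,
) -> bool:
    """True if the score fell by strictly more than 1 SD on at least min_tests tests."""
    declined = sum(
        1 for test in baseline if baseline[test] - followup[test] > sd[test]
    )
    return declined >= min_tests


# Assumed yardstick: SDs of the age- and IQ-adjusted scaled scores.
SCALE_SD = {
    "AVLT_learning": 15,
    "AVLT_retention": 15,
    "AVLT_recall": 15,
    "COWAT": 3,
    "SCWT": 10,
}

# Made-up scores for one hypothetical patient; not study data.
baseline = {"AVLT_learning": 100, "AVLT_retention": 100, "AVLT_recall": 100,
            "COWAT": 10, "SCWT": 50}
followup = {"AVLT_learning": 104, "AVLT_retention": 84, "AVLT_recall": 98,
            "COWAT": 6, "SCWT": 52}
pocd = has_pocd(baseline, followup, SCALE_SD)
```

Here the hypothetical patient declines by more than 1 SD on Percent Retention and on the COWAT, so the rule classifies the patient as having POCD even though two other scores improved.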
Data are presented as mean ± SD for continuous variables and number (percentage) for categorical variables. P values <0.05 are considered statistically significant.
Sample Size/Power Considerations
The primary aim was to assess whether lower scores on sensitive neurocognitive tests are an independent risk factor for POD. The sample size was based on the assumption that the incidence of POD would be approximately 10%. Under this assumption, a sample size of approximately 400 would be required to identify 40 patients who develop POD. In general, for comparing 2 groups, a sample size of 40 delirium patients and 360 nondelirium patients provides statistical power (2-tailed, α = 0.05) of approximately 85% to detect a difference between groups of 0.5 SD units. Thus, a total sample size of n = 400 would provide adequate statistical power to detect a medium effect size for the primary aim.32 The second aim of the study was to determine whether POD is associated with a decline in cognitive function 3 months after surgery. Because the sample size was determined for the primary aim only, findings from analyses used to investigate the secondary aim are presented using point estimates and corresponding 95% confidence intervals (CIs).
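The power figure quoted above can be approximately reproduced with a normal approximation to the 2-sample comparison. This is a sketch, not necessarily the method the authors used: the function name `two_sample_power` is hypothetical, and a t-based calculation would give a slightly different value.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(n1: int, n2: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate 2-tailed power to detect a standardized mean difference
    between groups of size n1 and n2 (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected test statistic when the true difference is effect_size SD units.
    shift = effect_size * sqrt(n1 * n2 / (n1 + n2))
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = two_sample_power(40, 360, 0.5)
print(f"approximate power: {power:.2f}")
```

With 40 and 360 patients and a difference of 0.5 SD, the expected test statistic is 0.5 × √36 = 3.0, which gives a power of about 0.85, in line with the stated 85%.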
Between November 2002 and June 2005, 432 patients were enrolled. Three were excluded because the surgical procedure was changed or canceled. Two were excluded because of preexisting diagnoses of cognitive impairment that were discovered upon medical record review after enrollment. After baseline testing, 9 patients were excluded because of low MMSE scores (≤23). Thus, this report includes data from 418 patients (Table 1). POD, as defined by a positive CAM on or before postoperative day 4, occurred in 42 of 418 patients (10.0%, 95% CI 7.0%–13.0%). Three additional patients developed delirium after postoperative day 4. All had significant medical complications and developed delirium after admission to the intensive care unit, 2 for myocardial infarction and 1 for pulmonary edema. By design, these patients were not considered to have POD. However, the results presented below do not change if data from these patients are included in the analysis. In all instances, POD resolved by the time of hospital discharge.
Unless otherwise specified, all univariate predictors of POD that were considered are presented in Tables 1 and 2. There were no differences between those who did and did not develop POD with regard to gender, surgical procedure, depressive symptoms (as measured by the Center for Epidemiological Studies Depression Scale), or medical and physical status, as measured by ASA physical status and Specific Activity Scale scores (not shown). In addition, there were no statistically significant differences in the prevalence of hypertension, congestive heart failure, coronary artery disease, cerebrovascular disease, asthma, chronic obstructive pulmonary disease, diabetes mellitus, renal disease, or tobacco use between those who did and did not develop POD (data not shown). Alcohol abuse (as measured by the CAGE questionnaire) was infrequent and comparable between groups. Intraoperative anesthetic technique did not influence the incidence of POD. Patients with POD were slightly older than those without POD, and more had a history of psychiatric illness (see Table 1 for definitions). Finally, preoperative ADL and IADL scores were significantly lower in those who developed POD (Table 1).
There were no differences in years of education, or age-adjusted verbal intelligence or baseline MMSE scores between those who did and did not develop POD (Table 2). However, age-adjusted verbal learning and memory scores as assessed by the Learning Efficiency, Percent Retention, and Delayed Recall portions of the AVLT were lower in patients who subsequently developed POD. Likewise, age-adjusted verbal fluency (as measured by the COWAT) and executive function (as assessed by the SCWT) scores were lower at baseline in those who developed POD.
In multivariate analysis using a stepwise variable selection algorithm, independent predictors of POD included lower functional status (measured by the ADL scale), lower cognitive function (verbal memory as indicated by AVLT Percent Retention), older age, and history of psychiatric illness (Table 3). Given the possibility that a model identified via a stepwise algorithm will overfit the observed data, a bootstrap resampling approach was also used to identify important patient or procedural characteristics predictive of POD. The results of this are summarized in Table 4. A functional status variable (ADL or IADL) was included in 89% of the final bootstrap models, with ADL included in the majority (67%). One or more of the cognitive function scales was included in 97% of the final bootstrap models, with AVLT Percent Retention included most frequently (58%). Age and history of psychiatric illness were less robust and included in 37% and 51% of the final models, respectively.
The duration of anesthesia was similar for patients who developed POD compared with those who did not develop POD (mean ± SD: 3.5 ± 1.2 vs 3.4 ± 1.0 hours for POD versus no POD, respectively; P = 0.328). There were no perioperative deaths. The hospital length of stay was increased for patients who developed POD versus not (mean ± SD: 6.5 ± 2.2 vs 5.2 ± 1.2 days, median 6 vs 5 days, rank sum test P < 0.001). Of the 42 patients who developed POD, 12 experienced ≥1 (14 total) postoperative complications (6 atrial fibrillation, 4 infection, 3 pulmonary embolism, and 1 myocardial infarction). Of the 376 patients who did not develop POD, 26 experienced ≥1 (28 total) postoperative complications (13 infection, 10 atrial fibrillation, 3 myocardial infarction, 1 deep venous thrombosis, and 1 pulmonary embolism). The percentage of patients experiencing ≥1 postoperative complications was higher among patients who developed POD versus patients who did not (28.6% vs 6.9%, P < 0.001).
Of the 42 patients who developed POD (cases), 37 were available for follow-up neurocognitive and functional testing at 3 months, as were 33 controls. For all instances in which follow-up testing was not completed, the reason was participant refusal. There were no patient deaths or instances of debility sufficient to prevent testing. There were 31 matched pairs who underwent follow-up testing. Test results are summarized in Tables 5 and 6. Raw data are presented as supplemental digital content in online Figures 1 to 5 (see Supplemental Digital Content 1 to 5: Fig. 1, http://links.lww.com/AA/A241; Fig. 2, http://links.lww.com/AA/A242; Fig. 3, http://links.lww.com/AA/A243; Fig. 4, http://links.lww.com/AA/A244; and Fig. 5, http://links.lww.com/AA/A245; see Appendix for online figure legends). Consistent with the analysis above, preoperative values for most tests were lower in patients who developed POD. For cases and controls, the differences between post- and preoperative values in age-adjusted scores on the SCWT, COWAT, and the Learning Efficiency, Percent Retention, and Delayed Recall portions of the AVLT varied widely in individual patients, as did changes in functional status. Learning Efficiency significantly increased both in cases and controls, Percent Retention and Delayed Recall significantly increased in cases, and the SCWT scores significantly increased in controls. There was no significant difference in the changes in scores from baseline to 3 months postoperatively on the neurocognitive or functional status tests between cases and controls. Similarly, there were no differences in the changes in scores using case-control pair comparisons. Thus, in this small group, POD was not associated with an overall mean decline in neurocognitive or functional status 3 months after surgery. 
In addition, if POCD is defined for each patient as a decline of >1 SD on at least 2 neurocognitive tests,31 there were 3 of 33 (9.1%) control patients and 2 of 37 (5.4%) POD patients with POCD 3 months postoperatively (Fisher exact test P = 0.66).
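The Fisher exact comparison above can be checked with a short sketch that sums hypergeometric probabilities directly. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used (adding up all tables no more probable than the observed one) is a common convention that is assumed here, because the text does not state which variant was applied.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table at least as extreme (no more probable) than the observed one;
    # the small relative tolerance absorbs floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Controls: 3 with POCD, 30 without. POD patients: 2 with POCD, 35 without.
p_value = fisher_exact_two_sided(3, 30, 2, 35)
print(f"P = {p_value:.2f}")
```

Under this convention the counts reported above give P ≈ 0.66, matching the stated value.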
POD is among the most common postoperative complications in the elderly and is associated with increased mortality, morbidity, hospital costs, and discharge to long-term care facilities. Thus, understanding its predictors and sequelae has considerable significance.
In a prediction model of delirium in hospitalized medical patients, independent predictors included older age, severe illness, dehydration, and cognitive and visual impairment.33 This model was validated in patients undergoing elective orthopedic surgery.7 Age, type of procedure, physical status, glucose and electrolyte abnormalities, and diminished functional and cognitive status were independent predictors in a model for POD in noncardiac surgical patients.6 However, these studies used global measures of cognitive status such as the MMSE. Although sufficient for dementia screening, those instruments are not designed to detect subtle cognitive changes.
We hypothesized that lower scores on sensitive neurocognitive tests predict POD, a hypothesis supported by our results. By design, no patients included in this study had clinically apparent cognitive deficits. The neurocognitive test battery used in this study was sensitive enough to detect subtle differences in a variety of cognitive domains. Although clinically normal, patients who developed POD had decreased performance on the tests compared with those who did not, despite similar educational levels and verbal intelligence and MMSE scores. We can only speculate why lower scores on the neurocognitive tests predicted POD. Whether these patients are more likely to develop clinically significant cognitive deterioration is not known. The lower scores in the POD patients may be early signs of neurodegenerative diseases. Alternatively, they simply may be indicative of diminished baseline cognitive reserve.
Two recent studies found that tests of executive function and depressive symptoms were independent risk factors for POD in noncardiac surgical patients.9,10 Although we did not find depressive symptoms to be predictive of POD in this study, participants had low levels of these at baseline. However, history of psychiatric illness was an independent risk factor for POD, and depression was the most common psychiatric diagnosis. Together, these findings suggest that depression may indeed have predictive value for POD.
In our study, we used the SCWT and COWAT to measure executive function and the AVLT to assess verbal learning and memory. Although lower scores on the SCWT, COWAT, and all 3 portions of the AVLT were univariate predictors of POD, only the AVLT Percent Retention was an independent predictor. The AVLT Percent Retention is a measure of short-term memory. Thus, we found that memory, not executive function, is an independent risk factor for POD. Although preoperative subjective memory complaints are predictive of POD in cardiac surgical patients,34 Rudolph et al.35 reported that lower scores on formal tests of memory are not predictive in that population. We are unaware of other studies demonstrating that memory is a predictor of POD. Greene et al.9 and Smith et al.10 found that executive function predicts POD, but neither reported scores for tests of memory. In both studies, patients were considerably younger than those in our study. In the Rudolph et al. study, patients were similar in age to our cohort, but they had cardiac surgery instead of elective lower extremity joint replacement surgery. Whether age has a role in determining which neurocognitive domains are most predictive of POD is not clear. More research is needed to determine which domains of cognitive function are most important in predicting POD and whether age or other factors alter which domains are most important.
Although the neurocognitive findings are interesting, routine preoperative formal neurocognitive testing is time consuming and expensive and requires trained personnel. A more clinically useful predictor of POD would be helpful. In this study, the preoperative ADL score was an independent risk factor for POD. ADL scores are quickly obtained via questionnaire or allied health personnel interview, and add no cost to the preoperative examination. Thus, functional status assessment during the preoperative evaluation is feasible and could contribute to preoperative risk stratification.
Although not the goal of the project, the independent risk factors for POD found in this study (age, ADL and AVLT Percent Retention scores, and history of psychiatric illness) could form the basis of a clinical prediction rule to assess the likelihood of POD. Further study would be necessary to validate the strength of this model and its applicability to other surgical populations.
The other key finding in this study is that POD does not predict diminished cognitive or functional status 3 months postoperatively. In medical patients, delirium predicts long-term cognitive and functional decline.11–15 Similarly, POD predicts persistent cognitive impairment after hip fracture surgery.36 POD also predicts functional decline in hip fracture surgery patients. For example, POD is an independent predictor of declines in ADL scores and ambulation, and of an increased likelihood of death or nursing home placement.37 However, falls leading to hip fracture are often the result of concurrent medical illness and surgery is performed urgently, both of which may affect the pathogenesis of POD.
Cognitive decline temporally related to surgery is known as POCD. POCD is a research construct with definitions that vary by type of cognitive testing, degree of decline required for diagnosis, and time course.38 These variations in methodology make comparisons across studies of POCD difficult.
The relationship between POCD and POD is unclear. The 2 entities may be a continuum of postoperative brain dysfunction wherein POD leads to POCD. Alternatively, they may be distinct and unrelated. Two studies using a z score definition of POCD have examined the relationship between POD and POCD.39,40 In both, POD was a predictor of POCD 1 week postoperatively, but not at 3 months.
In this study, changes in neurocognitive test scores 3 months postoperatively were not significantly different between patients experiencing POD and controls. This, and the studies noted above, suggest that after elective surgery, POD may not lead to POCD at 3 months postoperatively. In fact, mean neurocognitive test scores were improved 3 months postoperatively in both POD cases and controls. We cannot exclude that practice effects may have had some role in this. However, we attempted to minimize these effects by not retesting until 3 months postoperatively and, when available, using alternate test forms at follow-up. The improvement in test scores at 3 months may have also been influenced by the timing of preoperative testing. This was usually accomplished the day before surgery. It is possible that presurgical anxiety may have resulted in poorer performance at that time.41
In normal subjects, higher levels of educational attainment are protective against cognitive decline and dementia.42 However, higher levels of educational attainment may39 or may not43 be protective against POCD 3 months postoperatively. The participants in our study were relatively highly educated compared with those in other studies. We cannot exclude the possibility that this had some role in our inability to find a difference in neurocognitive test scores between baseline and 3 months postoperatively in POD patients versus controls.
POD also did not affect functional status 3 months postoperatively. This, combined with the neurocognitive data, suggests that, unlike delirium in medical patients or in those undergoing urgent procedures such as hip fracture repair, POD after elective surgery, once resolved, is not associated with functional decline 3 months postoperatively. Taken together, the follow-up neurocognitive and functional data suggest that, in elderly patients with good baseline functional and global cognitive status, anesthesia and elective surgery are not associated with cognitive or functional decline at 3 months.
This study has several limitations. First, it was conducted at a single institution on a homogeneous group of patients. It is unclear whether the results are generalizable to other patient populations, procedures, and institutions. However, the study design attempted to maximize the reliability of the neurocognitive and functional test results and limit confounders. One potential confounder is postoperative pain. We were not able to obtain pre- or postoperative pain scores. Pain can affect neurocognitive test scores44,45 and postoperative pain is associated with POD.46 Again, the study design attempted to account for this by including a homogeneous group of patients having similar operations. In addition, policy at our institution during the period of the study was for aggressive pain management with a goal numerical pain score of ≤3 of 10. We believe that these factors mitigated the effect of pain on the incidence of POD. Second, we developed the multivariate model using a stepwise algorithm. With this approach, there is the possibility of overfitting the observed data and suppression of some covariates. To account for this, we used bootstrap resampling to identify important patient or procedural characteristics predictive of POD. Third, patients' baseline neurocognitive and functional status were relatively high. POD may predict further cognitive decline only in patients with lower baseline cognitive and functional status. Similarly, although POD did not predict cognitive and functional decline 3 months postoperatively, only 37 patients with POD were available for follow-up testing and the SDs and CIs on the tests were large relative to the mean differences (Table 6). Thus, we cannot exclude a small effect on decline or that decline occurs later than 3 months. Finally, our study does not have adequate statistical power to compare groups using a dichotomous end point (POCD/no POCD). 
However, based on the distributions of the change scores observed in our study, we are not convinced that a dichotomous end point is the most appropriate way to evaluate changes in cognitive function. Thus, we treated these data in a continuous manner. This approach is consistent with the literature concerning cognitive decline in nonsurgical patients.47
In summary, diminished functional status and lower scores on sensitive neurocognitive tests predict POD in elderly patients undergoing elective total joint arthroplasty. Simple preoperative functional testing may help identify those patients at risk for POD. In this study, POD did not predict cognitive or functional decline at 3 months, suggesting that in this population, POD may not lead to adverse cognitive or functional sequelae and that POD and POCD may be clinically distinct entities. Further study is necessary to more clearly define this relationship.
1. Anonymous. Profile of General Demographic Characteristics: 2000. US Census Bureau, 2000
2. Dyer CB, Ashton CM, Teasdale TA. Postoperative delirium: a review of 80 primary data-collection studies. Arch Intern Med 1995;155:461–5
3. Bitsch M, Foss N, Kristensen B, Kehlet H. Pathogenesis of and management strategies for postoperative delirium after hip fracture: a review. Acta Orthop Scand 2004;75:378–89
4. Franco K, Litaker D, Locala J, Bronson D. The cost of delirium in the surgical patient. Psychosomatics 2001;42:68–73
5. Dasgupta M, Dumbrell AC. Preoperative risk assessment for delirium after noncardiac surgery: a systematic review. J Am Geriatr Soc 2006;54:1578–89
6. Marcantonio ER, Goldman L, Mangione CM, Ludwig LE, Muraca B, Haslauer CM, Donaldson MC, Whittemore AD, Sugarbaker DJ, Poss R, Haas S, Cook EF, Orav EJ, Lee TH. A clinical prediction rule for delirium after elective noncardiac surgery. JAMA 1994;271:134–9
7. Kalisvaart KJ, Vreeswijk R, de Jonghe JFM, van der Ploeg T, van Gool WA, Eikelenboom P. Risk factors and prediction of postoperative delirium in elderly hip-surgery patients: implementation and validation of a medical risk factor model. J Am Geriatr Soc 2006;54:817–22
8. Culley DJ, Monk TG, Crosby GJ. Postoperative central nervous system dysfunction. In: Silverstein JH, Rooke GA, Reves JG, McLeskey CH eds. Geriatric Anesthesiology. 2nd ed. New York: Springer, 2008:123–36
9. Greene NH, Attix DK, Weldon BC, Smith PJ, McDonagh DL, Monk TG. Measures of executive function and depression identify patients at risk for postoperative delirium. Anesthesiology 2009;110:788–95
10. Smith PJ, Attix DK, Weldon BC, Greene NH, Monk TG. Executive function and depression as independent risk factors for postoperative delirium. Anesthesiology 2009;110:781–7
11. Murray AM, Levkoff SE, Wetle TT, Beckett L, Cleary PD, Schor JD, Lipsitz LA, Rowe JW, Evans DA. Acute delirium and functional decline in the hospitalized elderly patient. J Gerontol 1993;48:M181–6
12. Inouye SK, Rushing JT, Foreman MD, Palmer RM, Pompei P. Does delirium contribute to poor hospital outcomes? A three-site epidemiologic study. J Gen Intern Med 1998;13:234–42
13. O'Keeffe S, Lavan J. The prognostic significance of delirium in older hospital patients. J Am Geriatr Soc 1997;45:174–8
14. Cole MG. Delirium in elderly patients. Am J Geriatr Psychiatry 2004;12:7–21
15. McCusker J, Cole M, Dendukuri N, Belzile E, Primeau F. Delirium in older medical inpatients and subsequent cognitive and functional status: a prospective study. CMAJ 2001;165:575–83
16. Folstein MF, Folstein SE, McHugh PR. “Mini-mental state”: a practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975;12:189–98
17. Grober E, Sliwinski M. Development and validation of a model for estimating premorbid verbal intelligence in the elderly. J Clin Exp Neuropsychol 1991;13:933–49
18. Rey A. Psychological examination in cases of traumatic encephalopathy. Arch Psychol 1941;28:286–340
19. Harris ME, Ivnik RJ, Smith GE. Mayo's Older Americans Normative Studies: expanded AVLT Recognition Trial norms for ages 57 to 98. J Clin Exp Neuropsychol 2002;24:214–20
20. Steinberg BA, Bieliauskas LA, Smith GE, Ivnik RJ, Malec JF. Mayo's Older Americans Normative Studies: Age- and IQ-Adjusted Norms for the Auditory Verbal Learning Test and the Visual Spatial Learning Test. Clin Neuropsychol 2005;19:464–523
21. Benton AL. Development of a multilingual aphasia battery: progress and problems. J Neurol Sci 1969;9:39–48
22. Steinberg BA, Bieliauskas LA, Smith GE, Ivnik RJ. Mayo's Older Americans Normative Studies: Age- and IQ-Adjusted Norms for the Trail-Making Test, the Stroop Test, and MAE Controlled Oral Word Association Test. Clin Neuropsychol 2005;19:329–77
23. Stroop JR. Studies of interference in serial verbal reactions. J Exp Psychol 1935;18:643–62
24. Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Measure 1977;1:385–401
25. Ewing JA. Detecting alcoholism: the CAGE questionnaire. JAMA 1984;252:1905–7
26. Goldman L, Hashimoto B, Cook EF, Loscalzo A. Comparative reproducibility and validity of systems for assessing cardiovascular functional class: advantages of a new specific activity scale. Circulation 1981;64:1227–34
27. Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW. Studies of illness in the aged: the index of ADL—a standardized measure of biological and psychosocial function. JAMA 1963;185:914–9
28. Lawton MP, Brody EM. Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist 1969;9:179–86
29. Inouye SK, van Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method—a new method for detection of delirium. Ann Intern Med 1990;113:941–8
30. Sauerbrei W, Schumacher M. A bootstrap resampling procedure for model building: application to the Cox regression model. Stat Med 1992;11:2093–109
31. Newman MF, Kirchner JL, Phillips-Bute B, Gaver V, Grocott H, Jones RH, Mark DB, Reves JG, Blumenthal JA. Longitudinal assessment of neurocognitive function after coronary-artery bypass surgery. N Engl J Med 2001;344:395–402
32. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Erlbaum Associates, 1988
33. Inouye SK, Viscoli CM, Horwitz RI, Hurst LD, Tinetti ME. A predictive model for delirium in hospitalized elderly medical patients based on admission characteristics. Ann Intern Med 1993;119:474–81
34. Veliz-Reissmuller G, Aguero Torres H, van der Linden J, Lindblom D, Eriksdotter Jonhagen M. Pre-operative mild cognitive dysfunction predicts risk for post-operative delirium after elective cardiac surgery. Aging Clin Exp Res 2007;19:172–7
35. Rudolph JL, Jones RN, Grande LJ, Milberg WP, King EG, Lipsitz LA, Levkoff SE, Marcantonio ER. Impaired executive function is associated with delirium after coronary artery bypass graft surgery. J Am Geriatr Soc 2006;54:937–41
36. Gruber-Baldini AL, Zimmerman S, Morrison RS, Grattan LM, Hebel JR, Dolan MM, Hawkes W, Magaziner J. Cognitive impairment in hip fracture patients: timing of detection and longitudinal follow-up. J Am Geriatr Soc 2003;51:1227–36
37. Marcantonio ER, Flacker JM, Michaels M, Resnick NM. Delirium is independently associated with poor functional recovery after hip fracture. J Am Geriatr Soc 2000;48:618–24
38. Silverstein JH, Timberger M, Reich DL, Uysal S. Central nervous system dysfunction after noncardiac surgery and anesthesia in the elderly. Anesthesiology 2007;106:622–8
39. Monk TG, Weldon BC, Garvan CW, Dede DE, van der Aa MT, Heilman KM, Gravenstein JS. Predictors of cognitive dysfunction after major noncardiac surgery. Anesthesiology 2008;108:18–30
40. Rudolph JL, Marcantonio ER, Culley DJ, Silverstein JH, Rasmussen LS, Crosby GJ, Inouye SK. Delirium is associated with early postoperative cognitive dysfunction. Anaesthesia 2008;63:941–7
41. Bierman EJ, Comijs HC, Jonker C, Beekman AT. Effects of anxiety versus depression on cognition in later life. Am J Geriatr Psychiatry 2005;13:686–93
42. Whalley LJ, Deary IJ, Appleton CL, Starr JM. Cognitive reserve and the neurobiology of cognitive aging. Ageing Res Rev 2004;3:369–82
43. Moller JT, Cluitmans P, Rasmussen LS, Houx P, Rasmussen H, Canet J, Rabbitt P, Jolles J, Larsen K, Hanning CD, Langeron O, Johnson T, Lauven PM, Kristensen PA, Biedler A, van Beem H, Fraidakis O, Silverstein JH, Beneken JE, Gravenstein JS. Long-term postoperative cognitive dysfunction in the elderly: ISPOCD1 study. ISPOCD investigators. International Study of Post-Operative Cognitive Dysfunction. Lancet 1998;351:857–61
44. Hart RP, Martelli MF, Zasler ND. Chronic pain and neuropsychological functioning. Neuropsychol Rev 2000;10:131–49
45. Heyer EJ, Sharma R, Winfree CJ, Mocco J, McMahon DJ, McCormick PA, Quest DO, McMurtry JG III, Riedel CJ, Lazar RM, Stern Y, Connolly ES Jr. Severe pain confounds neuropsychological test performance. J Clin Exp Neuropsychol 2000;22:633–9
46. Lynch EP, Lazor MA, Gellis JE, Orav J, Goldman L, Marcantonio ER. The impact of postoperative pain on the development of postoperative delirium. Anesth Analg 1998;86:781–5
47. Hachinski V. Shifts in thinking about dementia. JAMA 2008;300:2172–3
APPENDIX: SUPPLEMENTAL ONLINE FIGURE LEGENDS
Figures 1–5 show the distributions of the changes between preoperative and 3-month postoperative neurocognitive test scores for patients with postoperative delirium (POD) and their matched controls. The dashed line represents a 1 SD decline from baseline. In all instances, the distributions are approximately normal and there is no difference in their means. Although for each scale there are some patients who decline by >1 SD, those patients do not appear to be outside the distribution. Given this, and literature suggesting that cognitive decline is best treated as a continuous variable,41 we believe that an analysis comparing the mean change between groups is more appropriate than one that defines postoperative cognitive decline as a dichotomous end point. AVLTLE = Learning Efficiency portion of the Auditory Verbal Learning Test; AVLT%R = Percent Retention portion of the Auditory Verbal Learning Test; AVLTDR = Delayed Recall portion of the Auditory Verbal Learning Test; COWAT = Controlled Oral Word Association Test; SCWT = Stroop Color-Word Test.
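The contrast between the two analytic approaches described in the legend (mean change treated as a continuous variable versus a dichotomous ">1 SD decline" end point) can be illustrated with a minimal sketch. This is not the study's analysis code. The function name `summarize_change`, the use of the sample SD of the baseline scores as the 1 SD reference, and all scores shown are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def summarize_change(pre, post, threshold_sd=1.0):
    """Return (mean change, number of patients declining by more than threshold_sd SDs).

    Assumption: "1 SD" is taken as the sample SD of the baseline (pre) scores.
    """
    changes = [after - before for before, after in zip(pre, post)]
    baseline_sd = stdev(pre)
    n_decliners = sum(1 for c in changes if c < -threshold_sd * baseline_sd)
    return mean(changes), n_decliners


# Hypothetical scores for illustration only; these are NOT data from the study.
pod_pre = [10, 12, 14, 16, 18]
pod_post = [11, 12, 10, 17, 18]
ctl_pre = [11, 13, 15, 17, 19]
ctl_post = [12, 13, 11, 18, 19]

pod_mean, pod_decliners = summarize_change(pod_pre, pod_post)
ctl_mean, ctl_decliners = summarize_change(ctl_pre, ctl_post)

# Continuous approach: compare the mean change between groups.
mean_difference = pod_mean - ctl_mean

# Dichotomous approach: count patients beyond the 1 SD cut point in each group.
print(f"Mean change: POD {pod_mean:.2f}, control {ctl_mean:.2f}, "
      f"difference {mean_difference:.2f}")
print(f"Declined by >1 SD: POD {pod_decliners}, control {ctl_decliners}")
```

In this toy example each group contains one patient beyond the cut point, yet the mean changes are identical, which mirrors the legend's observation that isolated decliners can sit within an otherwise similar distribution.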