Instrumental variable analysis can potentially circumvent confounding by indication that exists because of unknown or poorly recorded factors in observational data of anticipated therapy effects.1 Physician’s prescribing preference is a promising instrument, because differences among physicians in therapy preferences are ubiquitous.
We used anesthesiologist’s preference in an instrumental variable analysis to investigate whether preoperative high-dose corticosteroids are beneficial in cardiac surgery patients because they suppress the procedure-induced inflammatory response.2,3 We compared instrumental variable analyses to standard regression techniques and also to results from the recent Dexamethasone for Cardiac Surgery randomized trial.4
We used clinical data collected in the context of routine clinical care. The Leiden University Medical Centre review board waived the need for formal ethical approval and written informed consent.
We assessed data on all adult patients who underwent elective cardiac surgery in the Leiden University Medical Centre in 2005. Patients had undergone a range of interventions, including coronary artery bypass grafting, valve repair/replacement, and heart failure surgery. Patients treated with corticosteroids before admission for cardiac surgery were excluded, leaving 476 patients, of whom 115 received prophylactic corticosteroids. All received regular care according to the fast track protocol.5
Study End Points
Data on demographic features, type of surgical intervention, and EuroSCORE were extracted from electronic and paper patient records. The EuroSCORE is a validated prognostic score of in-hospital mortality, based on patient-related, cardiac-related, and operation-related factors.6,7 Primary endpoints were 30-day mortality, ventilation time, and durations of intensive care unit (ICU) and hospital stays. Secondary outcomes were atrial fibrillation, infections, heart failure, delirium, norepinephrine use, glucose, and leukocyte count.
We first used linear regression to estimate the effect of corticosteroids on the outcomes. This included crude analyses, multivariable analyses (adjusting for age, sex, diabetes, EuroSCORE, and type of surgery), and propensity score-adjusted analyses (including the variables in the multivariable model plus the surgeon). Next, we performed 2-stage least squares instrumental variable analysis, with robust standard errors for dichotomous outcomes. The instrument was the proportion of all earlier patients of the same anesthesiologist who received corticosteroids. We selected this instrument based on the first-stage F-statistic and partial r² and on the range of predicted treatment probabilities. IV analyses were based on 461 patients (excluding 3 patients with unknown anesthesiologist, the only 2 patients of 1 anesthesiologist, and all first patients of the 10 anesthesiologists). Instrumental variable assumptions for our study were as follows: (1) anesthesiologist’s preference affects the probability that a patient receives corticosteroids; (2) anesthesiologist’s preference for corticosteroids does not affect the outcome other than through the decision whether to administer corticosteroids; and (3) anesthesiologist’s preference for corticosteroids is not related to characteristics of the anesthesiologist’s patient population.8,9 The fourth assumption, required to obtain a point estimate,10,11 was the monotonicity assumption: no anesthesiologist would give corticosteroids to a certain patient unless all anesthesiologists with the same or a stronger preference would also give corticosteroids to that patient. The causal effect estimated is a local average treatment effect,11 a weighted average of the treatment effects in patients who would receive corticosteroids from anesthesiologists with a certain preference level, but not from anesthesiologists with a lower preference.10 Statistical analyses were performed with Stata 12 and the extension ivreg2.12
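The instrument and estimator described above can be sketched in Python. This is an illustration only, not the authors' code (the analysis itself was run in Stata with ivreg2). The function names prior_preference and two_stage_least_squares are hypothetical, the records are assumed to be sorted chronologically, and the sketch omits covariates and robust standard errors.

```python
import numpy as np


def prior_preference(physician_ids, treated):
    """Proportion of a physician's earlier patients who were treated.

    Inputs must be in chronological order. The first patient of each
    physician has no history and gets NaN, so it drops out of the analysis.
    """
    n_seen = {}      # physician -> number of earlier patients
    n_treated = {}   # physician -> number of earlier treated patients
    z = np.full(len(treated), np.nan)
    for i, (doc, t) in enumerate(zip(physician_ids, treated)):
        seen = n_seen.get(doc, 0)
        if seen > 0:
            z[i] = n_treated[doc] / seen
        n_seen[doc] = seen + 1
        n_treated[doc] = n_treated.get(doc, 0) + t
    return z


def two_stage_least_squares(z, x, y):
    """2SLS slope of outcome y on treatment x, with one instrument z."""
    keep = ~np.isnan(z)
    z, x, y = z[keep], x[keep], y[keep]
    # Stage 1: regress treatment on the instrument (with intercept).
    Z = np.column_stack([np.ones_like(z), z])
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted treatment.
    X_hat = np.column_stack([np.ones_like(x_hat), x_hat])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta[1]
```

With a single instrument and no covariates, the second-stage slope equals the ratio cov(Z, Y) / cov(Z, X). For a dichotomous outcome it is read as a risk difference.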
For additional information regarding study population, data-extraction, study endpoints, conventional analyses, instrumental variable analyses, and sensitivity analyses, see the eAppendix (http://links.lww.com/EDE/A813).
Table 1 displays patient characteristics and outcomes according to received treatment. The EuroSCORE was higher in patients who received corticosteroids, suggesting confounding.
For the selected instrument the first-stage F-statistic was 126 and the partial r² was 0.22 (see eMethods and eTable 1, http://links.lww.com/EDE/A813). Table 2 shows patient characteristics across physician’s preference quintiles. There was no clear pattern across physician’s preference quintiles in EuroSCORE (see eFigure 2 [http://links.lww.com/EDE/A813] for EuroSCORE per anesthesiologist) or other patient characteristics, suggesting physicians’ preference for corticosteroids was not related to differences in patients’ prognosis. Table 2 shows a decreasing pattern across physician’s preference categories for duration of ventilation and infections.
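The reported F-statistic and partial r² are two views of the same instrument strength, and their link can be sketched as follows. The sketch assumes a single excluded instrument, a conventional (nonrobust) F-test, and no covariates in the first stage. The function name first_stage_f is hypothetical, and the way the study's F-statistic was actually computed is not stated here.

```python
def first_stage_f(partial_r2, n_obs, n_covariates=0):
    """First-stage F for one excluded instrument, from its partial r-squared.

    Residual degrees of freedom subtract the intercept, the instrument,
    and any additional first-stage covariates.
    """
    resid_df = n_obs - n_covariates - 2
    return partial_r2 / (1.0 - partial_r2) * resid_df


if __name__ == "__main__":
    # Rounded values from the text: partial r-squared 0.22, 461 patients.
    print(round(first_stage_f(0.22, 461), 1))
```

Under these assumptions the rounded inputs give an F of roughly 129, in line with the reported 126. A partial r² of 0.215, which also rounds to 0.22, gives about 126.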
Results of conventional and instrumental variable analyses are displayed in the Figure (dichotomous outcomes only) and eTable 2 (http://links.lww.com/EDE/A813). In general, unadjusted conventional analyses showed poorer outcomes in patients treated with corticosteroids (except for atrial fibrillation, infections, and norepinephrine dose). Multivariable and propensity-score-adjusted analyses generally showed a null effect. Instrumental variable results indicated a decreased risk of adverse outcomes (except atrial fibrillation) after corticosteroid administration. However, confidence intervals of IV estimates were much wider than those of conventional estimates. For example, crude analysis indicated the risk of a ventilation time >11 hours was 3.1% higher (95% confidence interval = −7.8% to 14.1%), propensity-score-adjusted analysis indicated it was 2.0% lower (−12.8% to 8.8%), and instrumental variable analysis indicated it was 28.1% lower (−52.4% to −3.9%) for patients who received corticosteroids. Instrumental variable estimates of differences in glucose and leukocyte count were slightly higher than estimates from the other analyses (eTable 2, http://links.lww.com/EDE/A813).
Because of our small sample size, we could compare our results only to secondary outcomes of the Dexamethasone for Cardiac Surgery randomized clinical trial.4 In general, effects in our instrumental variable analyses were similar in direction to the randomized clinical trial results (see Results section of eAppendix, http://links.lww.com/EDE/A813), but with considerably larger effect sizes. For example, whereas our instrumental variable analyses estimated the risk of a ventilation time >24 hours to be 16.3% lower (−33.2% to 0.5%) for patients who received corticosteroids, the randomized clinical trial estimated this difference to be −1.5% (−2.7% to −0.3%).4
Neither adjusting the instrumental variable analysis for patient characteristics, nor using an instrumental variable based on the last 5 patients materially changed the results (eTable 3, http://links.lww.com/EDE/A813). Sensitivity analyses estimating relative risks yielded similar effect sizes (eTable 4, http://links.lww.com/EDE/A813).
We investigated whether physician’s preference-based instrumental variable analysis was valid and useful in a moderate-sized study for the question whether preoperative corticosteroids are beneficial in cardiac surgery. In contrast to crude and propensity score-adjusted analyses, instrumental variable analysis using anesthesiologists’ preferences as an instrument showed beneficial effects, similar in direction to the Dexamethasone for Cardiac Surgery randomized clinical trial results,4 and compatible with pathophysiologic insights concerning prevention of operation-induced systemic inflammation.13–15 However, compared with the trial results, the instrumental variable estimates were extremely large and confidence intervals were so wide as to preclude useful conclusions.
A reason for the difference in magnitude between our instrumental variable estimates and the randomized clinical trial results could be effect modification because of baseline prognostic differences between the study populations. Our patients seemed to be more high risk, as indicated by longer ventilation and ICU stay times and higher incidences of most outcomes.
There are also design-inherent explanations for the large size of the instrumental variable effect estimates. First, our smaller number of patients, compared with the randomized clinical trial, gives rise to less statistical precision, which is further aggravated in the IV analysis because of its 2-stage approach.16 This lack of precision, reflected in the large confidence intervals, could lead to the instrumental variable estimates being more extreme by chance.
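The loss of precision can be gauged with a rough calculation. It assumes a single instrument and the large-sample approximation that the IV standard error is about the conventional standard error divided by the square root of the partial r², ignoring differences in residual variance between the two models. The function names are hypothetical. The inputs are the partial r² of 0.22 and the confidence intervals reported in the Results for ventilation time >11 hours.

```python
import math


def iv_se_inflation(partial_r2):
    """Approximate ratio of IV to conventional standard errors."""
    return 1.0 / math.sqrt(partial_r2)


def ci_width(lower, upper):
    """Width of a confidence interval, in percentage points."""
    return upper - lower


# Expected inflation for an instrument with partial r-squared 0.22.
expected = iv_se_inflation(0.22)
# Observed ratio: IV interval width over crude interval width.
observed = ci_width(-52.4, -3.9) / ci_width(-7.8, 14.1)

if __name__ == "__main__":
    print(f"expected {expected:.2f}, observed {observed:.2f}")
```

Both ratios come out slightly above 2, so under this approximation the roughly doubled interval width is about what an instrument of this strength would be expected to produce.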
Second, main instrumental variable assumptions may be violated. We would not expect differences in patient characteristics depending on anesthesiologist’s preference for corticosteroids (independence assumption), as patients are assigned to the anesthesiologist on duty on the day of surgery. The lack of a consistent pattern in measured patient characteristics across quintiles of the instrumental variable is therefore reassuring. The assumption that preference for corticosteroids does not affect outcomes other than through administration of corticosteroids is more difficult to assess but seems plausible, as anesthesiologists took care of the patients only during surgery and were not involved in subsequent ICU care.
Third, violation of the monotonicity assumption could contribute to the extreme estimates. For example, if patients who receive corticosteroids from an anesthesiologist with a weak preference would not receive them from an anesthesiologist with a strong preference and if corticosteroids are of relatively little benefit to these patients, then the estimate of the effect of corticosteroids would be too favorable.
Fourth, estimands of the conventional and the instrumental variable analyses are different: the conventional analyses estimate average treatment effects in the population, whereas the instrumental variable analyses estimate local average treatment effects (as explained in the Methods section).
Fifth, finite sample bias might be a reason for the large instrumental variable effect estimates. However, the first-stage F-statistic of 126 should be sufficient for finite sample bias to be negligible.1 We further explored this using simulations under conditions similar to our study (100–500 patients; mean partial r² of 0.17; unmeasured confounding and a binary outcome occurring in 50% of patients; see eAppendix, http://links.lww.com/EDE/A813). Mean instrumental variable estimates were close to the “true” treatment effect of 0.10, even when the sample size was reduced to 100 patients, indicating no substantial finite sample bias with an instrument of this strength.
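A simulation of this kind can be sketched as below. This is not the eAppendix code: the data-generating model and all of its coefficients are assumptions chosen only to mimic the stated conditions (a partial r² of about 0.16, an unmeasured binary confounder, an outcome in about 50% of patients, and a true risk difference of 0.10). The function names are hypothetical.

```python
import numpy as np

TRUE_EFFECT = 0.10  # true risk difference, as in the text


def simulate_once(n, rng):
    """One simulated study: returns (IV estimate, crude estimate)."""
    u = rng.integers(0, 2, size=n)      # unmeasured binary confounder
    z = rng.uniform(0.0, 1.0, size=n)   # physician preference, independent of u
    # Treatment depends on preference and on the confounder.
    x = rng.binomial(1, 0.05 + 0.7 * z + 0.2 * u)
    # Outcome depends on treatment and confounder, not directly on z.
    y = rng.binomial(1, 0.35 + TRUE_EFFECT * x + 0.2 * u)
    iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]   # single-instrument 2SLS
    crude = y[x == 1].mean() - y[x == 0].mean()
    return iv, crude


def simulate(n, reps=2000, seed=0):
    """Summarize IV and crude estimates over repeated simulated studies."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(n, rng) for _ in range(reps)])
    return {
        "iv_mean": draws[:, 0].mean(),
        "iv_median": np.median(draws[:, 0]),
        "iv_sd": draws[:, 0].std(ddof=1),
        "crude_mean": draws[:, 1].mean(),
    }


if __name__ == "__main__":
    for n in (100, 500):
        print(n, simulate(n))
```

In this assumed model the crude estimate is pulled away from 0.10 by the confounder. The IV estimate should center near 0.10 at both sample sizes, with a much larger spread at 100 patients, which separates the question of bias from the question of precision.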
In conclusion, despite availability of a strong instrument, plausibly fulfilling main instrumental variable assumptions, physician’s preference-based instrumental variable analysis in a moderate-sized study population showed results that differed greatly in magnitude from results of a major randomized clinical trial on the same intervention. We have explored possible reasons and conclude that this phenomenon is most likely because of the reduced statistical precision of the instrumental variable analysis in datasets of moderate size.
We thank Stefan Dieleman of the DECS Study Group for providing additional information on their data, which enabled us to compare our data to data from a randomized clinical trial on the same question. We also thank Michel Versteegh for supplying the EuroSCORE data for the study population.
1. Martens EP, Pestman WR, de Boer A, Belitser SV, Klungel OH. Instrumental variables: application and limitations. Epidemiology. 2006;17:260–267
2. Ng CS, Wan S, Arifi AA, Yim AP. Inflammatory response to pulmonary ischemia-reperfusion injury. Surg Today. 2006;36:205–214
3. Clark SC. Lung injury after cardiopulmonary bypass. Perfusion. 2006;21:225–228
4. Dieleman JM, Nierich AP, Rosseel PM, et al; Dexamethasone for Cardiac Surgery (DECS) Study Group. Intraoperative high-dose dexamethasone for cardiac surgery: a randomized controlled trial. JAMA. 2012;308:1761–1767
5. Silbert BS, Santamaria JD, O’Brien JL, Blyth CM, Kelly WJ, Molnar RR. Early extubation following coronary artery bypass surgery: a prospective randomized controlled trial. The Fast Track Cardiac Care Team. Chest. 1998;113:1481–1488
6. Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg. 1999;16:9–13
7. Roques F, Michel P, Goldstone AR, Nashef SA. The logistic EuroSCORE. Eur Heart J. 2003;24:881–882
8. Brookhart MA, Wang PS, Solomon DH, Schneeweiss S. Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology. 2006;17:268–275
9. Rassen JA, Brookhart MA, Glynn RJ, Mittleman MA, Schneeweiss S. Instrumental variables I: instrumental variables exploit natural variation in nonexperimental data to estimate causal relationships. J Clin Epidemiol. 2009;62:1226–1232
10. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006;17:360–372
11. Swanson SA, Hernán MA. Commentary: how to report instrumental variable analyses (suggestions welcome). Epidemiology. 2013;24:370–374
12. Baum CF, Schaffer ME, Stillman S. ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression. 2010
13. Warren OJ, Smith AJ, Alexiou C, et al. The inflammatory response to cardiopulmonary bypass: part 1–mechanisms of pathogenesis. J Cardiothorac Vasc Anesth. 2009;23:223–231
14. Aird WC. The role of the endothelium in severe sepsis and multiple organ dysfunction syndrome. Blood. 2003;101:3765–3777
15. Marshall JC. Inflammation, coagulopathy, and the pathogenesis of multiple organ dysfunction syndrome. Crit Care Med. 2001;29(7 suppl):S99–106
16. Ionescu-Ittu R, Delaney JA, Abrahamowicz M. Bias-variance trade-off in pharmacoepidemiological studies using physician-preference-based instrumental variables: a simulation study. Pharmacoepidemiol Drug Saf. 2009;18:562–571