Editorials: Editorial

Comparing Apples to Oranges

Just Say No to N2O?

Vetter, Thomas R. MD, MPH; McGwin, Gerald Jr MS, PhD

doi: 10.1213/ANE.0b013e31826e7632

Nitrous oxide (N2O) was first isolated by Joseph Priestley in 1772 and was recognized as having analgesic properties by Humphry Davy in 1799, based in part on his self-treatment of a toothache.1 Davy prophetically noted that because the agent “appears capable of destroying physical pain, it may probably be used with advantage during surgical operations.”2,3 However, early and mid-19th century human use of N2O was more often as a recreational drug at the then-common “laughing gas” parties and public demonstrations in theaters, music halls, and carnivals.4 Despite Horace Wells’ unsuccessful initial public administration of N2O in 1845 at Massachusetts General Hospital, beginning in the 1860s, Gardner Quincy Colton and his colleagues subsequently mastered and promoted its widespread clinical use in America and Europe.5 In the intervening 150 years, N2O has remained a widely used inhaled anesthetic agent, which has been ostensibly administered safely to several billion surgical patients worldwide.6,7

However, the use of N2O has not been without problems and controversy. The metabolic effects of N2O are centered on the inactivation of cobalamin (vitamin B12) as a coenzyme of methionine synthase.8 This inhibition of methionine synthase by N2O increases plasma homocysteine.9 Patient and occupational exposures to N2O have been associated with a variety of adverse effects, including endothelial dysfunction,8,9 leading some to question the continued clinical viability of N2O.10,11 To date, the Evaluation of Nitrous Oxide in the Gas Mixture for Anesthesia (ENIGMA) trial group has reported perhaps the most troubling clinical findings.12,13

In the ENIGMA-I trial, a cohort of 2050 patients having major surgery, across 19 participating centers between April 2003 and June 2004, were randomized to receive N2O-free (80% oxygen, 20% nitrogen) or N2O-based (70% N2O, 30% oxygen) anesthesia.12 The initial primary ENIGMA outcome was duration of hospital stay, with secondary outcomes being the duration of intensive care, the occurrence of postoperative pneumonia, pneumothorax, pulmonary embolism, wound infection, myocardial infarction, venous thromboembolism, or stroke, and death within 30 days of surgery. No significant difference in duration of hospital stay was observed between the 2 study groups. Compared with patients in the N2O-based group, patients in the N2O-free group were less likely to have at least 1 pulmonary complication (7.8% vs 13.0%; adjusted odds ratio of 0.54, 95% confidence interval [CI], 0.40–0.74; P < 0.001) and less likely to have at least 1 major complication (16% vs 21%; adjusted odds ratio of 0.71, 95% CI, 0.56–0.89; P = 0.003).12 The N2O-associated number needed to harm (NNH) for a pulmonary complication was 19.2, and the NNH for any major complication was 20.

The ENIGMA trial group subsequently conducted a follow-up (median, 3.5 years) of their initial cohort of 2050 patients to evaluate survival and the risk of myocardial infarction or stroke.13 Exposure to N2O did not significantly increase the risk of death (hazard ratio 0.98, 95% CI, 0.80–1.20; P = 0.82). In patients administered N2O, the adjusted odds ratio for myocardial infarction was 1.59 (95% CI, 1.01–2.51; P = 0.04) and for stroke was 1.01 (95% CI, 0.55–1.87; P = 0.97). Of note, based on the observed 4% vs 5% incidence, the N2O-associated NNH for a myocardial infarction during follow-up was 100.
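The NNH values quoted for ENIGMA-I and its follow-up can be checked directly from the reported incidences, because the NNH is the reciprocal of the absolute risk increase. The short Python sketch below is illustrative only: the function name is hypothetical, and the inputs are the rounded percentages quoted in the text rather than the trial's raw counts.

```python
def number_needed_to_harm(risk_exposed, risk_unexposed):
    """Return 1 / absolute risk increase; both risks are proportions in [0, 1]."""
    absolute_risk_increase = risk_exposed - risk_unexposed
    if absolute_risk_increase <= 0:
        raise ValueError("exposed risk must exceed unexposed risk")
    return 1.0 / absolute_risk_increase


# Rounded incidences as quoted in the text (N2O-based vs N2O-free).
nnh_pulmonary = number_needed_to_harm(0.130, 0.078)  # pulmonary complication
nnh_major = number_needed_to_harm(0.21, 0.16)        # any major complication
nnh_mi = number_needed_to_harm(0.05, 0.04)           # myocardial infarction, follow-up

for label, value in [("pulmonary", nnh_pulmonary),
                     ("major", nnh_major),
                     ("myocardial infarction", nnh_mi)]:
    print(f"NNH ({label}): {value:.1f}")
```

Under these inputs the sketch returns approximately 19.2, 20, and 100, matching the figures quoted above. Because the inputs are rounded percentages, small discrepancies from values computed on raw counts would be expected.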

Turan et al.14 report in this month’s issue of the journal on their cohort study of 37,609 adults who underwent noncardiac surgery at the Cleveland Clinic between 2005 and 2009. The primary outcomes of their study were all-cause 30-day mortality and a set of major in-hospital complications, including neurologic, cardiac, pulmonary/respiratory, infectious, urinary and hemorrhagic, wound disruption, and peripheral vascular complications, all based solely on inpatient ICD-9 coding for billing. Receiving N2O was associated with decreased 30-day mortality (odds ratio of 0.65, 97.5% CI, 0.45–0.94; P = 0.01). Intraoperative administration of N2O was also associated with decreased pulmonary/respiratory morbidity (odds ratio of 0.55, 95% CI, 0.41–0.73; P < 0.001). No differences were observed for the other 7 types of complications.

Turan et al.14 use the ENIGMA-I trial as the backdrop for the interpretation of their own results and, on its face, this is reasonable. However, there are a number of methodological issues that must be considered when attempting to interpret their results as well as to reconcile any differences with those of ENIGMA-I. First and foremost, the study by Turan et al.14 is observational in nature, and as such the decision to administer N2O or not was not a random process. Rather, it was based on a complex and perhaps unquantifiable set of patient, physician, environmental, and perhaps other characteristics, many of which are also associated with postoperative morbidity and mortality. Any observed effect of N2O may be biased by the confounding influence of these characteristics. Studies such as ENIGMA-I minimize such bias by using randomization, which fosters balance between the study groups with respect to these characteristics, including, importantly, those that cannot be quantified. Although randomized controlled trials remain the “gold standard,” it has been demonstrated that well-designed observational studies can also reach similarly valid conclusions,15 the devil being in the details of what constitutes a well-designed study.

Propensity scores have been used to bridge the gap between observational and randomized studies. Each study participant’s probability (i.e., propensity) of receiving or not receiving a treatment is calculated as a function of measured covariates, typically using a logistic regression model. These scores are then used, via several potential mechanisms, to minimize the bias associated with the lack of randomization. Turan et al.14 used propensity scores to match a subset of 10,755 N2O patients with 10,755 non-N2O patients with the goal of ensuring that the 2 groups were evenly distributed with respect to the demographic and clinical characteristics that were used to compute the propensity score.
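The general mechanics of propensity score matching can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by Turan et al.: the cohort is simulated, there is a single hypothetical covariate (a standardized risk score), the logistic model is fitted by plain gradient ascent, and matching is greedy 1:1 nearest-neighbor matching without replacement within an arbitrary caliper of 0.05. All function names and parameter values are invented for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_cohort(n, seed=0):
    """Hypothetical cohort: x is a standardized risk score, and sicker
    patients (higher x) are assumed less likely to receive the treatment."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [1 if rng.random() < sigmoid(-x) else 0 for x in xs]
    return xs, treated


def fit_propensity(xs, treated, lr=0.5, epochs=500):
    """Logistic regression of treatment on x, fitted by gradient ascent.
    Returns each patient's estimated probability of being treated."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, t in zip(xs, treated):
            resid = t - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return [sigmoid(b0 + b1 * x) for x in xs]


def greedy_match(scores, treated, caliper=0.05):
    """1:1 nearest-neighbor matching without replacement within a caliper."""
    available = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not available:
            continue
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def mean(values):
    return sum(values) / len(values)


xs, treated = simulate_cohort(400)
scores = fit_propensity(xs, treated)
pairs = greedy_match(scores, treated)

# Difference in mean covariate value, treated minus untreated.
before = (mean([x for x, t in zip(xs, treated) if t == 1])
          - mean([x for x, t in zip(xs, treated) if t == 0]))
after = mean([xs[i] - xs[j] for i, j in pairs])

print(f"matched pairs: {len(pairs)}")
print(f"covariate difference before matching: {before:+.2f}")
print(f"covariate difference after matching:  {after:+.2f}")
```

In this simulated cohort the covariate difference between groups shrinks after matching, which is the balance that matching aims for. Note that treated patients without a sufficiently close control are dropped, and that matching can only balance covariates that were measured and entered into the model.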

The statistical principles underlying propensity scores have been firmly established, and their use has increased dramatically in the clinical literature over the past 2 decades.16 Compared with multivariable adjustment, they have often been viewed as a superior approach to address the problem of confounding. The reality is that they are not a panacea; rather, they are only superior in situations that favor their use, and there is little evidence that propensity scores produce substantially different results compared with conventional multivariable methods.17,18 Although this finding may be due to publication bias, it can surely be attributed to the fact that propensity scores are not going to solve the problem of unmeasured confounders any better than traditional multivariable adjustment, a fact that will likely leave the bridge between observational and randomized studies forever incomplete.

As the use of propensity scores has flourished, a literature has emerged that provides practical and understandable guidance regarding their use as well as their associated pitfalls.16–20 An area where the present study by Turan et al. is deficient relates to the propensity score variable selection and modeling process, which is critical for properly evaluating their validity. We can be somewhat reassured in this regard as the authors have demonstrated that, with some exceptions, the N2O and non-N2O groups are comparable with respect to the potential confounders. However, this was also the case in their larger population, which may have overcome any model misspecification. Another deficiency relates to the recommendation that the results of both multivariable and propensity score-based approaches be presented, results that Turan et al. do not provide. Lack of such results deprives the reader of valuable information necessary for properly interpreting the study’s findings. As a result, we do not know whether the observed results are generalizable to the entire study population or just to those who were able to be matched, which was only 43% of the entire cohort. Last, it has been demonstrated in the setting of cardiovascular mortality that propensity scores developed using administrative data (like those of Turan et al.) do not necessarily balance patient characteristics contained in the actual corresponding clinical data, and that the use of such administrative data may result in an overestimation of the effectiveness of therapy.21

However, such issues related to the use of propensity scores may not shed much light on the disagreement between the results reported by Turan et al.14 and those of the ENIGMA-I trial.12,13 There is more to be gained by recognizing that the 2 study populations are likely vastly different. Patients in the ENIGMA-I trial were expected to remain hospitalized for 3 days; the median duration of stay was approximately 7 days. However, the study population in the study by Turan et al. included outpatients and minor procedures, a fact noted in the Discussion of the article. The authors do not provide the reader with information regarding the distribution of inpatients versus outpatients nor the average length of stay for their study population. This is not an insignificant omission for a number of reasons. The mortality and morbidity risk is lower in the study cohort of Turan et al. compared with the ENIGMA-I cohort, and in such lower-risk patients, the effects of N2O may be different. Also, one of the primary outcomes in the study by Turan et al. is major in-hospital complications. For outpatients, the time-at-risk is truncated compared with that of inpatients. Thus, the authors’ use of odds ratios rather than a rate-based measure of association (e.g., hazard ratios) may obscure the effect of N2O on the incidence of the study outcomes, particularly if the time-at-risk differs between the treatment groups.
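The concern about odds ratios and unequal time-at-risk can be illustrated with a small Python sketch. The numbers are entirely hypothetical and are not taken from either study, and a simple incidence rate ratio (events per patient-day) is used as a stand-in for the rate-based measures mentioned above; a hazard ratio would require time-to-event data and a survival model.

```python
def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds ratio of the outcome in group A relative to group B."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b


def rate_ratio(events_a, days_a, events_b, days_b):
    """Incidence rate ratio (events per patient-day), group A relative to B."""
    return (events_a / days_a) / (events_b / days_b)


# Hypothetical groups of 1000 patients with 40 in-hospital complications each.
# Group A is assumed to contain mostly outpatients (2000 patient-days observed);
# group B is assumed to contain mostly inpatients (5000 patient-days observed).
or_value = odds_ratio(40, 1000, 40, 1000)
rr_value = rate_ratio(40, 2000, 40, 5000)

print(f"odds ratio: {or_value:.2f}")
print(f"rate ratio: {rr_value:.2f}")
```

With these invented inputs the odds ratio is 1.0, suggesting no association, whereas the rate ratio is 2.5, because the same number of events accrued over far fewer patient-days in group A. The point is only that the two measures can diverge when observation time differs between groups.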

There has been a significant increase in such retrospective analyses of existing, secondary data to assess both immediate and more longitudinal perioperative outcomes. Such perioperative, observational studies have been facilitated by the development of large-scale databases at a single institution (e.g., Cleveland Clinic or Mayo Clinic) or involving multiple centers (e.g., Surgical Care Improvement Project, National Anesthesia Clinical Outcomes Registry, Multicenter Perioperative Outcomes Group, and National Center for Clinical Outcomes Research). Although such large secondary databases are understandably attractive to researchers seeking to answer questions requiring robust patient samples, discipline and diligence are needed to avoid data mining and generating results that are statistically but not clinically significant. In response to these concerns, the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) initiative has developed and widely promulgated recommendations on what should be included in an accurate and complete report of an observational study.22,23 The editorial board of Anesthesia & Analgesia refers to the STROBE statement in its current “Guide for Authors.”24 Of note, Turan et al.14 fulfilled the STROBE checklist of items that should be included in such a report of a cohort study.

A growing body of evidence supports the avoidance of N2O in pediatric and adult patients,10,11,25 including from a cost-benefit perspective.26 However, the multitude of patients who have been exposed to N2O without apparent complications would suggest that further studies are needed on its longer-term side effects. The contrary findings presented here by Turan et al.14 are difficult to reconcile with those of ENIGMA-I because of the very discordant study populations, something that propensity scores cannot solve. Turan et al. would have a more valid counterpoint if they had performed a sensitivity analysis limited to a study population similar to that of ENIGMA-I. Additional, perhaps more definitive insight will be gained from the current ENIGMA-II trial, which is reportedly prospectively enrolling 7000 noncardiac surgery patients, at intermediate or high risk for coronary artery disease, to assess the N2O-associated risk of death and major nonfatal events (myocardial infarction, cardiac arrest, pulmonary embolism, and stroke) 30 days postoperatively.27


Name: Thomas R. Vetter, MD, MPH.

Contribution: This author helped write the manuscript.

Attestation: Thomas R. Vetter approved the final manuscript.

Name: Gerald McGwin, Jr., MS, PhD.

Contribution: This author helped write the manuscript.

Attestation: Gerald McGwin, Jr. approved the final manuscript.

This manuscript was handled by: Sorin J. Brull, MD, FCARCSI (Hon).


1. Smith WD. A history of nitrous oxide and oxygen anaesthesia. I. Joseph Priestley to Humphry Davy. Br J Anaesth. 1965;37:790–8
2. Davy H. Researches, Chemical and Philosophical; Chiefly Concerning Nitrous Oxide, or Dephlogisticated Nitrous Air, and Its Respiration. London, UK: J. Johnson; 1800
3. Desai SP, Desai MS, Pandav CS. The discovery of modern anaesthesia—contributions of Davy, Clarke, Long, Wells and Morton. Indian J Anaesth. 2007;51:472–8
4. Jay M. Nitrous oxide: recreational use, regulation and harm reduction. Drugs Alcohol Today. 2008;8:22–5
5. Smith GB, Hirsch NP. Gardner Quincy Colton: pioneer of nitrous oxide anesthesia. Anesth Analg. 1991;72:382–91
6. Fleischmann E, Lenhardt R, Kurz A, Herbst F, Fülesdi B, Greif R, Sessler DI, Akça O; Outcomes Research Group. Nitrous oxide and risk of surgical wound infection: a randomised trial. Lancet. 2005;366:1101–7
7. Hopkins PM. Nitrous oxide: a unique drug of continuing importance for anaesthesia. Best Pract Res Clin Anaesthesiol. 2005;19:381–9
8. Sanders RD, Weimann J, Maze M. Biologic effects of nitrous oxide: a mechanistic and toxicologic review. Anesthesiology. 2008;109:707–22
9. Myles PS, Chan MT, Kaye DM, McIlroy DR, Lau CW, Symons JA, Chen S. Effect of nitrous oxide anesthesia on plasma homocysteine and endothelial function. Anesthesiology. 2008;109:657–63
10. Hopf HW. Is it time to retire high-concentration nitrous oxide? Anesthesiology. 2007;107:200–1
11. Myles PS, Leslie K, Silbert B, Paech MJ, Peyton P. A review of the risks and benefits of nitrous oxide in current anaesthetic practice. Anaesth Intensive Care. 2004;32:165–72
12. Myles PS, Leslie K, Chan MT, Forbes A, Paech MJ, Peyton P, Silbert BS, Pascoe E; ENIGMA Trial Group. Avoidance of nitrous oxide for patients undergoing major surgery: a randomized controlled trial. Anesthesiology. 2007;107:221–31
13. Leslie K, Myles PS, Chan MT, Forbes A, Paech MJ, Peyton P, Silbert BS, Williamson E. Nitrous oxide and long-term morbidity and mortality in the ENIGMA trial. Anesth Analg. 2011;112:387–93
14. Turan A, Mascha EJ, You J, Kurz A, Shiba A, Saager L, Sessler DI. The association between nitrous oxide and postoperative mortality and morbidity after noncardiac surgery. Anesth Analg. 2013;116:1026–33
15. Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000;342:1878–86
16. Austin PC. A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003. Stat Med. 2008;27:2037–49
17. Shah BR, Laupacis A, Hux JE, Austin PC. Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review. J Clin Epidemiol. 2005;58:550–9
18. Stürmer T, Joshi M, Glynn RJ, Avorn J, Rothman KJ, Schneeweiss S. A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods. J Clin Epidemiol. 2006;59:437–47
19. Weitzen S, Lapane KL, Toledano AY, Hume AL, Mor V. Principles for modeling propensity scores in medical research: a systematic literature review. Pharmacoepidemiol Drug Saf. 2004;13:841–53
20. Blackstone EH. Comparing apples and oranges. J Thorac Cardiovasc Surg. 2002;123:8–15
21. Austin PC, Mamdani MM, Stukel TA, Anderson GM, Tu JV. The use of the propensity score for estimating treatment effects: administrative versus clinical data. Stat Med. 2005;24:1563–78
22. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, Poole C, Schlesselman JJ, Egger M; STROBE Initiative. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. Epidemiology. 2007;18:805–35
23. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; STROBE Initiative. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Epidemiology. 2007;18:800–4
24. Editorial Board, Anesthesia & Analgesia. Guide for authors. Anesth Analg. 2010;111:525–38
25. Baum VC. When nitrous oxide is no laughing matter: nitrous oxide and pediatric anesthesia. Paediatr Anaesth. 2007;17:824–30
26. Graham AM, Myles PS, Leslie K, Chan MT, Paech MJ, Peyton P, El Dawlatly AA. A cost-benefit analysis of the ENIGMA trial. Anesthesiology. 2011;115:265–72
27. Myles PS, Leslie K, Peyton P, Paech M, Forbes A, Chan MT, Sessler D, Devereaux PJ, Silbert BS, Jamrozik K, Beattie S, Badner N, Tomlinson J, Wallace S; ANZCA Trials Group. Nitrous oxide and perioperative cardiac morbidity (ENIGMA-II) Trial: rationale and design. Am Heart J. 2009;157:488–494.e1
© 2013 International Anesthesia Research Society