Annals of Surgery Journal Club

An interactive resource for surgery residents and surgeons to discuss and critically evaluate articles published in Annals of Surgery. Each month, a guest expert selects and reviews an article, offers questions, and responds to readers' comments.

Friday, July 5, 2013

July, 2013 Journal Club

Moderator: Dr. John Birkmeyer

Article: Hospital Procedure Volume Should Not Be Used as a Measure of Surgical Quality. LaPar, Damien J.; Kron, Irving L.; Jones, David R.; Stukenborg, George J.; Kozower, Benjamin D. Annals of Surgery. 256(4):606-615, October 2012.

Article summary: As studies assessing relationships between procedure volume and surgical mortality with administrative data have become ubiquitous, catching readers’ attention in this area has become increasingly difficult. With this Annals article, LaPar et al. have done just that. Based on their analysis of data from the 2008 Nationwide Inpatient Sample, the authors found that hospital volume was not a significant predictor of surgical mortality for 4 oft-studied procedures. The null findings are not too surprising for coronary artery bypass graft surgery, for which volume effects have always been small and seem to be declining over time. But the absence of a measurable hospital volume effect for pancreatectomy, esophagectomy, and (to a lesser extent) AAA repair runs against the prevailing consensus about the importance of experience with these procedures.

Have all the volume-outcome studies that have come before gotten it wrong? LaPar et al. suggest, correctly, that many previous studies have examined the effect of hospital volume using unsophisticated analytical methods that failed to account for clustering, non-linear volume effects, and other statistical issues. Nonetheless, the largest and most influential studies have generally employed multi-level, hierarchical models analogous to those employed by the authors in their paper. While many of those studies have used volume categories (e.g., quintiles) to summarize their results, their statistical testing has been conducted on continuous measures of volume. Further supporting the importance of hospital volume in pancreatectomy and esophagectomy, a 2011 study by Finks et al. published in the New England Journal of Medicine documented that declines in surgical mortality between 1999 and 2008 were directly attributable to market concentration (and increasing hospital volumes).
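
To make the clustering issue concrete, here is a minimal sketch of the standard design-effect approximation for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of patients per hospital and ICC is the intraclass correlation. The patient count, cluster size, and ICC below are hypothetical values chosen only for illustration; they are not taken from LaPar et al. or from any dataset.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_patients: int, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is 'worth'."""
    return n_patients / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical inputs: 2,000 patients, 20 per hospital, ICC of 0.05.
    n_eff = effective_sample_size(n_patients=2000, cluster_size=20, icc=0.05)
    print(f"Design effect: {design_effect(20, 0.05):.2f}")
    print(f"Effective sample size: {n_eff:.0f}")
```

Under these assumed inputs, the clustered sample carries roughly half the information of the same number of independent patients, which is why analyses that ignore clustering tend to report standard errors that are too small.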

Maybe hospital volume matters less now than it did before? As operative mortality with high-risk procedures has declined over the past decade, absolute differences in mortality rates between high- and low-volume hospitals have no doubt shrunk. Still, we do not believe that performance disparities have disappeared. We examined the same dataset used by the authors (2008 Nationwide Inpatient Sample) and were unable to reproduce their conclusions. For both pancreatectomy and esophagectomy, risk-adjusted mortality rates at low-volume hospitals (based on Leapfrog cutpoints) were approximately twice as high as at high-volume hospitals.

Assuming the absence of basic coding errors, the study’s findings can most likely be traced to small sample sizes and over-fitted, unstable statistical models. The authors relied on a single year of NIS data, which is in turn only a 20% sample of cases performed nationwide. For uncommon procedures like pancreatectomy and esophagectomy, sample sizes from that dataset often will not support the complex modeling strategies described by the authors. The inconsistent, far-ranging coefficients reported in the main findings of the paper (Table 2) raise suspicion of models that failed to “converge” and are thus uninterpretable. The authors might have uncovered that problem had they started with simpler, more transparent analytic strategies.
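
As one example of a simpler, more transparent check, the sketch below computes a Wilson score confidence interval for a crude mortality rate. It is an illustration of how sample size limits precision, not a reconstruction of our analysis or the authors' models, and the death and case counts in the example are hypothetical.

```python
import math


def wilson_interval(deaths: int, cases: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    p = deaths / cases
    denom = 1.0 + z**2 / cases
    center = (p + z**2 / (2 * cases)) / denom
    half_width = z * math.sqrt(p * (1 - p) / cases + z**2 / (4 * cases**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts: the same 5% crude mortality at two sample sizes.
    for deaths, cases in [(2, 40), (50, 1000)]:
        low, high = wilson_interval(deaths, cases)
        print(f"{deaths}/{cases}: {low:.3f} to {high:.3f}")
```

With the same observed rate, the interval from the smaller sample is far wider. When crude estimates are this imprecise, a complex multi-level model fitted to the same data is unlikely to yield stable coefficients.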

With complicated datasets and complicated models, it is always important to check your assumptions when the model’s results are surprising or even “too good to be true.”



1. This study was based on the Nationwide Inpatient Sample. What are the strengths and weaknesses of using the NIS, versus other large administrative datasets like national Medicare claims or data from the Healthcare Cost and Utilization Project (HCUP)?


2. In their article, the authors emphasize the importance of accounting for “clustering” of patients within hospitals, and hospitals within volume strata. What is clustering? What are the potential implications of failing to account for clustering in terms of statistical power? In terms of bias?


3. The conclusions of this article contradict a large number of previous studies documenting associations between hospital volume and mortality with high-risk procedures. What factors should be considered in determining “who’s right” when the results of different studies conflict?

Please feel free to comment on any or all of the questions above. We look forward to hearing from you, the Annals readers.