A study published in 2001 reported that the sample sizes in randomized controlled trials (RCTs) published in major orthopaedic journals in 1997 were too small, resulting in low power to detect reasonable effect sizes. Low power is a fundamental cause of the poor reproducibility of research findings and erodes a cornerstone of the scientific method. The aim of this study was to ascertain whether orthopaedic research has improved during the past 2 decades.
The electronic tables of contents of the 2016 and 2017 volumes of 7 major orthopaedic journals were searched issue by issue in chronological order to identify possible RCTs. A posteriori (after-the-fact) power to detect small, medium, and large effect sizes, defined by the Cohen d value, was calculated from the sample sizes reported in the studies. The power to detect effect sizes associated with the most commonly used patient-reported outcome measures (PROMs) was also calculated. Finally, the use of a priori power analysis in the included studies was assessed.
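The a posteriori power calculation described above can be sketched as follows. This is a minimal illustration, not the authors' actual analysis code: it uses the standard normal approximation to the two-sided, two-sample t-test rather than the exact noncentral t distribution, assumes equal group sizes, and the function name and the per-group n of 50 are hypothetical choices for demonstration.

```python
from math import erf, sqrt

def posthoc_power(d, n_per_group, z_crit=1.959964):
    """Approximate power of a two-sided two-sample t-test for a
    standardized effect size d (Cohen's d), given the per-group
    sample size. Normal approximation; z_crit = z for alpha = 0.05.
    """
    # Noncentrality of the test statistic for equal-sized groups
    ncp = d * sqrt(n_per_group / 2)
    # Upper-tail rejection probability; the lower tail is negligible
    # for positive d, so it is omitted here.
    phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))  # standard normal CDF
    return phi(ncp - z_crit)

# Cohen's conventional small/medium/large effect sizes,
# evaluated for a hypothetical trial with 50 patients per arm
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} (d = {d}): power = {posthoc_power(d, 50):.3f}")
```

With 50 patients per arm, power is adequate (≥0.80) only for a large effect, which illustrates why RCTs of typical orthopaedic size so often lack the power to detect small or medium effects.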
In total, 233 studies were included in the final analyses. None of the negative studies had sufficient power (≥0.80) to detect a small effect size. Only between 15.0% and 32.1% of the negative studies had adequate power to detect a medium effect size. When categorized by anatomic region, 0% to 52.6% had adequate power to detect an effect size corresponding to the minimal clinically important difference (MCID). An a priori power analysis was employed in 196 (84%) of the 233 studies. However, the power analysis could not be replicated in 46% of the studies that used a mean comparison.
Although small improvements in orthopaedic RCTs have occurred during the past 2 decades, many RCTs remain underpowered: their sample sizes are still too small to provide adequate power to detect differences that would be deemed clinically relevant.