Where Are We Now?
In their article, Wyles et al. compared different THA bearing surfaces in terms of probability of revision using network meta-analysis (NMA). NMA is appealing because it allows the comparison of multiple healthcare interventions for a given condition, even when no direct comparisons (head-to-head trials) exist between some of those interventions, by combining the direct evidence with indirect evidence across randomized trials that share a common comparator.
Because of its appeal, NMA is gaining traction; however, it remains a relatively new approach, and the papers using it suffer from inconsistent terminology and heterogeneous reporting. Many researchers and surgeons may be unfamiliar with NMA and its related concepts and assumptions. Crucial assumptions relate to homogeneity among individual trials involving the same comparison, as in classical meta-analysis, but homogeneity also comes into play in NMA where indirect evidence is concerned. Indirect evidence refers to the results arising from the network (the “N” in “NMA”) that allow the comparison of two treatments even when they have not been evaluated directly against one another in head-to-head trials. For example, in an NMA involving treatments A, B, and C, direct evidence for the comparison of treatment A to treatment C refers to trials that compared A to C, and indirect evidence refers to information that can be deduced from trials comparing treatments A to B and treatments B to C. Issues pertaining to indirect evidence involve both conceptual and statistical aspects. The conceptual aspect is transitivity, which implies that there are no differences among studies in treatment-effect modifiers that may have affected the direct treatment effect estimates used to derive the indirect evidence for a comparison. The related statistical aspect is consistency, or coherence, which implies that the direct and indirect treatment effect estimates agree. The two are closely related but not identical: coherence can be formally assessed with a statistical test, whereas transitivity requires a more-empirical appraisal. In addition to providing estimates for all comparisons of interventions, NMA also allows rankings of interventions to be derived, along with the probability of each intervention being the most effective. Such probabilities should, however, be interpreted with caution, as they may be dramatically affected by the addition of a new trial to the network.
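To make the A, B, and C example concrete, the following is a minimal sketch of an adjusted indirect comparison through a common comparator, followed by a simple test of coherence between the direct and indirect estimates. It is an illustration only and not the Bayesian model used by Wyles et al. The function names and all numeric inputs are hypothetical, the effects are assumed to be log odds ratios with approximately normal sampling distributions, and the direct and indirect evidence are assumed to come from independent sets of trials.

```python
import math


def indirect_estimate(d_ab, se_ab, d_bc, se_bc):
    """Indirect estimate of C versus A through the common comparator B.

    d_ab is the log odds ratio of B relative to A, and d_bc that of C
    relative to B. On the log scale the effects add, and because the two
    sets of trials are assumed independent, their variances add as well.
    """
    d_ac = d_ab + d_bc
    se_ac = math.sqrt(se_ab ** 2 + se_bc ** 2)
    return d_ac, se_ac


def incoherence_test(d_direct, se_direct, d_indirect, se_indirect):
    """z statistic and two-sided p value for direct minus indirect."""
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se_diff
    # Two-sided p value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical pooled log odds ratios and standard errors (not study data).
d_ac, se_ac = indirect_estimate(d_ab=0.10, se_ab=0.20, d_bc=0.20, se_bc=0.25)
z, p = incoherence_test(d_direct=0.50, se_direct=0.30,
                        d_indirect=d_ac, se_indirect=se_ac)
print(f"indirect log OR = {d_ac:.2f} (SE {se_ac:.2f}); z = {z:.2f}, p = {p:.2f}")
```

Two points are worth noting. The indirect estimate is always less precise than either of the direct estimates it is built from, because the variances add. And a nonsignificant test of incoherence is weak reassurance when the estimates are imprecise, which is one reason transitivity still needs to be appraised separately.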
Where Do We Need To Go?
As correctly discussed by the authors, the NMA in the current study pointed out several shortcomings in the existing orthopaedic literature on bearing surfaces. These gaps point to potentially fruitful areas of future inquiry. For example, there were only a few large randomized controlled trials (RCTs), and several comparisons had not been made; notably, there was no RCT comparing ceramic-on-crosslinked polyethylene against metal-on-crosslinked polyethylene. Also, most of the included trials treated revision as a binary outcome rather than a time-to-event outcome despite high loss-to-followup rates, none reported safety outcomes, and other relevant outcomes such as function scores were lacking. As a result, the treatment effect estimates obtained from the NMA are quite imprecise, and all had wide credible intervals (the Bayesian analogue of confidence intervals, since the authors used Bayesian NMA). All probabilities of being ranked most effective were likewise estimated very imprecisely. Therefore, a reader should exercise caution in interpreting these estimates.
As is more generally reported for NMAs, the results also proved quite sensitive to the exclusion of a particular branch of the network (RCTs comparing ceramic-on-ceramic to ceramic-on-conventional polyethylene). Although the authors found the direct-comparison meta-analyses to be homogeneous and found no evidence of incoherence, they did not look further at transitivity, which is the absence of differences in study characteristics that may have modified the direct evidence used to form the indirect assessment of a treatment effect (the A versus B and B versus C trials used to derive the comparison of A to C). This is an important issue. First, there are several comparisons for which no direct evidence exists, which makes it impossible to compare direct and indirect effect estimates, even though it remains possible to assess whether the trials differed in terms of patient characteristics or settings. Second, such an assessment might also have shown whether the RCTs of ceramic-on-ceramic versus ceramic-on-conventional polyethylene were comparable with the others in those respects. Last, although NMA may increase power and precision by integrating a larger body of evidence, this did not seem to be the case here for the two main comparisons of interest, for which the NMA estimates did not seem more precise than the estimates provided by direct-comparison meta-analysis. To provide a more-precise answer to the comparison of current THA bearing surfaces, we still need more reliable evidence from primary-source studies: RCTs.
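The limited gain in precision can be understood with a simplified sketch. Under a fixed-effect, inverse-variance view (an assumption made here for illustration, not the Bayesian model the authors fitted), a mixed estimate weights the direct and indirect estimates by the inverse of their variances. The function name and the numbers below are hypothetical, and the calculation assumes that coherence holds and that the two sources of evidence are independent.

```python
import math


def pool_direct_and_indirect(d_direct, se_direct, d_indirect, se_indirect):
    """Inverse-variance weighted average of a direct and an indirect estimate."""
    w_direct = 1.0 / se_direct ** 2
    w_indirect = 1.0 / se_indirect ** 2
    d_mixed = (w_direct * d_direct + w_indirect * d_indirect) / (w_direct + w_indirect)
    se_mixed = math.sqrt(1.0 / (w_direct + w_indirect))
    return d_mixed, se_mixed


# Hypothetical log odds ratios: the indirect estimate is three times less
# precise than the direct one, as can happen in a sparse network.
d_mixed, se_mixed = pool_direct_and_indirect(
    d_direct=0.50, se_direct=0.30, d_indirect=0.30, se_indirect=0.90
)
print(f"direct SE = 0.30, mixed SE = {se_mixed:.3f}")
```

With these inputs the standard error of the mixed estimate is only slightly smaller than that of the direct estimate alone, because an imprecise indirect estimate carries little weight. A network built from few, small trials therefore adds little precision to the comparisons that already have direct evidence.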
How Do We Get There?
We need to improve the way RCTs comparing THA bearing surfaces are planned, conducted, and reported; the same likely applies to other interventions in orthopaedic surgery. For a more useful NMA, we also need to improve the geometry of the networks. For instance, there were eight RCTs comparing metal-on-conventional polyethylene to metal-on-crosslinked polyethylene, and 11 RCTs comparing ceramic-on-ceramic to other bearings. If the comparison of ceramic-on-highly-crosslinked polyethylene and metal-on-highly-crosslinked polyethylene is an important one to make, then there is certainly a need for such direct trials. It has been shown that well-conducted prospective nonrandomized studies with adequate propensity score analysis could be relied upon as evidence in surgery. Since large RCTs seem to be difficult to conduct in THA, NMAs on certain topics may also be reliably informed by adding the results of such prospective nonrandomized studies when they exist.
Lastly, there are currently no formal guidelines for reporting NMA. Given the complexity of the methodology, such guidelines would certainly help to guide authors as well as readers. In that respect, a four-step approach to rate the quality of evidence in each of the direct, indirect, and NMA estimates, with the aim of selecting the most reliable estimates, has recently been proposed. This approach is an important step forward in helping readers who have no specialized training in methodology to understand the results of an NMA.
1. Bafeta A, Trinquart L, Seror R, Ravaud P. Reporting of results from network meta-analyses: Methodological systematic review. BMJ.
2. Lonjon G, Boutron I, Trinquart L, Ahmad N, Aim F, Nizard R, Ravaud P. Comparison of treatment effect estimates from prospective nonrandomized studies with propensity score analysis and randomized controlled trials of surgical procedures. Ann Surg.
3. Mills EJ, Ioannidis JP, Thorlund K, Schünemann HJ, Puhan MA, Guyatt GH. How to use an article reporting a multiple treatment comparison meta-analysis. JAMA.
4. Puhan MA, Schünemann HJ, Murad MH, Li T, Brignardello-Petersen R, Singh JA, Kessels AG, Guyatt GH, GRADE Working Group. A GRADE Working Group approach for rating the quality of treatment effect estimates from network meta-analysis. BMJ.
5. Wyles CC, Jimenez-Almonte JH, Murad MH, Norambuena-Morales GA, Cabanela ME, Sierra RF, Trousdale RT. There are no differences in short- to midterm survivorship among total hip-bearing surface options: A network meta-analysis. [Published online ahead of print December 17, 2014]. Clin Orthop Relat Res. DOI: 10.1007/s11999-014-4065.