Sir: The letter from Dr Stubhaug mostly contains criticism of the statistical justification for the conclusions. Accordingly we have sought the assistance of a professional statistician to help us with this reply. Having said that, the important point to grasp is the difference between clinical and statistical significance. The aim of the study was to demonstrate with reasonable confidence that the efficacy of the two drugs was clinically equivalent when used in accordance with the protocol. These terms were defined prospectively and were used to estimate the number of patients required in the study. Clinical equivalence was defined as any difference less than 10% in the proportion of responders at 90 min. It was assumed that any difference smaller than this was not clinically relevant. Reasonable confidence required this to be shown with 90% power.
The observed treatment difference in this primary outcome measure was 8.7%. The 90% and 95% confidence intervals (CI) of this estimate are 2.4% to 14.6% and 1.3% to 15.8% respectively. Since the 90% CI includes 10%, the study failed to prove clinical equivalence at the level of confidence specified. With 90% confidence, the true difference lies between 2.4% and 14.6%.
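The equivalence check described above can be sketched in a few lines of Python. This is only an illustration using the interval limits quoted in this letter; the function name `equivalence_shown` and the symmetric treatment of the margin are assumptions made for the sketch, not part of the study protocol.

```python
def equivalence_shown(ci_low, ci_high, margin):
    """Return True only if the whole confidence interval lies
    strictly inside (-margin, +margin), in percentage points."""
    return -margin < ci_low and ci_high < margin


MARGIN = 10.0          # prespecified limit of clinical equivalence (%)
ci_90 = (2.4, 14.6)    # 90% CI of the treatment difference (%)
ci_95 = (1.3, 15.8)    # 95% CI of the treatment difference (%)

result_90 = equivalence_shown(*ci_90, MARGIN)
result_95 = equivalence_shown(*ci_95, MARGIN)

print(result_90, result_95)  # False False
```

Both intervals extend beyond 10%, so neither satisfies the prespecified criterion, which is the sense in which equivalence was not proved.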
We do not dispute that tramadol was less effective than morphine, but that was not the hypothesis. Rather, our aim was to estimate with acceptable precision the magnitude of the treatment difference. We suggest that the study has shown that all likely values of the true treatment difference are of no major clinical importance. If Dr Stubhaug disagrees, he should say what difference he thinks is clinically relevant. It is not acceptable to argue that only differences which are statistically significant are clinically relevant.
Incidentally, the P-values quoted in Dr Stubhaug's Table 1 could be misleading if their lack of independence is not acknowledged. The first two, comparing Responders and Non-responders between treatments, do not reinforce each other. In fact they are necessarily identical, since every patient must either respond or not respond. Moreover, the difference in the incidence of nausea accounts almost entirely for the difference in the incidence of all adverse events, so the statistical significance of the latter is entirely dependent on nausea and does not provide additional evidence of a poorer adverse profile.
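The point about Responders and Non-responders can be illustrated with a short Python sketch. The counts below are purely hypothetical and are not the study data; the pooled two-proportion z-test is likewise an assumption chosen for simplicity, and may not be the test used in Dr Stubhaug's Table 1.

```python
import math


def two_proportion_p(x1, n1, x2, n2):
    """Two-sided P-value from a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts, for illustration only
n_a, n_b = 100, 100
resp_a, resp_b = 80, 70

p_resp = two_proportion_p(resp_a, n_a, resp_b, n_b)
# Non-responders are simply the complement in each group
p_nonresp = two_proportion_p(n_a - resp_a, n_a, n_b - resp_b, n_b)

print(math.isclose(p_resp, p_nonresp))
```

Because each group's non-responder count is its total minus its responder count, the difference in proportions only changes sign and the standard error is unchanged, so the second P-value carries no information beyond the first.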
M. D. VICKERS
University of Wales College of Medicine, Cardiff
Anaesthesiologische Klinik, Gutersloh, Germany
Searle, High Wycombe