Sequential Meta-analysis of Past Clinical Trials to Determine the Use of a New Trial

Bollen, Casper W.*; Uiterwaal, Cuno S. P. M.; van Vught, Adrianus J.*†; van der Tweel, Ingeborg

doi: 10.1097/01.ede.0000239658.19288.22
Original Article

Background: Clinical trials can be stopped early based on interim analyses or sequential analyses. In principle, sequential analyses can also be used to decide whether enough evidence has been gathered in completed trials to make further trials unnecessary. We demonstrate such an application through a retrospective analysis of clinical trials comparing ventilation methods for the treatment of preterm newborns.

Methods: We identified 5 recent trials that compared high-frequency ventilation with conventional mechanical ventilation in the treatment of preterm newborns. Death or chronic lung disease and chronic lung disease in survivors were the primary clinical outcomes of interest. We applied sequential meta-analyses to these 5 studies.

Results: After including the first study of the last 5 trials in a sequential meta-analysis, the boundary of “no clinically relevant effect” was crossed for both outcomes (death or chronic lung disease, and chronic lung disease in survivors). A sensitivity analysis using a smaller assumed clinically relevant effect showed the same findings after 2 trials.

Conclusions: Sequential meta-analyses showed that a lack of clinically relevant effect had been established after the first of the 5 trials. If such an analysis had been conducted after the first or second of these clinical trials, it might have led to changes in the study design of subsequent trials or even to a reassessment of the need for further trials.

Supplemental Digital Content is Available in the Text.

From the *Pediatric Intensive Care Unit and the †Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands; and the ‡Center for Biostatistics, Utrecht University, Utrecht, The Netherlands.

Submitted 10 November 2005; accepted 12 June 2006.

Supplemental material for this article is available with the online version of the journal at

Correspondence: Casper W. Bollen, University Medical Center Utrecht, PO Box 85090, 3508 AB Utrecht, The Netherlands. E-mail:

The decision to start a new randomized clinical trial (RCT) should depend on the expected ability of such a trial to change current clinical opinion taking into account previously obtained evidence. It is standard practice to estimate the required size of a randomized clinical trial a priori based on the expected clinically relevant difference among treatments, the power (1-β), and the significance level (α). Stopping randomized clinical trials early, before the estimated fixed size is reached, is readily accepted for ethical or economic reasons. One or more interim analyses can be planned to determine whether enough evidence has been obtained to discontinue a trial prematurely. Interim analyses are performed on cumulative data of patients enrolled in the RCT. Sequential testing is the collective term for these interim analyses. We speak of continuous sequential testing when cumulative data are analyzed after every new patient response. Group sequential testing is a series of interim analyses after every new group of patient responses.

A meta-analysis pools the results of a number of comparable RCTs in a systematic and quantitative way.1 A cumulative meta-analysis can be viewed as a number of interim analyses on the aggregated data of successive, chronologically ordered RCTs. A cumulative meta-analysis is thus a group sequential test in which each group represents patients from another trial.2 We propose sequential meta-analysis as a particular form of cumulative meta-analysis with adjustment for multiple testing and a guaranteed power. We demonstrate the use of sequential meta-analysis through a retrospective analysis of completed clinical trials.

The trials in this analysis come from a clinical problem in neonatology. Ventilator-induced lung damage is a major complication for preterm newborns.3 A considerable number of RCTs have been performed to determine whether high-frequency oscillatory ventilation improves pulmonary outcome in premature neonates with idiopathic respiratory distress syndrome compared with conventional mechanical ventilation.4–16

Two recent large RCTs failed to demonstrate an advantage of high-frequency oscillatory ventilation over conventional mechanical ventilation or showed only a small benefit.14,15 A meta-analysis showed no reduction in mortality. However, there was a small reduction in the risk of chronic lung disease at 36 to 37 weeks postconceptional age.17 We used sequential meta-analysis to determine at what point in time additional trials did not contribute further evidence.

In a previous report, we identified 13 studies in which high-frequency ventilation was compared with conventional mechanical ventilation in the treatment of idiopathic respiratory distress syndrome in premature neonates.18 The most recent 5 studies were comparable with respect to patient population, type of high-frequency ventilation (oscillator), and ventilation strategies.11,13–16 These 5 studies were included chronologically in our sequential meta-analysis. The following data were extracted: gestational age or birth weight, time from delivery to randomization (in hours), type of high-frequency ventilator, ventilation strategies applied in both treatment arms, primary outcome measurements, and power and estimated effect size on which power analysis was based. The following outcome measures were identified: chronic lung disease (defined as oxygen dependency at the postconceptional age of 36 weeks), mortality to 36 weeks of age, intraventricular hemorrhage grade III and IV, and periventricular leukomalacia.

A high lung-volume strategy with high-frequency ventilation was assumed if 2 or more of the following items were explicitly stated in the methods: initial use of a higher mean airway pressure than on conventional mechanical ventilation, initial lowering of inspired oxygen before reducing mean airway pressure, or use of alveolar recruitment maneuvers. A lung protective strategy in the conventional mechanical ventilation group was based on specifying the PaCO2 goal, allowing permissive hypercapnia, and a high initial ventilatory rate or explicit avoidance of high peak inspiratory pressures targeted at reducing tidal volumes.

Statistical Analysis

We deduced an a priori estimate of a clinically important effect size of the primary outcome based on the expected clinically relevant differences reported in the power analyses for the trials. A probability of 0.05 for a type I error and a power of 0.80 were specified in our sequential meta-analyses. We carried out sensitivity analyses by decreasing the clinically relevant differences in effect estimates and by excluding studies by Thome et al11 and Moriette et al13 from the analyses. Thome et al11 used a different type of ventilator, and the ventilator used by Moriette et al13 was withdrawn from the market. Reducing the size of the clinically interesting effect would be expected to require a larger sample size for that difference to be detected. Sensitivity analysis thus was conducted to rule out the need for more trials to establish smaller clinically relevant differences.
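The link between a smaller clinically relevant effect and a larger required sample can be illustrated with a fixed-sample approximation. The sketch below is not the sequential design used in this article: it assumes a two-sided test on the log odds ratio and expresses the requirement as statistical information V rather than as a number of patients. The function name `required_information` and the weaker odds ratio of 0.66 are illustrative assumptions; 0.54 is the odds ratio reported for the expected effect.

```python
from math import log
from statistics import NormalDist


def required_information(odds_ratio, alpha=0.05, power=0.80):
    """Information V needed by a fixed-sample two-sided test to detect log(odds_ratio).

    Uses the standard approximation V = ((z_{1-alpha/2} + z_{power}) / theta)^2,
    where theta is the log odds ratio under the alternative hypothesis.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    theta = log(odds_ratio)
    return ((z_alpha + z_beta) / theta) ** 2


if __name__ == "__main__":
    # 0.54 is the expected effect reported in the article; 0.66 is a
    # hypothetical weaker effect chosen only for illustration.
    for odds_ratio in (0.54, 0.66):
        v = required_information(odds_ratio)
        print(f"OR = {odds_ratio:.2f}: required information V = {v:.1f}")
```

Because V grows with the inverse square of the log odds ratio, even a modest weakening of the assumed effect roughly doubles the information needed, which is why the sensitivity analysis checks whether the conclusion survives a smaller assumed effect.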

All data were extracted according to the intent-to-treat principle. For the combined outcome of death or chronic lung disease, all randomized patients formed the denominator and patients who died or who had chronic lung disease formed the numerator. To calculate the risk of chronic lung disease, the denominator was the number of patients who survived and the numerator was the number of patients with chronic lung disease. Intraventricular hemorrhage grade III and IV and periventricular leukomalacia were determined with the number of randomized patients in the denominator. Statistical heterogeneity between trials was investigated by calculating the test statistic I²: I² = 100% × (Q − df)/Q, in which Q is Cochran's heterogeneity statistic and df the degrees of freedom.19 I² can be interpreted as the proportion of total variation across trials that is due to heterogeneity.
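As a minimal sketch of the heterogeneity calculation, the code below computes Cochran's Q from per-trial log odds ratios and their variances using inverse-variance weights, and then I² from Q and the degrees of freedom. The function names and the input values in the usage example are hypothetical; truncating negative values of I² to zero follows the convention described by Higgins et al.19

```python
def cochran_q(log_odds_ratios, variances):
    """Cochran's Q: weighted sum of squared deviations from the pooled estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_odds_ratios)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_odds_ratios))


def i_squared(q, df):
    """I^2 in percent: 100 * (Q - df) / Q, with negative values set to zero."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances for 5 trials (not the article's data).
    log_ors = [-0.12, 0.05, -0.30, 0.02, -0.08]
    variances = [0.09, 0.04, 0.06, 0.03, 0.10]
    q = cochran_q(log_ors, variances)
    print(f"Q = {q:.2f}, I^2 = {i_squared(q, df=len(log_ors) - 1):.1f}%")
```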

Sequential Meta-analysis

The ith RCT contributes 2 quantities, Vi and Zi, to the cumulative analysis. Vi is a measure of the amount of information in that RCT, ie, Vi is approximately proportional to the number of patients included in that RCT. Zi is a measure of the effect size in that RCT. After every new RCT, the totals are cumulated as V = ΣVi and Z = ΣZi. Z and V are thus the pooled results from the accumulated trials, and the sequential meta-analysis can be viewed as a stratified analysis (see Appendix). Every new RCT contributes a new (Z,V) point, which can be depicted in a graph with V on the horizontal and Z on the vertical axis. Four boundaries are plotted in the graph. These boundaries depend on the two-sided type I error α, the power 1-β, and the expected effect size (in terms of the logarithm of the odds ratio [OR]) as stated under the alternative hypothesis (Fig. 1). If one of the successive (Z,V) points crosses the upper or lower boundary, the sequential meta-analysis can be stopped: the null hypothesis of treatment equivalence has been rejected in favor of the alternative hypothesis, ie, sufficient evidence has been gathered for the expected effect size. If one of the successive (Z,V) points crosses one of the inner, wedge-shaped boundaries, the sequential meta-analysis can be stopped for “futility”: the null hypothesis is very unlikely to be rejected in favor of the alternative hypothesis. If the successive (Z,V) points remain within the triangular boundaries, no decision can be made, and the results of a new RCT are added to the analysis. The outer straight-line boundaries (blue and red in the figures) represent the theoretical limits for decision-making. The inner, curved boundaries (green in the figures) represent a continuity correction, because the unit of analysis is the trial (a group of patients) and not the individual patient (for illustration, see Fig. 1). When one of these inner, curved boundaries is crossed, the analysis can be stopped.
(For further details on the construction of the boundaries and on sequential analysis, see Whitehead and Whitehead,1 Whitehead,21 and the operating manual for PEST 4.20)
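The accumulation of (Z,V) points can be sketched as follows. This is an illustration under stated assumptions, not the PEST implementation: Zi is taken as the observed minus expected number of events in the experimental arm, Vi uses the N³ form of the variance, the function names are hypothetical, and the trial counts in the usage example are invented. The stopping boundaries themselves are not computed here.

```python
from math import exp, sqrt


def score_statistics(events_e, n_e, events_c, n_c):
    """Return (Z, V) for one trial with a dichotomous outcome.

    Z is observed minus expected events in the experimental arm under the
    null hypothesis; V is its approximate variance (N^3 form, an assumption).
    """
    n = n_e + n_c
    events = events_e + events_c
    non_events = n - events
    z = events_e - n_e * events / n
    v = n_e * n_c * events * non_events / n ** 3
    return z, v


def cumulative_path(trials):
    """Cumulate Z and V over chronologically ordered trials.

    Returns one tuple per trial: (Z, V, pooled OR, 95% CI lower, 95% CI upper).
    """
    z_total = 0.0
    v_total = 0.0
    path = []
    for events_e, n_e, events_c, n_c in trials:
        z, v = score_statistics(events_e, n_e, events_c, n_c)
        z_total += z
        v_total += v
        theta = z_total / v_total          # pooled log odds ratio
        half_width = 1.96 / sqrt(v_total)  # approximate 95% CI half-width
        path.append((z_total, v_total, exp(theta),
                     exp(theta - half_width), exp(theta + half_width)))
    return path


if __name__ == "__main__":
    # Hypothetical trials: (events E, randomized E, events C, randomized C).
    trials = [(30, 100, 40, 100), (50, 150, 55, 150)]
    for z, v, odds_ratio, low, high in cumulative_path(trials):
        print(f"Z = {z:.2f}, V = {v:.2f}, OR = {odds_ratio:.2f} ({low:.2f}-{high:.2f})")
```

Each returned (Z,V) pair is one point on the plot described above; comparing those points with the boundaries requires the boundary equations from the cited references.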





We analyzed 5 high-frequency ventilation studies with a total number of 2152 randomized patients. Those 5 trials compared a high-frequency oscillatory ventilator with conventional mechanical ventilation. Table 1 provides numbers of the outcomes of interest. Taken together, those 5 studies showed that high-frequency oscillatory ventilation had an OR of 0.92 (95% confidence interval [CI] = 0.77–1.09) for death or chronic lung disease, 0.98 (0.80–1.21) for chronic lung disease in survivors, 1.01 (0.79–1.29) for intraventricular hemorrhage grade III and IV, and 0.90 (0.62–1.33) for periventricular leukomalacia.



Table 2 presents the patient groups, primary outcomes, and sample size specifications. All studies included very-low-birthweight patients. Time before randomization was no more than 6 hours. Thome et al11 and Moriette et al13 used variants of the definition for the primary outcome on which a power analysis was based. However, in both studies, death and chronic lung disease were part of the primary outcome. Overall, a reduction in death or chronic lung disease of 15% was expected (corresponding to an OR of 0.54 assuming a baseline risk of 65%). All trials specified a value of 0.05 for the type I error α. Power for detecting a difference was 0.80 or 0.90.
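The correspondence between the expected reduction and the odds ratio can be checked with a short calculation. The sketch below assumes that the 15% reduction is an absolute reduction in risk (from 65% to 50%), which is the reading that reproduces the stated odds ratio of 0.54; the function names are hypothetical.

```python
def odds(risk):
    """Convert a risk (probability) to odds."""
    return risk / (1.0 - risk)


def odds_ratio_for_absolute_reduction(baseline_risk, absolute_reduction):
    """Odds ratio implied by lowering the baseline risk by an absolute amount."""
    treated_risk = baseline_risk - absolute_reduction
    return odds(treated_risk) / odds(baseline_risk)


if __name__ == "__main__":
    for reduction in (0.15, 0.10):
        odds_ratio = odds_ratio_for_absolute_reduction(0.65, reduction)
        print(f"Absolute reduction {reduction:.0%}: OR = {odds_ratio:.2f}")
```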



Treatment with high-frequency oscillatory ventilation was comparable among trials (Appendix Table available with the online version of this article). The ventilation strategy in the conventional mechanical ventilation groups also did not differ much among trials. Inconsistency in primary outcome assessed by I² was 7.5%, indicating a low percentage of total variation across studies due to heterogeneity.

Sequential meta-analysis showed that the first of the 5 trials was enough to rule out a 15% reduction in death or chronic lung disease (Fig. 2A). In a sensitivity analysis that decreased the effect to a reduction of 10%, it took only 2 trials before the boundary for “no reduction” was crossed (OR = 0.97; 95% CI = 0.68–1.41) (Fig. 2B). Sensitivity analysis excluding the studies by Thome et al11 and Moriette et al13 resulted in an OR of 0.98 (0.68–1.39) (data not shown). The same result was found with chronic lung disease alone as the outcome, with an estimated effect of 15% reduction (Fig. 2C). After one trial (by Thome et al11), the boundary for “no reduction” was crossed (0.89; 0.50–1.58). Sequential analyses were also applied with intraventricular hemorrhage grade III and IV and periventricular leukomalacia as outcome measures. For both outcomes, there was not enough evidence in the 5 trials to draw a definitive conclusion (data not shown).

To be of value, a new RCT must add useful information. Assessing whether clinical equipoise was present at the start of a new RCT should be general research practice.22 As Chalmers has observed, “Science is meant to be cumulative, but many scientists are not cumulating scientifically.”23 Cumulative meta-analysis is a recognized technique for systematic review. Various authors have performed cumulative meta-analyses of RCTs.22,24 The usual approach is to analyze the available studies, testing the null hypothesis that the 2 treatments are equally effective. If the test is not statistically significant, a new trial is added (when its results become available) and the analysis is repeated. This approach continues until a statistically significant result is found, ie, until the null hypothesis is rejected. Berkey et al25 noticed that this general approach does not adjust for the multiple testing and lacks either a formal stopping rule or a way to quantify the power of the conclusion. We performed a sequential meta-analysis according to the approach of Whitehead.26 Using this approach, the overall significance level α (the type I error) is preserved, thus preventing the increase of the cumulative α by multiple testing. A prespecified power to detect a clinically relevant treatment difference is guaranteed. Furthermore, this approach permits stopping when there is enough evidence either to reject the null hypothesis of treatment equivalence or to not reject the null hypothesis.
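The inflation of the type I error under unadjusted repeated testing can be illustrated with a small simulation. The sketch below is a simplified illustration, not the analysis performed in this article: it assumes that every trial contributes the same amount of information, that there is no true treatment effect, and that the cumulative result is tested at the nominal two-sided 5% level after each trial. The function name and its defaults are hypothetical.

```python
import random
from math import sqrt


def false_positive_rate(n_looks=5, n_sims=20000, critical=1.96, seed=1):
    """Share of simulated trial series that ever cross the nominal threshold.

    Under the null hypothesis each trial adds a standard normal increment to
    the cumulative score; the standardized score is tested after every trial.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        z_total = 0.0
        for look in range(1, n_looks + 1):
            z_total += rng.gauss(0.0, 1.0)
            if abs(z_total) / sqrt(look) > critical:
                hits += 1
                break  # the naive cumulative analysis stops at first significance
    return hits / n_sims


if __name__ == "__main__":
    print(f"1 analysis:  {false_positive_rate(n_looks=1):.3f}")
    print(f"5 analyses:  {false_positive_rate(n_looks=5):.3f}")
```

With a single analysis the false-positive rate stays near the nominal level, whereas testing after each of 5 trials raises it well above that level, which is the problem the sequential boundaries are designed to prevent.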

This is a second report that discusses the relevance of new trials using sequential meta-analysis.27 We conducted this analysis retrospectively and found that the first of 5 recent trials was enough to demonstrate lack of benefit. Four more studies were performed, each powered to show the same size of effect.13–16 All of these trials occurred at approximately the same time, which would have limited the application of sequential meta-analysis in real time for this specific example. Nonetheless, this example demonstrates the potential use of the approach.

To compare trials, it is important that there is sufficient similarity of treatment. For example, ventilation strategies in high-frequency oscillatory ventilation and conventional mechanical ventilation have changed in recent years.18 In the previous cumulative meta-analysis, ventilation strategies were an important source of heterogeneity between trials.18 In the last 5 trials, however, ventilation strategies were comparable and results were homogeneous among trials. Only a small amount of variation among trials was due to heterogeneity of treatment.

The most important differences among the earlier trials were the use of surfactant therapy and the application of a lung protective strategy in patients on conventional mechanical ventilation.18,28 Both modalities were applied in the last 5 trials. In the only trial that showed a reduction in chronic lung disease, the conventional mechanical ventilation therapy was most rigidly controlled.14 Therefore, it seems unlikely that in daily practice, the same difference between high-frequency oscillatory ventilation and conventional mechanical ventilation will occur.29

In general, the size of a trial is estimated by a power analysis that is based on a clinically relevant effect size and chosen probabilities for type I and II errors. However, this does not answer the question whether this new trial will be able to adjust the available cumulative evidence sufficiently to conclude that a clinically relevant effect can be refuted or accepted. By performing a sequential analysis (ie, a sequential meta-analysis of earlier comparable trials), it can be decided whether enough cumulative evidence has already been gathered to render another trial uninformative. In this report, we apply sequential meta-analysis to a series of randomized trials to demonstrate its use in assessing the accumulating evidence in a series of trials.

Sequential meta-analysis of earlier comparable studies should be an integral part in the planning and design of new randomized trials. As we have shown, sequential meta-analyses can provide useful information that could affect the design of further trials or even affect the decision as to whether further trials are necessary.

1. Whitehead A, Whitehead J. A general parametric approach to the meta-analysis of randomized clinical trials. Stat Med. 1991;10:1665–1677.
2. Young C, Horton R. Putting clinical trials into context. Lancet. 2005;366:107–108.
3. MacIntyre NR. Current issues in mechanical ventilation for respiratory failure. Chest. 2005;128:561S–567S.
4. Froese AB, Butler PO, Fletcher WA, et al. High-frequency oscillatory ventilation in premature infants with respiratory failure: a preliminary report. Anesth Analg. 1987;66:814–824.
5. High-frequency oscillatory ventilation compared with conventional mechanical ventilation in the treatment of respiratory failure in preterm infants. The HIFI Study Group. N Engl J Med. 1989;320:88–93.
6. Carlo WA, Siner B, Chatburn RL, et al. Early randomized intervention with high-frequency jet ventilation in respiratory distress syndrome. J Pediatr. 1990;117:765–770.
7. Ogawa Y, Miyasaka K, Kawano T, et al. A multicenter randomized trial of high-frequency oscillatory ventilation as compared with conventional mechanical ventilation in preterm infants with respiratory failure. Early Hum Dev. 1993;32:1–10.
8. Wiswell TE, Graziani LJ, Kornhauser MS, et al. High-frequency jet ventilation in the early management of respiratory distress syndrome is associated with a greater risk for adverse outcomes. Pediatrics. 1996;98:1035–1043.
9. Keszler M, Modanlou HD, Brudno DS, et al. Multicenter controlled clinical trial of high-frequency jet ventilation in preterm infants with uncomplicated respiratory distress syndrome. Pediatrics. 1997;100:593–599.
10. Rettwitz-Volk W, Veldman A, Roth B, et al. A prospective, randomized, multicenter trial of high-frequency oscillatory ventilation compared with conventional ventilation in preterm infants with respiratory distress syndrome receiving surfactant. J Pediatr. 1998;132:249–254.
11. Thome U, Kossel H, Lipowsky G, et al. Randomized comparison of high-frequency ventilation with high-rate intermittent positive pressure ventilation in preterm infants with respiratory failure. J Pediatr. 1999;135:39–46.
12. Plavka R, Kopecky P, Sebron V, et al. A prospective randomized comparison of conventional mechanical ventilation and very early high-frequency oscillatory ventilation in extremely premature newborns with respiratory distress syndrome. Intensive Care Med. 1999;25:68–75.
13. Moriette G, Paris-Llado J, Walti H, et al. Prospective randomized multicenter comparison of high-frequency oscillatory ventilation and conventional ventilation in preterm infants of less than 30 weeks with respiratory distress syndrome. Pediatrics. 2001;107:363–372.
14. Courtney SE, Durand DJ, Asselin JM, et al. High-frequency oscillatory ventilation versus conventional mechanical ventilation for very-low-birth-weight infants. N Engl J Med. 2002;347:643–652.
15. Johnson AH, Peacock JL, Greenough A, et al. High-frequency oscillatory ventilation for the prevention of chronic lung disease of prematurity. N Engl J Med. 2002;347:633–642.
16. Van Reempts P, Borstlap C, Laroche S, et al. Early use of high-frequency ventilation in the premature neonate. Eur J Pediatr. 2003;162:219–226.
17. Henderson-Smart DJ, Bhuta T, Cools F, et al. Elective high-frequency oscillatory ventilation versus conventional ventilation for acute pulmonary dysfunction in preterm infants. Cochrane Database Syst Rev. 2003;CD000104.
18. Bollen CW, Uiterwaal CS, van Vught AJ. Cumulative metaanalysis of high-frequency versus conventional ventilation in premature neonates. Am J Respir Crit Care Med. 2003;168:1150–1155.
19. Higgins JP, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–560.
20. PEST 4: Operating Manual. MPS Research Unit. The University of Reading; 2000.
21. Whitehead J. The Design and Analysis of Sequential Clinical Trials (revised 2nd ed). Chichester: John Wiley & Sons Ltd; 1997.
22. Fergusson D, Glass KC, Hutton B, et al. Randomized controlled trials of aprotinin in cardiac surgery: could clinical equipoise have stopped the bleeding? Clin Trials. 2005;2:218–229.
23. Chalmers I. The scandalous failure of science to cumulate evidence scientifically. Clin Trials. 2005;2:229–231.
24. Lau J, Antman EM, Jimenez-Silva J, et al. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med. 1992;327:248–254.
25. Berkey CS, Mosteller F, Lau J, et al. Uncertainty of the time of first significance in random effects cumulative meta-analysis. Control Clin Trials. 1996;17:357–371.
26. Whitehead A. A prospectively planned cumulative meta-analysis applied to a series of concurrent clinical trials. Stat Med. 1997;16:2901–2913.
27. Pogue JM, Yusuf S. Cumulating evidence from randomized trials: utilizing sequential monitoring boundaries for cumulative meta-analysis. Control Clin Trials. 1997;18:580–593.
28. Froese AB, Kinsella JP. High-frequency oscillatory ventilation: lessons from the neonatal/pediatric experience. Crit Care Med. 2005;33:S115–S121.
29. Stark AR. High-frequency oscillatory ventilation to prevent bronchopulmonary dysplasia—are we there yet? N Engl J Med. 2002;347:682–684.

Suppose k RCTs are available for a sequential meta-analysis. All RCTs compare the same experimental treatment E with a control treatment C and all have the same dichotomous outcome (event or no event). Results from the ith RCT (i = 1, …, k) can be summarized as follows:



The proportions of events with the experimental and with the control treatment are PEi = SEi/NEi and PCi = SCi/NCi, respectively.

The logarithm of the odds ratio, as a measure for association between treatment and outcome, is defined as

The test statistic Zi is expressed as the difference between the observed number of events with E in the ith RCT (SEi) and the expected number under the null hypothesis of treatment equivalence.

The statistic Vi, the variance of Zi, is defined as

The pooled estimate for the overall θ is equal to

as the estimated log(OR) for the ith RCT and the weighting factor wi = Vi.

An approximate 95% confidence interval for θ can be estimated by

(For further details, see references 1, 20, and 21.)
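For reference, the quantities described in this appendix can be written out as below. This is a sketch that assumes the efficient-score formulation of references 1 and 21, with S_i = S_{Ei} + S_{Ci} the total number of events and N_i = N_{Ei} + N_{Ci} the total number of patients in the ith RCT (symbols introduced here for convenience). The variance is given in its N³ form, which is an assumption; a hypergeometric variant with N_i²(N_i − 1) in the denominator is also in common use.

```latex
\begin{align*}
\theta_i &= \log\frac{P_{Ei}\,(1 - P_{Ci})}{P_{Ci}\,(1 - P_{Ei})}
  && \text{log odds ratio in the $i$th RCT} \\[4pt]
Z_i &= S_{Ei} - \frac{N_{Ei}\,S_i}{N_i}
  && \text{observed minus expected events with E} \\[4pt]
V_i &= \frac{N_{Ei}\,N_{Ci}\,S_i\,(N_i - S_i)}{N_i^{3}}
  && \text{variance of } Z_i \\[4pt]
\hat{\theta} &= \frac{\sum_{i=1}^{k} w_i\,\hat{\theta}_i}{\sum_{i=1}^{k} w_i}
  = \frac{\sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} V_i} = \frac{Z}{V},
  && \hat{\theta}_i = \frac{Z_i}{V_i},\quad w_i = V_i \\[4pt]
\hat{\theta} &\pm \frac{1.96}{\sqrt{V}}
  && \text{approximate 95\% confidence interval for } \theta
\end{align*}
```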

© 2006 Lippincott Williams & Wilkins, Inc.