Concerns have been raised about unexpected homogeneity of results for the Sputnik V vaccine and lack of transparency at multiple points in the development process.1–4 Most recently, Bucci et al identified that the purported vaccine efficacies across age subgroups were strikingly similar. We agree. They identified that the P value from a Tarone-adjusted Breslow–Day test was 0.9963, suggesting that these results are significantly more homogeneous across age groups than would be seen in most trials, even in the setting of a homogeneous underlying effect.2
The question of the reliability of the Sputnik V vaccine results is of global relevance. Many countries have approved the vaccine's use for the prevention of COVID-19 based on the reported efficacy in phase-3 trials and are even now approving the vaccine for the purposes of international travel. If these results are unreliable, it may have substantial implications for the utility of the vaccine and thus regulatory approval and the ongoing use to prevent COVID-19.
We set out to quantify, using a simulation study, the likelihood that efficacy results across age strata would fall within the range claimed even if the underlying effect is perfectly homogeneous by age, and to compare these results with those for other vaccines about which similar concerns have not been raised.
We extracted the number of participants in each age stratum from the published phase-3 trial results for the Sputnik V vaccine (Gamaleya),5 and for all other vaccines currently approved for use in the United States or Australia, namely, Vaxzevria (AstraZeneca),6 Johnson and Johnson COVID-19 Vaccine (Janssen),7 Spikevax (Moderna),8 and Comirnaty (Pfizer).9 We also extracted the stated allocation ratio, the study-wise vaccine efficacy, and the highest and lowest reported vaccine efficacy by age stratum. In line with the methods described for the Sputnik V vaccine, we calculated all vaccine efficacies (VEs) as “1-OR,” where OR is the odds ratio of infection in vaccinated versus control participants.
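As a minimal sketch of the “1-OR” definition, the following Python snippet computes VE from a two-by-two table of counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from any of the trials, and the published analysis was run in R rather than Python.

```python
def odds_ratio(cases_vax, n_vax, cases_ctrl, n_ctrl):
    """Odds ratio of infection, vaccinated versus control."""
    odds_vax = cases_vax / (n_vax - cases_vax)
    odds_ctrl = cases_ctrl / (n_ctrl - cases_ctrl)
    # Undefined (division by zero) when the control arm has no cases.
    return odds_vax / odds_ctrl


def vaccine_efficacy(cases_vax, n_vax, cases_ctrl, n_ctrl):
    """Vaccine efficacy defined as 1 - OR."""
    return 1.0 - odds_ratio(cases_vax, n_vax, cases_ctrl, n_ctrl)


# Hypothetical counts: 10 cases among 1000 vaccinated, 50 among 1000 controls.
ve_or = vaccine_efficacy(10, 1000, 50, 1000)

# The same counts under a risk-ratio definition, for comparison.
ve_rr = 1.0 - (10 / 1000) / (50 / 1000)

print(f"VE (1 - OR): {ve_or:.4f}")
print(f"VE (1 - RR): {ve_rr:.4f}")
```

With rare outcomes the two definitions give similar but not identical values, which is why recalculated efficacies can differ slightly from those in articles that used risk ratios or hazard ratios.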
We then simulated each trial in R10 by first allocating patients into age strata exactly as reported in the relevant study. We then randomly allocated patients in each age stratum to the treatment or control group (using the allocation ratio reported for each study) and then randomly determined infection status of each simulated participant using the observed study-wide infection rate for controls or treated participants, respectively, based on their simulated allocations, thus simulating under the hypothesis that VE was perfectly homogeneous by age group.
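The procedure described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors' R code: the function name `simulate_trial`, the stratum sizes, the allocation probability, and the infection rates are all hypothetical placeholders.

```python
import random


def simulate_trial(stratum_sizes, p_treated, rate_treated, rate_control, rng):
    """Simulate one trial under a VE that is identical in every age stratum.

    Returns one (cases_vax, n_vax, cases_ctrl, n_ctrl) tuple per stratum.
    """
    results = []
    for n in stratum_sizes:
        # Randomly allocate each participant to the vaccine or control arm.
        n_vax = sum(rng.random() < p_treated for _ in range(n))
        n_ctrl = n - n_vax
        # Infection status uses the study-wide rate for the assigned arm,
        # so the simulated effect does not depend on age stratum.
        cases_vax = sum(rng.random() < rate_treated for _ in range(n_vax))
        cases_ctrl = sum(rng.random() < rate_control for _ in range(n_ctrl))
        results.append((cases_vax, n_vax, cases_ctrl, n_ctrl))
    return results


# Hypothetical inputs: three age strata, 3:1 allocation, and study-wide
# infection rates of 0.1% (vaccine) and 1% (control).
rng = random.Random(2021)
stratum_sizes = [2000, 1500, 500]
trial = simulate_trial(stratum_sizes, p_treated=0.75,
                       rate_treated=0.001, rate_control=0.01, rng=rng)

for cases_vax, n_vax, cases_ctrl, n_ctrl in trial:
    print(cases_vax, n_vax, cases_ctrl, n_ctrl)
```

Each stratum's counts can then be converted to a VE. A stratum with no control infections has an undefined odds ratio, and any implementation has to choose how to treat that case; this sketch leaves the choice open.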
For each simulation, we recorded the highest and the lowest calculated efficacy in any age subgroup and then checked whether either fell outside the bounds of subgroup efficacies reported in the original article. We initially planned 1000 simulations for each vaccine; if fewer than 20 acceptable simulation results were found for any vaccine, a further 49,000 simulations were to be run for all vaccines. There is no absolute method for determining acceptable precision in this setting, and this threshold was selected as the point at which the upper limit of the 95%CI for event frequency was 50% greater than the point estimate for event frequency using the Wilson asymmetric score interval.
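The threshold of 20 acceptable results in 1000 simulations can be checked with a short calculation. The sketch below implements the standard Wilson score interval in Python; the function name `wilson_interval` is ours, and the snippet is an illustration rather than the authors' code.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Two-sided Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# 20 acceptable results out of 1000 simulations (point estimate 2%).
point_estimate = 20 / 1000
lower, upper = wilson_interval(20, 1000)
ratio = upper / point_estimate

print(f"95% CI: {lower:.4f} to {upper:.4f}")
print(f"upper limit / point estimate: {ratio:.2f}")
```

At 20 events in 1000 simulations, the upper limit comes out at roughly 1.5 times the point estimate, in line with the stated rationale for the threshold.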
For reproducibility, all code and data required to replicate this study are available at https://github.com/gtuckerkellogg/trial-homogeneity-sims. We compared results between vaccines, calculating both the percentage of trials in which all subgroups fell within the bounds of the published article, and in the inverse, the number of simulations required on average to generate a single “acceptable” result where no subgroup exceeded the bounds of all subgroups within the original article.
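The two summary measures can be illustrated with a small Python sketch. The helper names and the simulated efficacies below are hypothetical; the authors' own implementation is the one in the repository cited above.

```python
def within_bounds(stratum_ves, lowest_reported, highest_reported):
    """True if every simulated subgroup VE lies inside the published range."""
    return (lowest_reported <= min(stratum_ves)
            and max(stratum_ves) <= highest_reported)


def summarise(flags):
    """Fraction of acceptable simulations and its inverse.

    The inverse is the mean number of trials needed to obtain one
    acceptable result (the mean of a geometric distribution).
    """
    n_acceptable = sum(flags)
    fraction = n_acceptable / len(flags)
    trials_needed = float("inf") if n_acceptable == 0 else 1.0 / fraction
    return fraction, trials_needed


# Hypothetical subgroup VEs from four simulated trials, compared with a
# hypothetical published range of 0.90 to 0.92.
simulated = [
    [0.90, 0.92, 0.91],
    [0.85, 0.95, 0.99],
    [0.905, 0.915, 0.91],
    [0.70, 0.93, 0.92],
]
flags = [within_bounds(ves, 0.90, 0.92) for ves in simulated]
fraction, trials_needed = summarise(flags)

print(f"acceptable: {fraction:.1%}; trials needed on average: {trials_needed:.1f}")
```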
The number of age strata, allocation ratio, total number of patients, study-wise VE, and highest and lowest by-age-group VE are given in Table 1.
Table 1. Number of age subgroups, allocation ratio (T:C), total participants, study-wide VE, and lowest and highest VE in any age subgroup for each trial. Each trial's VE has been recalculated to match the definition of “1-OR” used in the study of Logunov et al and so may differ slightly from the values in the original articles, where risk ratios or hazard ratios were used.
In the initial simulation of 1000 trials for the AstraZeneca vaccine, the observed efficacies of all age subgroups fell within the efficacy bounds for age subgroups in the published article in 23.8% of simulated trials. The corresponding figures were 44.7% for J + J, 51.1% for Moderna, and 30.5% for Pfizer. The ranges of observed efficacy results across all 1000 simulated trials for these four vaccines are presented in Figures 1A–D, respectively.
By contrast, of the first 1000 iterations for Sputnik, all age subgroups fell within the efficacy bounds for age subgroups in the published article in 0 (0.0%) simulations, and 1000 (100.0%) had at least 1 subgroup that fell outside the range of observed subgroup efficacies described in the published article. This is presented in Figure 1E.
Because of this finding, simulations were rerun with 50,000 simulated trials, and in this simulation, 0.026% of simulated Sputnik V trials had all age subgroups fall within the efficacy bounds for age subgroups in the published article. This result is presented in Figure 2A.
We calculated how many times, on average, a trial would need to be repeated to obtain results that fell within the range of efficacies from the relevant published article, based on the results from the 50,000 simulated trial run. These are presented in Figure 2B.
Our study results are concerning. Our simulations show that all vaccines currently approved in the United States and Australia each have a range of by-age subgroup efficacy results that is not unexpected for the observed study-wise efficacy, overall infection rate, and number of participants.
By contrast, the by-age subgroup-observed efficacies for the Sputnik phase-3 trial are much closer together than would be expected given the small number of patients in each age subgroup, very small number of infections, and high vaccine efficacy. Given the preceding parameters, such a result would be expected in fewer than 1 in 1000 trials.
One important note is that we have answered the question “how likely is this range of results if the point estimate of efficacy is the true underlying efficacy”; it is not possible to know the true underlying efficacies and the point estimates in these phase-3 trials are the best estimates we have. One limitation is that this simulation study did not have a preregistered protocol.
We have also asked “what was the chance of the results falling within this range” rather than “what was the chance of the results being this homogeneous,” which is better answered by the goodness-of-fit statistics used by Bucci et al with a bootstrapped P value. These authors have already shown that the homogeneity reported in the Sputnik V trials was improbably high, whereas we have quantified the chance of the results falling within this particular specified range, with sufficient precision to place it at less than 0.1%.
Data availability is an issue for all vaccine trials11; however, in this case, lack of data availability is compounded by secrecy around trial protocols and statistical analysis plans, and it is compounded further by extremely improbable results, as demonstrated in this article.
The authors of the original trial article have previously responded to criticisms of excessive homogeneity by claiming (incorrectly) that this just shows how homogeneous the underlying effect was. Thus, it is worth reiterating explicitly that the improbability of the claimed results demonstrated in this article is from a simulation assuming that the underlying vaccine effect is perfectly homogeneous across all age groups.
The results contained within the phase-III RCT of Sputnik by Logunov et al showed a distribution inconsistent with what would be expected from genuine experimental data. The results by age are substantially more similar than could be expected given the very small number of infections. Only 0.07% of simulated trials gave results consistent with the published article compared with 25.4%–45.7% for the phase III RCTs of the AZ, J + J, Moderna, or Pfizer vaccines.
Given the relative opacity of the conduct of this trial,12 the context of previous unexpectedly homogeneous results, and the low likelihood of results such as these arising in a genuine trial, it is our opinion that it is not possible for a journal or reader to have confidence in the results, and the article should be thoroughly investigated, including immediate release of anonymized individual patient data to an unbiased statistical expert. If the authors are not willing to do this, the paper should be retracted.
Given the scale of issues identified to date, it is not reasonably open to the publisher, readers, or regulators to assume the results are genuine while such transparency is not forthcoming.
1. Bucci E, Andreev K, Björkman A, et al. Safety and efficacy of the Russian COVID-19 vaccine: more information needed. Lancet. 2020;396:e53.
2. Bucci EM, Berkhof J, Gillibert A, et al. Data discrepancies and substandard reporting of interim data of Sputnik V phase 3 trial. Lancet. 2021;397:1881–1883.
3. Vlassov V. Commentary on the Publication of Preliminary Results of the Sputnik-V Vaccine Phase 3 Trial. Russian Society for Evidence Based Medicine.
4. Andreev K. Note of Concern - an open letter to DY Logunov et al. In: toHorton R, ed. Of the Lancet. Samone, Trentino, Italy: Cattivi Scienziati.
5. Logunov DY, Dolzhikova IV, Shcheblyakov DV, et al. Safety and efficacy of an rAd26 and rAd5 vector-based heterologous prime-boost COVID-19 vaccine: an interim analysis of a randomised controlled phase 3 trial in Russia. Lancet. 2021;397:671–681.
6. Falsey AR, Sobieszczyk ME, Hirsch I, et al. Phase 3 safety and efficacy of AZD1222 (ChAdOx1 nCoV-19) covid-19 vaccine. New Engl J Med. 2021;385:2348–2360.
7. Sadoff J, Gray G, Vandebosch A, et al. Safety and efficacy of single-dose Ad26.COV2.S vaccine against covid-19. New Engl J Med. 2021;384:2187–2201.
8. Baden LR, El Sahly HM, Essink B, et al. Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine. New Engl J Med. 2021;384:403–416.
9. Polack FP, Thomas SJ, Kitchin N, et al. Safety and efficacy of the BNT162b2 mRNA covid-19 vaccine. New Engl J Med. 2020;383:2603–2615.
10. R Core Team. R: A Language and Environment for Statistical Computing. 4.1.0 Ed. Vienna, Austria: R Foundation for Statistical Computing; 2021.
11. Tanveer S, Rowhani-Farid A, Hong K, et al. Transparency of COVID-19 vaccine trials: decisions without data. BMJ Evid Based Med. 2021. doi: 10.1136/bmjebm-2021-111735
12. Van Tulleken C. Covid-19: Sputnik vaccine rockets, thanks to Lancet boost. BMJ. 2021;373:n1108. doi: 10.1136/bmj.n1108