Pediatric oncologists have long had to contend with small sample sizes in many of their studies because of the relative rarity of childhood tumors. With the rapid development of molecular micro-dissection of tumors, those treating adults with cancer increasingly face the same problem as the molecular variants multiply. Both groups have found ways to cope, mainly by conducting trials within traditional cooperative groups. That approach, however, poses challenges: obtaining molecular analyses in a timely, reproducible manner; moving studies through the clinical trials process in a reasonable period of time; and avoiding the “dumbing down” of scientific imagination and innovation that too often afflicts large, lumbering committees.
Another approach is to form ad hoc cooperative groups, or consortiums, focused on a single tumor type. Consortium members typically come from institutions that attract more than the average number of patients in that category, and most have already been actively researching the tumor in question.
In pediatric oncology, these smaller but more efficient and nimble groups (which might be called federations, since the parent cooperative group is still engaged) have become increasingly popular.
This formation of ad hoc consortiums has also been used in the study of some adult cancers. An example is the series of classic studies of imatinib in the treatment of CML, which involved a relatively small group of institutions from across the U.S. and internationally that were capable of identifying the BCR-ABL fusion in their patients and agreed to a disciplined and imaginative scientific approach that moved very quickly.
But none of these solutions is perfect. The success in curing childhood ALL is one of the great stories of the evolution of cancer therapy, and the number of patients available was, and is, larger than for any other pediatric cancer. But investigators are now caught in a double bind. For the 80 to 90 percent of patients who are cured, a radical change in therapy is hazardous. Modest changes have been made successfully, but now or in the near future that model is likely to yield diminishing returns. And while successful at curing patients, ALL therapy is quite toxic, very expensive, and fraught with potential long-term side effects.
More to the point, the childhood ALL field is left with the 10 to 15 percent of patients who are not cured by initial therapy. That small number is not suited to traditional Phase 3 clinical trials. There are clues to why treatment fails for these patients, but no single factor appears responsible. Thus, an already small group of patients resistant to initial therapy is subdivided into several even smaller groups.
So the problem of small sample size for cancer clinical trials remains and is likely to grow in both pediatric and adult cancers.
A Different Method
Which brings me to a different method of dealing with the problem, which became clearer to me in a recent conversation with Richard Sposto, a veteran biostatistician at the USC-Norris Comprehensive Cancer Center. I will need to back up a bit to show how this has evolved.
The traditional approach to Phase 3 randomized cancer trials is to enroll enough patients to detect moderate but clinically important differences between treatments. Traditionally, this means setting the type 1 error at the well-known p < 0.05 and the power at 80 percent. This became the gold standard for detecting a “clinically relevant” difference. But this approach does not make sense for Phase 3 trials in rare diseases or in small subsets of more common diseases.
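To make the arithmetic concrete, here is a minimal sketch of the standard two-proportion sample-size calculation behind that gold standard. The cure rates (70 versus 80 percent) are illustrative assumptions of mine, not figures from any particular trial.

```python
import math
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to compare two cure rates
    (two-sided test, normal approximation, unpooled variance)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the type 1 error
    z_b = NormalDist().inv_cdf(power)          # critical value for the power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)

# A moderate but clinically important difference: 70 vs. 80 percent cure.
print(n_per_arm(0.70, 0.80))  # → 291 per arm, roughly 580 patients total
```

For a tumor diagnosed in only a few hundred patients a year, accruing nearly 600 of them to a single trial is plainly impractical, which is the heart of the problem.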
Sposto and Stram offered an answer for Phase 3 trials in their 1999 paper.1 They adapted an approach introduced by Strauss and Simon2 for Phase 2 studies. They showed (and I will tiptoe past the math, which is Greek to me) that “[for] an extended research program of a series of two-treatment randomized clinical trials in low incidence paediatric cancer, [it is] a better strategy to require less compelling evidence (e.g., p < 0.20) for adopting an experimental treatment than that required by the typical five percent type 1 error rate, and to perform trials smaller than those judged adequate under current statistical dogma.”
‘Extended’ and ‘Series’
The key words here are “extended” and “series.” While there is a greater likelihood of choosing an ineffective therapy, the authors conclude that the knowledge accumulated across trials on the potential efficacy of experimental treatments is more likely to produce gains in efficacious therapy than insistence on large trials and stringent evidence, and that their approach is the more appropriate one for long-term strategies.
In short, commitment to a long series of small randomized trials, over decades in some cases, can collectively provide useful information for going forward despite sample sizes too small for traditional measures of statistical significance. Sposto told me this was “putting the main focus on the long view instead of any single clinical trial.”
He emphasized that their approach applies only in a context where one has essentially all of the patients that could be had for a study—i.e., a large proportion of the entire target population (e.g., all the available patients with a certain genetic mutation)—and cannot be used to justify small, single-institution randomized trials when a large trial could be done in the same amount of time or more quickly.
Le Deley et al., in a 2012 publication, affirmed this approach.3 They reported the results of simulating a series of two-treatment superiority trials over 15 years under different design parameters, including the number of positive trials required to declare superiority, significance levels ranging from 2.5 to 50 percent, and the corresponding sample sizes. They found the following: “Expected trial benefits over the 15-year horizon were maximized when more (smaller) trials were conducted than recommended under traditional criteria, using the criterion of one positive trial (vs. two), and relaxing the [p-value] from 2.5 percent to 20 percent.”
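The logic of that finding can be caricatured in a few lines of code. What follows is a rough Monte Carlo sketch of the idea, not a reproduction of the authors' model: I assume a fixed accrual budget of 1,500 patients over the horizon, that each new experimental treatment is truly 10 points better with probability 0.30, truly worse with probability 0.20, and otherwise equivalent, and that a treatment is adopted as the new standard whenever a one-sided pooled z-test clears the chosen p threshold. Every one of these parameters is an illustrative assumption.

```python
import math
import random
from statistics import NormalDist

def run_trial(rng, p_std, p_exp, n):
    """One two-arm trial with n patients per arm; returns the z statistic
    for experimental minus standard (pooled two-proportion test)."""
    x_std = sum(rng.random() < p_std for _ in range(n))
    x_exp = sum(rng.random() < p_exp for _ in range(n))
    pooled = (x_std + x_exp) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n) or 1e-9  # guard against se = 0
    return ((x_exp - x_std) / n) / se

def simulate_series(n_per_arm, alpha, horizon=1500, p_start=0.70,
                    effect=0.10, p_better=0.30, p_worse=0.20,
                    n_sims=500, seed=7):
    """Average cure rate reached by the end of the accrual horizon when
    successive experimental treatments are adopted whenever the one-sided
    p-value falls below `alpha`. All parameters are illustrative."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    total = 0.0
    for _ in range(n_sims):
        p_std, budget = p_start, horizon
        while budget >= 2 * n_per_arm:
            budget -= 2 * n_per_arm
            u = rng.random()  # each new treatment: better, worse, or equivalent
            if u < p_better:
                p_exp = min(p_std + effect, 0.99)
            elif u < p_better + p_worse:
                p_exp = max(p_std - effect, 0.01)
            else:
                p_exp = p_std
            if run_trial(rng, p_std, p_exp, n_per_arm) > z_crit:
                p_std = p_exp  # adopt the experimental treatment
        total += p_std
    return total / n_sims

# Few large, stringent trials vs. many small, relaxed ones on the same budget.
big = simulate_series(n_per_arm=291, alpha=0.025)   # ~2 trials, two-sided p < 0.05
small = simulate_series(n_per_arm=100, alpha=0.20)  # ~7 trials, p < 0.20
print(f"large/strict: {big:.3f}  small/relaxed: {small:.3f}")
```

With these assumptions, the many-small-trials strategy typically ends the horizon with a higher cure rate than the few-large-trials strategy, echoing the paper's finding; changing the assumed mix of better and worse treatments changes the size of that gap.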
Although far from perfect and dependent on the type of disease under study (e.g., relatively responsive to treatment or not), the authors concluded that this approach was a worthy option in appropriate situations.
My simplistic conclusion is twofold: the shrinking number of patients available for trials, due to low incidence or proliferating molecular subsets, is a challenge to clinical trials; and the more empirical, “relaxed p level” approach can work well if one perseveres with continuous and detailed examination of outcomes across a long series of studies. This has worked well in pediatric oncology and should work in other cancer studies.