I have been involved in research for decades in one capacity or another. After a short laboratory career I mainly worked in clinical research, writing protocols, studying the data, and making analytic sense of the results. At its best, clinical research provides a step ahead in the process of devising better therapy or less toxic therapy for patients. Unfortunately, published clinical research often is not at its best. Below are some of the flaws that my colleagues and I still observe in clinical trials:
Bias caused by a failure to ensure that patients in the comparative arms of a study are closely similar, because potentially important differences are ignored, such as a difference in prior treatment, the presence of a non-oncologic disease such as cardiac dysfunction, or low tolerance to chemotherapy. If the protocol does not account for these differences, the physician may be forced, for good reasons, to declare a patient ineligible, but this too can compromise the objectivity of the study.
Poorly designed studies that are clearly unlikely to lead to a firm, defensible, important conclusion are the bane of clinical trials. I skim the abstracts at meetings and typically put them in three categories: (1) Sounds interesting, will try to attend; (2) The design seems faulty, have no interest; and (3) Who cares? The question is trivial.
Group 3 includes a large number of the abstracts, I am sad to say.
Bias during a Study
There are many potential sources of bias during a study, often unintentional, such as failing to record some events or tests accurately, or allowing “minor” deviations in dosages or schedules that the protocol does not recognize as valid. Occasional variations probably have little effect on the outcome, but if these exceptions become common, they can affect the results.
Cooperative group studies that involve many doctors distributed over the entire country can also lead to unintentional bias due to different interpretations of the protocol or different practice styles. Also, doctors often have more faith in one treatment than in another, and this may lead them to do normally inconsequential things in managing the patient that favor the familiar and trusted therapy.
Marginal Significance with Large Numbers
If one is wedded to the p value as the arbiter of success or failure of a treatment, a very large population of patients in the study can make the p value “significant” even though the clinical impact of the therapy is negligible.
Here is an example: At an ASCO meeting, as at many meetings, the plenary session is a platform for the “most significant” studies submitted for presentation. I attended such a session years ago, where a study of lung cancer treatment with a very large number of patients enrolled was presented. The new treatment being tested was shown to be significantly better, with a p value of 0.05. However, the duration of survival was only a few weeks longer than with the control therapy. I don't know about you, but living two or three weeks longer receiving aggressive chemotherapy would not be a selling point I would like to offer a patient.
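The arithmetic behind this can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the half-month survival difference and the 6-month standard deviation are hypothetical numbers, not data from the trial described above, and the helper `two_sided_p` is my own. It uses a simple two-sample z-test on mean survival, whereas a real trial would more likely use a log-rank test on censored survival data.

```python
import math


def two_sided_p(mean_diff, sd, n_per_arm):
    """Two-sided p value from a two-sample z-test with equal arms and a known SD."""
    # Standard error of the difference between two independent arm means.
    standard_error = sd * math.sqrt(2.0 / n_per_arm)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical values chosen for illustration only.
MEAN_DIFF_MONTHS = 0.5  # about two weeks of extra survival
SD_MONTHS = 6.0         # assumed spread of survival times within each arm

# The benefit is held fixed; only the enrollment changes.
for n in (50, 500, 2000, 5000):
    p = two_sided_p(MEAN_DIFF_MONTHS, SD_MONTHS, n)
    print(f"patients per arm = {n:>5}: p = {p:.4f}")
```

With the benefit held fixed at about two weeks, the p value falls steadily as enrollment grows and eventually drops below 0.05, although nothing about the benefit to the patient has changed.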
The bottom line: Common sense and clinical significance can trump the p value.
Lack of Verification
But the biggest problem in clinical research is the lack of verification of results by an independent research team. A majority of abstracts submitted for presentation at ASCO and other meetings are never verified. Either others could not replicate the results or a verifying study was never done, so we are left with a result that, in the strict scientific sense, has never been confirmed through replication by an independent group. This problem is apparently not limited to oncology or ASCO or even clinical research.
In fact, a Stanford investigator, John Ioannidis, has collected evidence from medical studies that supports this belief, beginning with a groundbreaking 2005 paper: “Why Most Published Research Findings Are False” (PLOS Medicine 2005; DOI: 10.1371/journal.pmed.0020124). The paper has been controversial, but it is now one of the most cited papers addressing the topic of scientific research.
I first became aware of this when reading a New York Times article earlier this year, “New Truths That Only One Can See,” in George Johnson's “Raw Data” science column (1/20/14). This led me to David Freedman's article in The Atlantic (Nov. 2010) describing the evolution of Ioannidis's analyses of many studies. Both Johnson and Freedman present the issues and outcomes in a clear way that is accessible to the average reader, which is helpful since some of Ioannidis's text is complex and rather “nerdy.”
Freedman wrote that Ioannidis was shocked at the range and reach of the reversals of results he was seeing in everyday medical research. “Randomized controlled trials,” which compare how one group responds to a treatment against how an “identical” group fares without the treatment, had long been considered nearly unshakable evidence, but they, too, ended up being wrong some of the time.
“I realized that even our gold standard research had a lot of problems,” he says. Baffled, he started looking for the specific ways in which studies were going wrong. And before long he discovered that the range of errors being committed was astonishing: from what questions researchers posed, to how they set up the studies, to which patients they recruited for the studies, to which measurements they took, to how they analyzed the data, to how they presented their results, and to how particular studies came to be published in medical journals.
This array suggested a bigger, underlying dysfunction, and Ioannidis thought he knew what it was. “The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.”
Researchers headed into their studies wanting certain results—and, lo and behold, they were getting them. We think of the scientific process as being objective, rigorous, and even ruthless in separating out what is true from what we merely wish to be true, but in fact it's easy to manipulate results, even unintentionally or unconsciously.
“At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” Ioannidis says. “There is [often] an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.”
Here is another passage from Freedman's article, written after he visited Ioannidis and his team and sat in on several of their review meetings to understand how Ioannidis reached his conclusions:
“Indeed, given the breadth of the potential problems raised at the meetings, can any medical-research studies be trusted?
“That question has been central to Ioannidis's career,” Freedman continues. “Ioannidis is what's known as a meta-researcher, and he's become one of the world's foremost experts on the credibility of medical research. He and his team have shown, again and again, and in many different ways, that much of what biomedical researchers conclude in published studies—conclusions that doctors keep in mind when they prescribe antibiotics or blood-pressure medication, or when they advise us to consume more fiber or less meat, or when they recommend surgery for heart disease or back pain—is misleading, exaggerated, and often flat-out wrong.
“He charges that as much as 90 percent of the published medical information that doctors rely on is flawed.
“His work has been widely accepted by the medical community; it has been published in the field's top journals, where it is heavily cited; and he is a big draw at conferences,” Freedman writes. “Yet for all his influence, he worries that the field of medical research is so pervasively flawed, and so riddled with conflicts of interest, that it might be chronically resistant to change—or even to publicly admitting that there's a problem.”
Johnson pointed out in his column that concern has arisen in some high places—e.g., the journal Nature has assembled an archive, filled with reports and analyses, called “Challenges in Irreproducible Research.”
C. Glenn Begley wrote one of the papers. While he was at Amgen, he and his colleagues could not replicate 47 of 53 landmark papers about cancer. Some of the results could not be reproduced even with the help of the original scientists working in their own lab.
Could things be as bad as Ioannidis believes? I don't know. But if he is even half right, this is a big deal and those of us in clinical cancer research should feel responsible for the research we publish and act to determine if and when we have failed to produce reliable, reproducible and important data.