“Does this mean we can use P-values now?”
That question – perhaps tongue-in-cheek – is one we have heard frequently as we assume editorial responsibility for Epidemiology. For those not familiar with the history of this journal, the use of P-values was strongly discouraged by Rothman, the journal’s founder and previous editor. 1 Given the rampant abuse of P-values in the epidemiologic literature and elsewhere, it is our belief that Rothman’s stringent policy has performed a service to the field.
But is that policy the right policy today? To help explore this question, we invited three knowledgeable researchers to provide their perspectives. These commentaries (by Clarice Weinberg, Charlie Poole, and Steven Goodman) appear in this issue. 2–4 As commentaries should, they offer views that are opinionated, persuasive, and divergent.
Of all the tools of our trade, there is probably none more subject to abuse than the P-value. On this point, at least, our commentators agree. Readers with historical perspective will recognize that such criticism is not new. The origins of this discussion can be traced back to debates that pitted R.A. Fisher against J. Neyman and E. Pearson in the 1930s. 5 There is an entire website devoted to “326 Articles/Books Questioning the Indiscriminate Use of Statistical Hypothesis Tests in Observational Studies”. 6
From this avalanche of objections to P-values, we single out two. First, statistical significance tests are subject to facile misinterpretation, even to the point of producing incorrect conclusions. Second, P-values are used to cull results in a mechanical fashion when more nuanced thinking is required. Even as we prepare this editorial, we find examples of P-value abuse in the current issues of respected journals. Epidemiologists – and epidemiologic journals – cannot afford to relax their guard.
At the same time, there may be settings in which P-values are informative. Weinberg offers some compelling examples. We do not disagree, and we are willing to consider such applications.
The mention of Bayesian principles by all three writers deserves comment. Goodman makes the case that experienced epidemiologists are naturally Bayesian in their practice of placing results within the broad context of existing evidence. Explicit Bayesian analysis may offer a systematic means to accomplish this. We plan to explore the epidemiologic applications of Bayesian methods in future issues of Epidemiology, and we would welcome papers that attempt to add a Bayesian dimension to their interpretation.
Does all this mean a change in Epidemiology’s policy on P-values? It may be no more than a change in perception. We will not ban P-values. But neither did Rothman. He called for caution, and we do the same. The question is not whether the P-value is intrinsically bad, but whether it too easily substitutes for the thoughtful integration of evidence and reasoning. Given the P-value’s blighted history, researchers who would employ the P-value take on a particularly heavy burden to do so wisely.
1. Lang JM, Rothman KJ, Cann CI. That confounded P-value. Epidemiology 1998; 9: 7–8.
2. Weinberg CR. It’s time to rehabilitate the P-value. Epidemiology 2001; 12: 288–290.
3. Poole C. Low P-values or narrow confidence intervals: which are more durable? Epidemiology 2001; 12: 291–294.
4. Goodman SN. Of P-values and Bayes: A modest proposal. Epidemiology 2001; 12: 295–297.
5. Goodman SN. P-values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate. Am J Epidemiol 1993; 137: 485–496.