The Editors' Notepad

The goal of this blog is to help EPIDEMIOLOGY authors produce papers that clearly and effectively communicate their science.

Thursday, May 20, 2010

On Odds Ratios and Relative Risk: Why Do We Still Discuss a Problem That Was Solved 30 Years Ago?

The January 2010 issue of EPIDEMIOLOGY contained a spate of papers about our trade’s fixation with relative measures of risk – among them, the odds ratio in case-control studies. This continuing discussion remains a source of profound wonder. Our fixation with the odds ratio in case-control studies has its origin in Cornfield’s 1951 paper [1], in which he proposed the “rare disease assumption” to turn the odds ratio into a relative risk. For all intents and purposes, this approach should have been buried 30 years ago, after the publication of Miettinen’s “Estimation and Estimability in Case-Referent Studies” [1].

The quantity that we calculate from case-control studies was not always known as the odds ratio; neither was the “rare disease assumption” inevitable. In the discussion section of their 1950 case-control study on smoking and lung cancer, Doll and Hill wrote [1] – a year before Cornfield:

“If it can be assumed that the patients without carcinoma of the lung who lived in Greater London at the time of their interview are typical of the inhabitants of Greater London in regard to their smoking habits, then the number of people in London smoking different amounts of tobacco can be estimated. Ratios can be obtained between the number of patients seen with carcinoma of the lung and the populations at risk who have smoked comparable amounts of tobacco.”

These ratios are not actual risks, they wrote, but are proportional to those risks. Upon dividing these ratios, Doll and Hill presented relative risks as a direct estimate, without the rare disease assumption. Just imagine if Cornfield had had his flash of insight a few years later. Doll and Hill’s approach might have become the landmark example. Then, to teach and explain case-control studies, we might have used “density sampling” as the most natural thing in the world. In “density sampling,” the ratio of exposed vs. unexposed persons in the control group stands for the ratio of exposed vs. unexposed person-years from which the cases emerge (and not for the proportion of exposed or unexposed persons who remain free of disease at the end of a fixed follow-up). Epidemiology could have developed without the rare disease assumption. This is not a mere flight of fancy: the quantity that was described in words by Doll and Hill is called a “pseudo-rate” in the 2009 edition of Modern Epidemiology by Rothman, Greenland and Lash, with a similar rate ratio calculation as the basis for understanding case-control studies (page 113).
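The arithmetic behind density sampling can be verified directly. A minimal sketch in Python (with invented numbers for person-time and incidence rates, not taken from Doll and Hill or any real study) shows that when controls are drawn in proportion to person-time in an open population, the exposure odds ratio equals the incidence rate ratio exactly, even for a disease that is not rare:

```python
# Hypothetical illustration of density sampling; all numbers are invented.
# The disease here is deliberately NOT rare, yet the odds ratio still
# equals the rate ratio, with no rare-disease assumption anywhere.

# Person-years at risk in an open (dynamic) population
pt_exposed = 10_000.0
pt_unexposed = 40_000.0

# True incidence rates (cases per person-year) -- a common disease
rate_exposed = 0.02
rate_unexposed = 0.005
true_rate_ratio = rate_exposed / rate_unexposed  # 4.0

# Cases emerge from the person-time in each group
cases_exposed = rate_exposed * pt_exposed        # 200 cases
cases_unexposed = rate_unexposed * pt_unexposed  # 200 cases

# Density sampling: controls are drawn in proportion to person-time,
# not from the disease-free survivors at the end of a fixed follow-up
n_controls = 400
controls_exposed = n_controls * pt_exposed / (pt_exposed + pt_unexposed)
controls_unexposed = n_controls - controls_exposed

# Exposure odds ratio computed from the case-control data alone
odds_ratio = (cases_exposed / cases_unexposed) / (
    controls_exposed / controls_unexposed
)

print(odds_ratio, true_rate_ratio)  # both print 4.0
```

The algebra behind the sketch is the same as Doll and Hill’s verbal reasoning: the control series stands in for the person-time denominators, so dividing the two case-to-control ratios recovers the rate ratio directly.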

Since the 1980s, the different meanings of the odds ratio in diverse sampling situations have been refined in numerous papers and textbooks. Yet this has failed to influence the practice of epidemiology. In a recent overview, we found that authors overwhelmingly prefer the term “odds ratio” in case-control studies, without any further interpretation; the few times that authors try to explain what their odds ratios mean, they make errors [2]. More surprisingly, a large number of textbooks still use Cornfield’s 1951 teaching of the rare-disease assumption as if nothing has happened over the past half century (list of textbooks and detailed results in [2]). In a personal discussion, one textbook author defended this practice, saying he felt that the rare-disease assumption was still the easiest way to explain case-control studies: any audience would immediately grasp it, without further background knowledge.

Granted, Cornfield’s 1951 paper was an enormous leap forward, and should be credited for taking the first step in statistically formalizing case-control studies and making them credible. Doll and Hill never formalized what they did. Yet why do we still use Cornfield’s teaching to explain case-control studies? A colleague who took the Distance Learning Programme at the London School of Hygiene and Tropical Medicine was told to use the long-antiquated reasoning to pass the basic course exam; the insights developed over the past three decades would count as wrong answers. Granted again, comprehending why an odds ratio can be a rate ratio, a risk ratio, or a prevalence odds ratio (or any other terminology that you fancy) requires some basic knowledge about open vs. closed cohorts (dynamic vs. fixed populations, if you prefer those terms).

My diagnosis is that we persevere with our odds-ratio fixation in case-control studies because of deficient teaching in our basic epidemiology courses. It must be possible to set up teaching modules that explain incidence rates in open populations without needing much more than high-school algebra. If such teaching exists in some “Epidemiology 101,” please tell the epidemiologic community, because it makes the bridge to “density sampling” immediate and natural. Density sampling in an open population is the basis for understanding case-control studies. Further refinements about the rarely applied nested or case-cohort designs in closed cohorts [2] can be left to advanced courses. Thus, we might once and for all replace the odds ratio and rare-disease assumption with knowledge that has already existed for more than 30 years!

If you would like to comment, email me directly at or submit your comment via the journal, which requires a password-protected login.


[1] PDFs of all the historical papers can be found on the website of the People’s Epidemiologic Library, a new site dedicated to the history of epidemiologic methods. See:

[2] Knol MJ, Vandenbroucke JP, Scott P, Egger M. What do case-control studies estimate? Survey of methods and assumptions in published case-control research. Am J Epidemiol 2008;168:1073-81.

© 2010 Jan P Vandenbroucke