In this issue of Epidemiology, Press and Pharoah1 have reanalyzed Janet Elizabeth Lane-Claypon's 1926 case–control study of breast cancer.2 My commentary situates this landmark publication in the history of epidemiologic methods by addressing 3 questions: Is the 1926 study the “first” case–control study known to us? How does the study compare with other epidemiologic studies conducted between 1900 and 1945? What has changed regarding the way epidemiologists conceptualize case–control studies since the 1926 study?
DID LANE-CLAYPON CONDUCT THE “FIRST” CASE–CONTROL STUDY?
Most group comparisons conducted prior to 1926 lacked clear directionality.3 (p. 67) Take Pierre Louis's case series of patients admitted for pneumonia in a Paris hospital.4 The association between survival from pneumonia and time of first bloodletting after disease onset can be assessed equivalently by comparing the patients bled early versus those bled late (prospective contrast), or by comparing the patients who died versus those who survived (retrospective contrast). Paneth et al5 view the Reverend Whitehead's investigation of the cause of the 1854 Broad Street cholera outbreak6 as the first case–control study because it was originally based on a retrospective contrast. After having found that only 17% of Broad Street resident survivors had drunk water from the pump, Whitehead queried the survivors about the water-drinking habits of their deceased relatives: 78% had drunk water from the pump. Whitehead, however, also reported the prospective contrast: the proportion of deaths among the pump water drinkers was 44% versus 5% among the nondrinkers.6 The directionality of Snow's 1854 investigation of the cholera outbreak in South London neighborhoods is also fuzzy.7 Snow's investigation consisted of measuring the odds of being a client of the Southwark and Vauxhall company versus the Lambeth company, among those who died of cholera. Snow did not know the total number of clients supplied by each of the 2 companies in the districts in which he had enumerated and located the cholera deaths; thus, he had no denominators with which to compute risks. He was able, however, to approximate the odds of exposure among those who did not die, using the total number of London houses supplied by the 2 companies in 1853 (40,046 and 26,107, respectively).7,8 While the direction of the contrast was retrospective, Snow reported ratios of cholera deaths to houses for the clients of the 2 water companies, as if the contrast had been prospective.
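Whitehead's two contrasts can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: it uses the four percentages quoted above, and the ratios it prints are computed here for comparison, not figures reported in the text. The variable and function names are hypothetical.

```python
# Whitehead's Broad Street percentages as quoted in the text.
exposed_among_deceased = 0.78   # deceased residents who had drunk pump water
exposed_among_survivors = 0.17  # surviving residents who had drunk pump water
risk_in_drinkers = 0.44         # proportion of deaths among pump-water drinkers
risk_in_nondrinkers = 0.05      # proportion of deaths among nondrinkers


def odds(p: float) -> float:
    """Convert a proportion into odds."""
    return p / (1.0 - p)


# Retrospective contrast: exposure odds among the deceased vs the survivors.
exposure_odds_ratio = odds(exposed_among_deceased) / odds(exposed_among_survivors)

# Prospective contrast: risk of death among drinkers vs nondrinkers.
risk_ratio = risk_in_drinkers / risk_in_nondrinkers

print(f"retrospective contrast, exposure odds ratio: {exposure_odds_ratio:.1f}")
print(f"prospective contrast, risk ratio: {risk_ratio:.1f}")
```

Both contrasts describe the same association from opposite directions, which is why an investigation that reports both has no single, definite directionality.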
A firmer directionality appears in Broders's 1920 comparison of 537 cases of squamous-cell epithelioma of the lip and 500 “men without [lip] epithelioma.”9 Broders did not mention the control source. Cases (average age, 57 years) smoked more pipes while the much younger non-cases (average age, 36 years) smoked more cigarettes.
The 1926 study by Lane-Claypon was methodologically far more sophisticated than Broders's study. The objective was to compare “a sufficient and suitable series of cases of cancer of the breast” with controls defined as “women whose conditions of life were broadly comparable to those of the cancer series but who had no sign of cancer.”2 (p. v) The case and control groups were fixed by design, and compared with respect to “specified antecedent conditions.” All controls were patients with no history of cancer, recruited from the same hospitals as the cases, but were not defined as persons who had to be healthy. Indeed, the report says, “the control patients were not healthy women.”2 (p. 26) At the end of Part I of the report,2 (pp. 6–18) which was dedicated to “detailed analyses to be quite certain that the women in the control series were in fact controls to the women of the cancer series,”2 (p. 5) Lane-Claypon concluded that the controls could “be regarded as suitable for comparison” as they had similar nationality, age, civil state, occupation, and child mortality.2 (p. 18) Table 2 of Press and Pharoah's paper1 reproduces these data.
Thus, despite the limitations mentioned by Press and Pharoah,1 including the fact that it used prevalent cases, the 1926 study is apparently the first case–control study intentionally designed with a definite directionality.
HOW DOES LANE-CLAYPON'S WORK COMPARE WITH EPIDEMIOLOGIC WORK FROM HER CONTEMPORARIES?
The 1926 study conceals fascinating analytic subtleties. With the help of Greenwood,10 Lane-Claypon used a kind of exposure-propensity equation to check whether the association between breast cancer and lower parity still held after accounting for the case–control differences in age and duration of marriage (Table 1). The rationale was to estimate the expected parity of cases had they had the same marital experience as controls, and vice versa, to estimate the expected parity of controls had they had the same marital experience as cases. The absolute difference between the observed and expected number of children was |3.48 − 4.72| = 1.24 (±0.18) for cases, and |5.34 − 3.89| = 1.45 (±0.18) for controls. Cases thus had fewer children than expected, and controls more. These contrasts were deemed “highly significant, leaving no doubt at all that the fertility of Cancer series [was] really less than that of the Control series.”2 (p. 46)
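The size of these contrasts relative to their stated margins can be gauged with a short calculation. This is a rough sketch under stated assumptions: the text does not say whether the “±0.18” is a standard error or a probable error, so the ratio below is only an informal indication of why the differences were judged “highly significant.” The function name `summarize` is hypothetical.

```python
def summarize(observed: float, expected: float, margin: float) -> tuple[float, float]:
    """Return (observed - expected, |difference| / margin)."""
    difference = observed - expected
    return difference, abs(difference) / margin


# Mean numbers of children as quoted in the text.
series = {
    "cases": (3.48, 4.72),     # (observed, expected)
    "controls": (5.34, 3.89),  # (observed, expected)
}
MARGIN = 0.18  # the "±" attached to each difference; its exact nature is an assumption

for name, (observed, expected) in series.items():
    difference, ratio = summarize(observed, expected, MARGIN)
    print(f"{name}: observed - expected = {difference:+.2f}, "
          f"|difference| / margin = {ratio:.1f}")
```

On either reading of the margin, each difference is several times larger than its stated uncertainty.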
Lane-Claypon's attempt to get sensible results from the analyses of “antecedent breast troubles,” such as trauma, is also remarkable. Instead of comparing the 508 cases and 509 controls, she compared the breasts that became cancerous (n = 508) with all the other breasts in the study (n = 1526, that is, the 2 × 1017 = 2034 breasts of all participants minus the 508 cancerous ones) (Table 2). Lane-Claypon also explicitly recognized that case–control differences in “antecedent breast troubles” might be spurious for psychologic reasons, ie, because of recall bias (Table 3).
Most case–control studies subsequent to the 1926 study but prior to 1945 demonstrated a growing understanding of the strengths and pitfalls of case–control studies. In 1928, Lombard and Doering matched each of their cancer records with the record of an “individual without cancer, of the same sex and approximately the same age.”11 The 1931 study by Wainwright,12 reanalyzed by Press and Pharoah,1 replicated the 1926 study. In 1933, Stocks and Karn filled 2 full pages of the Annals of Eugenics, describing their matched case and control selection, stating in particular that “the control series is not a random sample, but is made to conform to the case series, which is a random sample of a cancerous population.”13 (p. 239)
The German studies on smoking and lung cancer by Mueller14,15 (1938) and by Schairer and Schoeniger16,17 (1943), conceived and published at the height of Nazi control over German science, are exceptions to this enrichment of the design of case–control studies following the 1926 breast-cancer study. Mueller described 86 cases of lung cancer at length, some deceased and some alive, only to say that they were compared with “the same number of healthy men of the same age.”14 Schairer and Schoeniger compared 93 deceased cases of lung cancer, 226 deceased cases of other cancers, and 270 “controls,” all men.16,17 They mentioned only that the controls were men from Jena, “aged 53 and 54 years,” and that 270 of the 700 invited returned satisfactorily filled questionnaires. These studies had designs far below the standard of their time,13 were mediocre compared with older case–control studies,2,11,12 and marked a regression in quality and rigor relative to pre-Nazi German epidemiology.18,19
Thus, the 1926 study by Lane-Claypon truly appears to have opened a new phase in the history of epidemiology, during which case–control studies emerged and were refined as a specific study design.
HOW DOES THE LANE-CLAYPON STUDY COMPARE WITH MODERN CASE–CONTROL STUDIES?
The studies described above were strictly meant to compare the proportion of exposed cases and controls. This conceptualization probably lingered after 1945,20–22 but the design and appreciation of case–control studies entered a new phase with a 1950 publication by Doll and Hill,23 followed by the theoretical work of Cornfield.24–26 Case–control studies began to be viewed as short-cut designs to estimate risk ratios. In Doll and Hill's 1950 publication, the comparison of smoking histories was amazingly unimpressive: 99.7% of cases versus 95.8% of controls had smoked. Doll and Hill did not compute the odds ratio [(.997/.003)/(.958/.042) ≈ 14], but, extrapolating their control data to the population of Greater London, they transformed the exposure odds into age-and-smoking-specific “risks” and risk ratios. For example, Doll and Hill23 estimated a “relative risk” of lung cancer of 19 for smoking 5–14 cigarettes per day compared with not smoking.
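The contrast between the two ways of reading the Doll and Hill data can be reproduced from the two percentages quoted above. The sketch below is illustrative only; the function names are hypothetical, and because the inputs are rounded percentages the result differs slightly from the rounded figure of 14 given in the text.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds."""
    return p / (1.0 - p)


def exposure_odds_ratio(p_cases: float, p_controls: float) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return odds(p_cases) / odds(p_controls)


# Proportions of smokers as quoted in the text.
smokers_among_cases = 0.997
smokers_among_controls = 0.958

# Pre-1950 reading: a simple comparison of exposure proportions.
difference_in_points = (smokers_among_cases - smokers_among_controls) * 100

# Later reading: the same data expressed as an odds ratio.
or_smoking = exposure_odds_ratio(smokers_among_cases, smokers_among_controls)

print(f"difference in proportions: {difference_in_points:.1f} percentage points")
print(f"exposure odds ratio: {or_smoking:.1f}")
```

A difference of about 4 percentage points looks negligible, whereas the same data yield an odds ratio well above 10; the small gap between the computed value and the 14 in the text is plausibly due to rounding of the percentages.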
In this sense, the perspective underlying the design of the 1926 UK study was not modern. It was meant to provide the proportion of cases and controls possessing a characteristic, and not to estimate risk ratios.
JANET LANE-CLAYPON'S PLACE IN THE HISTORY OF EPIDEMIOLOGIC METHODS
In 1912, Lane-Claypon published the results of a retrospective cohort study of breast-feeding and child growth conducted in Berlin.27,28 She was therefore fully aware of the palette of possible epidemiologic designs when she conceived her 1926 study. As such, Lane-Claypon epitomizes the transition between phases in the evolution of epidemiologic methods and concepts.29 “Preformal epidemiology,” an older phase in which there was no formal epidemiologic theory, and epidemiologic designs had weak-to-no directionality,6,7,9 led to a new phase, “early epidemiology,” in which cohort and case–control studies became distinct designs,2,11,13,19,27,30,31 and the elaboration of a theory of confounding and bias began.32
ABOUT THE AUTHOR
ALFREDO MORABIA is a Professor of Epidemiology at the City University of New York and Columbia University. He edited “A History of Epidemiologic Methods and Concepts” (Birkhäuser, 2004) and is an editor at the James Lind Library (www.jameslindlibrary.org), dedicated to the evolution of fair tests of treatments, and at the People's Epidemiology Library (www.epidemiology.ch/history/betaversion.htm), which offers a repository of documents about the history of epidemiology.
Most papers cited in this article can be downloaded (for research and teaching purposes only) from the People's Epidemiology Library (www.epidemiology.ch/history/betaversion.htm). I am grateful to Michael C. Costanza and Jan P. Vandenbroucke for comments on an earlier version of the commentary.
1.Press DJ, Pharoah P. Risk factors for breast cancer: A re-analysis of two case-control studies from 1926 and 1931. Epidemiology
2.Lane-Claypon J. A further report on cancer of the breast: reports on public health and medical subjects. Ministry of Health. London: His Majesty's Stationery Office; 1926;32:1–189.
3.Porta M. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press; 2008.
4.Morabia A. Pierre-Charles-Alexandre Louis and the evaluation of bloodletting. [The James Lind Library]. 2004. Available at: www.jameslindlibrary.org
5.Paneth N, Susser E, Susser M. Origin and early development of the case-control study. In: Morabia A, ed. History of Epidemiologic Methods and Concepts. Basel: Birkhäuser; 2004:291–312.
7.Snow J. On the Mode of Communication of Cholera. 2nd ed. London: Churchill; 1855.
8.Vinten-Johansen P, Brody H, Paneth N, Rachman S, Rip MR. Cholera, Chloroform and the Science of Medicine: A Life of John Snow. Oxford: Oxford University Press; 2003.
9.Broders AC. Squamous cell epithelioma of the lip: A study of five hundred and thirty seven cases. J Am Med Assoc
11.Lombard HL, Doering CR. Cancer studies in Massachusetts. 2. Habits, characteristics and environment of individuals with and without cancer. N Engl J Med
12.Wainwright J. A comparison of conditions associated with breast cancer in Great Britain and America. Am J Cancer
13.Stocks P, Karn M. A co-operative study of the habits, home life, dietary and family histories of 450 cancer patients and of an equal number of control patients. Ann Eugenics
14.Mueller F. Tabakmissbrauch und Lungenkarcinoma. Z Krebsforsch
15.Mueller F. Abuse of tobacco and carcinoma of the lung. JAMA
16.Schairer E, Schoeniger E. Lung cancer and tobacco consumption. Int J Epidemiol
17.Schairer E, Schoeniger E. Lungenkrebs und Tabakverbrauch. Z Krebsforsch
18.Morabia A, Guthold R. Wilhelm Weinberg's 1913 Large Retrospective Cohort Study: A rediscovery. Am J Epidemiol
19.Weinberg W. Die Kinder der Tuberkulosen. Leipzig, Germany: S Hirzel; 1913.
20.Wynder E, Graham EA. Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma: A study of six hundred and eighty four proved cases. J Am Med Assoc
21.Levin ML, Goldstein H, Gerhardt PR. Cancer and tobacco smoking: A preliminary report. J Am Med Assoc
22.Schrek R, Baker LA, Ballard GP, Dolgoff S. Tobacco smoking as an etiologic factor in disease; Cancer. Cancer Res
23.Doll R, Hill AB. Smoking and carcinoma of the lung; preliminary report. Br Med J
24.Cornfield J. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. J Natl Cancer Inst
25.Schneiderman M, Greenhouse S. Jerome Cornfield, 1912–1979. J Natl Cancer Inst
26.Cornfield J, Haenszel W. Some aspects of retrospective studies. J Chronic Dis
27.Lane-Claypon J. Report to the local government board upon the available data in regard to the value of boiled milk as a food for infants and young animals No 63. Ministry of Health. London: His Majesty's Stationery Office; 1912;63:1–60.
28.Winkelstein W Jr. Vignettes of the history of epidemiology: Three firsts by Janet Elizabeth Lane-Claypon. Am J Epidemiol
29.Morabia A. Epidemiology: An epistemological perspective. In: Morabia A, ed. History of Epidemiologic Methods and Concepts. Basel, Switzerland: Birkhäuser; 2004:1–126.
30.Goldberger J, Wheeler GA, Sydenstricker E. A study of the relation of family income and other economic factors to pellagra incidence in seven cotton-mill villages of South Carolina in 1916. Public Health Rep
31.Frost WH. Risk of persons in familial contact with tuberculosis. Am J Public Health
32.Morabia A. History of the modern concept of epidemiologic confounding. J Epidemiol Community Health.