Observational Studies and Randomized Trials of Hormone Replacement Therapy: What Can We Learn From Them?

Whittemore, Alice S.1; McGuire, Valerie2


Author Information

From the Departments of 1Health Research and Policy and 2Epidemiology, Stanford University School of Medicine, Stanford, CA.

Address correspondence to: Alice Whittemore, Department of Health Research and Policy, Stanford University School of Medicine, Redwood Building, Room T204, Stanford, CA 94305–5405; alicesw@stanford.edu

This work was supported by NIH grant CA69417.

Editors’ note: An editorial and other invited commentaries appear on pages 2, 3 and 6.

The announcement last summer that the risks outweigh the benefits of postmenopausal hormone replacement therapy (HRT) took many by surprise. A large randomized trial by the National Institutes of Health–funded Women's Health Initiative (WHI) 1 found that Prempro, a combination of estrogen and progestin often prescribed to postmenopausal women, increases the risk of breast cancer, coronary heart disease (CHD), stroke and pulmonary embolism. The drug reduces risk for bone fractures and colon cancer, but not enough to outweigh the accompanying risks.

Why had this drug been prescribed so widely? Had women and their physicians been misled by the observational studies? Here we take a brief look at the findings of case-control and cohort studies on HRT in relation to breast cancer, CHD, stroke, bone fractures and colon cancer. We shall see that, with the notable exception of CHD, the observational studies fare well: they predicted the WHI findings for all of the other endpoints.

Observational Studies vs Trials: How Well Do They Agree?

Shortly after publication of the WHI findings, Nelson et al. 2 reviewed the evidence on benefits and risks of HRT for the primary prevention of breast cancer, CHD, thromboembolism, osteoporosis and colon cancer. For breast cancer, the data reviewed by these authors suggest that current estrogen users have a 20%–40% increased risk 3–5 and that the risk increases with duration of use. 3–7 However, Nelson and colleagues report that observational studies and several meta-analyses show no difference in risk between women with any prior estrogen use and never-users. 3–9

The review by Nelson et al. 2 also found statistically significant increases in stroke incidence (relative risk [RR] = 1.12; 95% confidence interval [CI] = 1.01–1.23), but not mortality, in ever-users of HRT compared with never-users. A combined analysis of 12 studies including three randomized clinical trials revealed a two-fold increase in risk for deep vein thrombosis and pulmonary embolism among current HRT users (RR = 2.1; CI = 1.6–2.8). 2 They also noted a nearly four-fold increase in risk within the first year of use (RR = 3.5; CI = 2.3–5.6).

Several observational studies reviewed by Nelson et al. 2 found statistically significant reductions in osteoporotic bone fracture risk associated with HRT use. Two cohort studies reported a 60% reduction in risk for wrist fractures 10,11 and a 40% reduction in risk for vertebral fractures 11 among current or ever-users of HRT compared with never-users. A combined analysis of six cohort studies showed a nonsignificant reduction in hip-fracture risk for current users of HRT (RR = 0.64; CI = 0.32–1.04) or ever-users of HRT (RR = 0.76; CI = 0.56–1.01) compared with never-users. 10–15

Current or ever-users of HRT are also at reduced risk for colon cancer compared with never-users, based on a meta-analysis of 18 observational studies. 16

The big difference between the observational studies and the WHI findings concerns the effects of HRT on heart disease. Based on their meta-analysis of 21 observational studies, Nelson et al. 2 noted reductions in CHD incidence (RR = 0.8; CI = 0.68–0.95) and mortality (RR = 0.62; CI = 0.40–0.90) among current users of HRT compared with never-users. However, estimated risk reductions did not achieve statistical significance in past users or ever-users of HRT. Moreover, statistically significant reductions were not seen when the authors analyzed only those studies that controlled for socioeconomic status (RR = 0.91; CI = 0.67–1.3). 17–20 The possibility of confounding was further suggested when Nelson et al. 2 restricted analysis to studies that controlled for alcohol consumption, physical activity and other CHD risk factors and again found no significant association between HRT and CHD. 18–21
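The distinction drawn in this section between significant and nonsignificant associations can be checked mechanically: a 95% interval for a relative risk is conventionally called significant when it excludes 1. The short Python sketch below applies that rule to the estimates quoted above. The function names and labels are ours, introduced only for illustration, and the standard-error calculation assumes each interval was constructed symmetrically on the log scale, which the text does not state.

```python
import math

# Relative risks and 95% confidence limits as quoted in this section.
REPORTED = [
    ("stroke incidence, ever-users", 1.12, 1.01, 1.23),
    ("DVT/PE, current users", 2.1, 1.6, 2.8),
    ("DVT/PE, first year of use", 3.5, 2.3, 5.6),
    ("hip fracture, current users", 0.64, 0.32, 1.04),
    ("hip fracture, ever-users", 0.76, 0.56, 1.01),
    ("CHD incidence, current users", 0.80, 0.68, 0.95),
    ("CHD mortality, current users", 0.62, 0.40, 0.90),
    ("CHD, studies controlling for SES", 0.91, 0.67, 1.30),
]


def excludes_null(lower, upper, null=1.0):
    """True when the interval lies entirely on one side of the null value."""
    return lower > null or upper < null


def log_scale_se(lower, upper, z=1.96):
    """Approximate SE of log(RR), ASSUMING a symmetric interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Labels of the estimates whose interval excludes RR = 1.
significant = [label for label, _, lo, hi in REPORTED if excludes_null(lo, hi)]

for label, rr, lo, hi in REPORTED:
    verdict = "significant" if label in significant else "not significant"
    print(f"{label}: RR = {rr} ({lo}-{hi}) -> {verdict}; "
          f"SE(log RR) ~ {log_scale_se(lo, hi):.3f}")
```

Because published limits are rounded, the recovered standard errors are rough guides only. They do make the contrast in precision visible: the stroke estimate is far more precise than the hip-fracture estimate for current users, even though the latter has the larger point estimate.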

Why the Discrepancies for CHD?

The most plausible explanation for the disagreement between the observational studies and the WHI findings for CHD is residual confounding in the observational studies. Compared with never-users, HRT users in virtually every observational study were better educated, leaner, more physically active, less likely to smoke, more health conscious and more likely to seek medical care. It is noteworthy that the protective effect of HRT did not achieve statistical significance when the meta-analysis of Nelson et al. 2 was restricted to those observational studies that controlled for socioeconomic status.

Lessons to Be Learned

The first lesson is that it is not yet time for epidemiologists to abandon their fieldwork and become trialists. The good agreement between the observational studies and the trial on endpoints other than CHD confirms the utility and validity of observational studies as monitors of new preventive agents. Moreover, randomized trials cannot evaluate the long-term effects of preventive agents; we will continue to need observational studies to provide this important information. Indeed, long-term observational follow-up of the WHI participants for delayed effects will provide essential data on the lifetime risks and benefits of HRT use among former users.

What can we learn from the CHD discrepancy? Two lessons come to mind. The first is that large sample sizes may be necessary, but they are not sufficient for accurate answers. Bigger need not be better. In the presence of unrecognized residual confounding, the precision in risk estimates gained by large sample sizes can be misleading: a narrow confidence interval can be far removed from the true measure of association. Indeed, the danger of impressively tiny P-values and of impressively narrow confidence intervals that lie far from the mark is a well-known pitfall of meta-analysis. Epidemiology is grounded in the principle of confirming findings by repeating them in different populations and different settings. Nevertheless, the same residual confounding may plague all of the data. Indeed, the consistency of the observational studies may even make it difficult to launch a costly randomized trial. During the planning of the WHI, for instance, the view that HRT prevents heart disease was so entrenched that some argued that it would be unethical to deny some women the drug and instead give them a placebo.
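The point about narrow intervals that lie far from the mark can be illustrated with a small simulation. The Python sketch below is a toy example under invented assumptions, not a model of any study discussed here: an unmeasured "health-conscious" trait makes women both more likely to take the drug and less likely to have the event, while the drug itself has no effect (true risk ratio of 1). All probabilities, names and the cohort size are hypothetical.

```python
import math
import random


def risk_ratio_ci(ev_exp, n_exp, ev_unexp, n_unexp, z=1.96):
    """Risk ratio with a Wald-type 95% interval computed on the log scale."""
    rr = (ev_exp / n_exp) / (ev_unexp / n_unexp)
    se = math.sqrt(1 / ev_exp - 1 / n_exp + 1 / ev_unexp - 1 / n_unexp)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


def simulate(n, seed=0):
    """Cohort in which treatment has NO effect but healthier women use it more."""
    rng = random.Random(seed)
    # counts[healthy][treated] = [number of events, number of women]
    counts = {h: {t: [0, 0] for t in (False, True)} for h in (False, True)}
    for _ in range(n):
        healthy = rng.random() < 0.5                         # hypothetical prevalence
        treated = rng.random() < (0.6 if healthy else 0.2)   # healthier women treated more
        event = rng.random() < (0.05 if healthy else 0.10)   # risk ignores treatment
        cell = counts[healthy][treated]
        cell[0] += event
        cell[1] += 1
    return counts


def crude(counts):
    """Risk ratio ignoring the confounder (strata pooled)."""
    ev_t = counts[False][True][0] + counts[True][True][0]
    n_t = counts[False][True][1] + counts[True][True][1]
    ev_u = counts[False][False][0] + counts[True][False][0]
    n_u = counts[False][False][1] + counts[True][False][1]
    return risk_ratio_ci(ev_t, n_t, ev_u, n_u)


def within_stratum(counts, healthy):
    """Risk ratio among women who share the same value of the confounder."""
    s = counts[healthy]
    return risk_ratio_ci(s[True][0], s[True][1], s[False][0], s[False][1])


counts = simulate(200_000)
crude_rr, crude_lo, crude_hi = crude(counts)
healthy_rr, healthy_lo, healthy_hi = within_stratum(counts, True)
other_rr, other_lo, other_hi = within_stratum(counts, False)

print(f"crude RR          = {crude_rr:.2f} ({crude_lo:.2f}-{crude_hi:.2f})")
print(f"RR, healthy women = {healthy_rr:.2f} ({healthy_lo:.2f}-{healthy_hi:.2f})")
print(f"RR, other women   = {other_rr:.2f} ({other_lo:.2f}-{other_hi:.2f})")
```

With these invented probabilities the expected crude risk ratio is about 0.75, and with 200,000 women its interval is tight and sits well below the true value of 1. Only the estimates within levels of the trait come back near 1, and that adjustment is unavailable when the trait is unmeasured. Enlarging the cohort narrows the crude interval further without moving it toward the truth.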

The second lesson to be learned from the CHD discrepancy is that intermediate endpoints do not tell the whole tale. The observational data showing a CHD risk reduction in HRT users were backed by many “mechanistic” studies showing a favorable effect on markers such as cholesterol levels and atherosclerosis. But these endpoints are not synonymous with CHD, and HRT may act adversely on other endpoints that also contribute to risk. Thus, reliance on an intermediate endpoint may be misleading. For example, fluoride supplements were thought to reduce the risk of osteoporotic fractures because fluoride increases bone mineral density. 22 However, randomized trials have indicated no effect on spinal fractures despite a marked increase in spinal bone mass, and they have found an increase in fractures in areas outside the spine. 23–25 Denser bones are not necessarily stronger ones; the bone formed during the administration of fluoride may be structurally abnormal because of defective mineralization of the newly synthesized bone. 26

The utility of an intermediate endpoint rests on its predictive power for the ultimate outcome of interest. 27 As intermediaries between HRT and CHD, lipid levels score fairly well on this front. However, using them as surrogates for CHD oversimplifies a multifaceted problem. Instead, we need to consider a “systems approach” to the effects of perturbing one or more of the many metabolic pathways leading to a complex disease such as CHD. 28

Thus, using the WHI findings as a gold standard, we give the observational studies a grade of B on their performance in charting the short- to moderate-term effects of combined estrogen and progestin in postmenopausal women. Future challenges include continued rigorous attention to the pitfalls of confounding in observational studies and a healthy skepticism about the predictive power of intermediate endpoints.

About the Authors

ALICE S. WHITTEMORE conducts research on the genetic epidemiology of cancers of the prostate, ovary and breast. She is Professor of Epidemiology and Biostatistics at Stanford University School of Medicine. She is a member of the Institute of Medicine, a Fellow of the American Association for the Advancement of Science, a Fellow of the American Statistical Association and a member of the American Epidemiological Society.

VALERIE MCGUIRE has published on the epidemiology of site-specific cancers, neurologic disorders and cardiovascular disease.



1. The Women's Health Initiative. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results. The Women's Health Initiative Randomized Controlled Trial. JAMA 2002; 288: 321–333.
2. Nelson HD, Humphrey LL, Nygren P, Teutsch SM, Allan JD. Postmenopausal hormone replacement therapy: scientific review. JAMA 2002; 288: 872–881.
3. Sillero-Arenas M, Delgado-Rodriguez M, Rodigues-Canteras R, Bueno-Cavanillas A, Galvez-Vargas R. Menopausal hormone replacement therapy and breast cancer: a meta-analysis. Obstet Gynecol 1992; 79: 286–294.
4. Colditz GA, Egan KM, Stampfer MJ. Hormone replacement therapy and risk of breast cancer: results from epidemiologic studies. Am J Obstet Gynecol 1993; 168: 1473–1480.
5. WHO Collaborative Study of Cardiovascular Disease and Steroid Hormone Contraception. Acute myocardial infarction and combined oral contraceptives: results of an international multicentre case-control study. Lancet 1997; 349: 1202–1209.
6. Grady D, Rubin SM, Petitti DB, et al. Hormone therapy to prevent disease and prolong life in postmenopausal women. Ann Intern Med 1992; 117: 1016–1037.
7. Steinberg KK, Smith SJ, Thacker SB, Stroup DF. Breast cancer risk and duration of estrogen use: the role of study design in meta-analysis. Epidemiology 1994; 5: 415–421.
8. Armstrong BK. Oestrogen therapy after the menopause–boon or bane? Med J Aust 1988; 148: 213–214.
9. Dupont WD, Page DL. Menopausal estrogen replacement therapy and breast cancer. Arch Intern Med 1991; 151: 67–72.
10. Cauley JA, Seeley DG, Ensrud K, Ettinger B, Black D, Cummings SR. Estrogen replacement therapy and fractures in older women. Study of Osteoporotic Fractures Research Group. Ann Intern Med 1995; 122: 9–16.
11. Maxim P, Ettinger B, Spitalny GM. Fracture protection provided by long-term estrogen treatment. Osteoporos Int 1995; 5: 23–29.
12. Kiel DP, Felson DT, Anderson JJ, Wilson PW, Moskowitz MA. Hip fracture and the use of estrogens in postmenopausal women. The Framingham Study. N Engl J Med 1987; 317: 1169–1174.
13. Naessen T, Persson I, Adami HO, Bergstrom R, Bergkvist L. Hormone replacement therapy and the risk for first hip fracture. A prospective, population-based cohort study. Ann Intern Med 1990; 113: 95–103.
14. Grodstein F, Stampfer MJ, Falkeborn M, Naessen T, Persson I. Postmenopausal hormone therapy and risk of cardiovascular disease and hip fracture in a cohort of Swedish women. Epidemiology 1999; 10: 476–480.
15. Hoidrup S, Gronbaek M, Gottschau A, Lauritzen JB, Schroll M. Alcohol intake, beverage preference, and risk of hip fracture in men and women. Copenhagen Centre for Prospective Population Studies. Am J Epidemiol 1999; 149: 993–1001.
16. Grodstein F, Newcomb PA, Stampfer MJ. Postmenopausal hormone therapy and the risk of colorectal cancer: a review and meta-analysis. Am J Med 1999; 106: 574–582.
17. Croft P, Hannaford PC. Risk factors for acute myocardial infarction in women: evidence from the Royal College of General Practitioners’ oral contraception study. BMJ 1989; 298: 165–168.
18. Rosenberg L, Palmer JR, Shapiro S. A case-control study of myocardial infarction in relation to use of estrogen supplements. Am J Epidemiol 1993; 137: 54–63.
19. Sidney S, Petitti DB, Quesenberry CP Jr. Myocardial infarction and the use of estrogen and estrogen-progestogen in postmenopausal women. Ann Intern Med 1997; 127: 501–508.
20. Sourander L, Rajala T, Raiha I, Makinen J, Erkkola R, Helenius H. Cardiovascular and cancer morbidity and mortality and sudden cardiac death in postmenopausal women on oestrogen replacement therapy (ERT) [published correction appears in Lancet 1999;353: 330]. Lancet 1998; 352: 1965–1969.
21. Wilson PW, Garrison RJ, Castelli WP. Postmenopausal estrogen use, cigarette smoking, and cardiovascular morbidity in women over 50. The Framingham Study. N Engl J Med 1985; 313: 1038–1043.
22. Farley JR, Wergedal JE, Baylink DJ. Fluoride directly stimulates proliferation and alkaline phosphatase activity of bone-forming cells. Science 1983; 222: 330–332.
23. Riggs BL, Hodgson SF, O'Fallon WM, et al. Effect of fluoride treatment on the fracture rate in postmenopausal women with osteoporosis. N Engl J Med 1990; 322: 802–809.
24. Kleerekoper M, Peterson EL, Nelson DA. A randomized trial of sodium fluoride as a treatment for postmenopausal osteoporosis. Osteoporos Int 1991; 1: 155–161.
25. Meunier PJ, Sebert JL, Reginster JY, et al. Fluoride salts are no better at preventing new vertebral fractures than calcium-vitamin D in postmenopausal osteoporosis: the FAVO study. Osteoporos Int 1998; 8: 4–12.
26. Kanis JA. Osteoporosis. London: Blackwell Science, 1994; 204.
27. Hulka BS, Wilcosky TC, Griffith JD. Biological Markers in Epidemiology. New York: Oxford University Press, 1990.
28. Strohman R. Maneuvering in the complex path from genotype to phenotype. Science 2002; 296: 701–703.

© 2003 Lippincott Williams & Wilkins, Inc.