
Estimating Absolute Risks in the Presence of Nonadherence: An Application to a Follow-up Study With Baseline Randomization

Toh, Sengwee (a); Hernández-Díaz, Sonia (a); Logan, Roger (a); Robins, James M. (a,b); Hernán, Miguel A. (a,c)

doi: 10.1097/EDE.0b013e3181df1b69
Methods: Original Article

The intention-to-treat (ITT) analysis provides a valid test of the null hypothesis and naturally results in both absolute and relative measures of risk. However, this analytic approach may miss the occurrence of serious adverse effects that would have been detected under full adherence to the assigned treatment. Inverse probability weighting of marginal structural models has been used to adjust for nonadherence, but most studies have provided only relative measures of risk. In this study, we used inverse probability weighting to estimate both absolute and relative measures of risk of invasive breast cancer under full adherence to the assigned treatment in the Women's Health Initiative estrogen-plus-progestin trial. In contrast to an ITT hazard ratio (HR) of 1.25 (95% confidence interval [CI] = 1.01 to 1.54), the HR for 8-year continuous estrogen-plus-progestin use versus no use was 1.68 (1.24 to 2.28). The estimated risk difference (cases/100 women) at year 8 was 0.83 (−0.03 to 1.69) in the ITT analysis, compared with 1.44 (0.52 to 2.37) in the adherence-adjusted analysis. Results were robust across various dose-response models. We also compared the dynamic treatment regimen “take hormone therapy until certain adverse events become apparent, then stop taking hormone therapy” with no use (HR = 1.64; 95% CI = 1.24 to 2.18). The methods described here are also applicable to observational studies with time-varying treatments.


From the Departments of (a) Epidemiology and (b) Biostatistics, Harvard School of Public Health, Boston, MA; and (c) Harvard-MIT Division of Health Sciences and Technology, Boston, MA.

Submitted 19 July 2009; accepted 4 January 2010.

Supported and partially funded by National Institutes of Health (NIH) grant R01 HL080644–01.

Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article.

Correspondence: Sengwee Darren Toh, Department of Population Medicine, Harvard Medical School, Harvard Pilgrim Health Care Institute, 133 Brookline Avenue 6th Floor, Boston, MA 02215. E-mail:

The primary analysis of most randomized trials follows the intention-to-treat (ITT) principle. The ITT analysis is favored because it provides a valid test of the null hypothesis in placebo-controlled trials, bypassing the problems associated with imperfect adherence to the assigned treatment, and because the absence of adjustment for covariates naturally yields both absolute and relative measures of risk.

However, the ITT effect is a biased estimate of any true non-null effect that would have been observed under full adherence to the assigned treatment.1 The greater the nonadherence to the assigned treatment, the closer to the null the ITT effect is expected to be in placebo-controlled studies. Thus, in studies whose goal is evaluating a treatment's safety, one could naïvely conclude that a treatment is safe because the ITT effect is close to null, even if the treatment causes serious adverse effects that would have been detected in the absence of nonadherence.

To deal with nonadherence, one can attempt to estimate the effect that would have been observed had all study participants adhered to their assigned treatment throughout the follow-up, sometimes referred to as the effect of continuous treatment. Inverse probability weighting can be used to consistently estimate the effect of continuous treatment,2–5 but only under exchangeability and modeling assumptions that are not required to estimate the ITT effect. G-estimation of structural nested models that uses assigned treatment as an instrumental variable can also be used under a different set of assumptions.4,6–8 The wish to conduct an analysis whose validity does not rely on those assumptions might explain the widespread use of the ITT analysis, despite its shortcomings.

Here, we describe the application of inverse probability weighting of marginal structural models to estimate both absolute and relative measures of risk under full adherence. To illustrate the use of inverse probability weighting, we estimated the effect of continuous postmenopausal hormone therapy on the risk of invasive breast cancer (henceforth, breast cancer) in the Women's Health Initiative estrogen-plus-progestin trial.



Study Population

The Women's Health Initiative estrogen-plus-progestin trial was a double-blinded, placebo-controlled, and multicentered primary-prevention trial in which 16,608 postmenopausal women aged 50–79 years with an intact uterus at baseline were randomized to either a daily hormone regimen of 0.625 mg conjugated equine estrogens plus 2.5 mg medroxyprogesterone acetate (n = 8506) or matching placebo (n = 8102) between 1993 and 1998.9 A detailed description of the trial has been published elsewhere.9,10 The limited access dataset we used (obtained from the National Heart, Lung, and Blood Institute) includes follow-up information through 7 July 2002, for an average follow-up of 5.6 years.

Data were collected at baseline and during the follow-up period on demographic characteristics; medical, reproductive, and family history; hormone use; dietary intake; and physical examinations. Safety and adherence data were recorded first at 6 weeks after randomization, followed by scheduled semi-annual interviews and annual clinical visits during which health-related information was also updated. For each follow-up year, the dataset contains indicators for discontinuation of assigned study pills and initiation of non-study hormone therapy, as well as the proportion of study pills taken (estimated by weighing of returned bottles) and self-reported frequency of use. (We recoded doses less than 1% in a given year as zero.) Physician adjudicators at local clinics first confirmed self-reported breast cancer cases by reviewing medical records and pathology reports, and all cases were subsequently centrally adjudicated using the Surveillance, Epidemiology, and End Results (SEER) coding system.11,12


ITT Analysis

For quality assurance purposes, we first replicated the published estimates of the average ITT hazard ratio of hormone therapy versus placebo12,13 through a pooled logistic regression model14 that included months since randomization (modeled by cubic splines), age at baseline, and randomization status in a parallel diet-modification trial. We estimated the average ITT hazard ratio over the first 2-, 6-, and 8-year periods. We repeated the analysis stratified by prior hormone use (yes, no), age (<60 and ≥60 years old), and years since menopause (<10 and ≥10 years).

To estimate breast cancer-free survival curves for initiators and noninitiators, we followed 2 separate approaches. First, we constructed unadjusted Kaplan-Meier survival curves, as previously reported by the Women's Health Initiative investigators.12,13 Second, we estimated standardized survival curves from a pooled logistic regression model that allowed for time-varying hazard ratios by including product (“interaction”) terms between treatment arm and months since randomization, in addition to the covariates listed above. To estimate the standardized (by the baseline covariates) survival at month m under therapy initiation, we multiplied the predicted conditional probabilities of surviving through month t ≤ m given survival through month t − 1 for each study participant, including those in the placebo arm (by setting the treatment arm indicator equal to 1 in the fitted pooled logistic regression model), and then averaged the estimate across all participants. To estimate the standardized survival under no initiation, we repeated the process with the treatment arm indicator set equal to 0. In the absence of model misspecification, the unadjusted and the standardized curves should be similar.
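The standardization step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' SAS implementation: the coefficient values, the single baseline covariate (a centered age), and the linear month term are all hypothetical stand-ins for the fitted model, which used cubic splines for time and additional covariates.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def standardized_survival(baseline_ages, coef, arm, months):
    """Average, over all participants, of the product of monthly
    conditional survival probabilities with the arm indicator fixed."""
    total = 0.0
    for age in baseline_ages:
        survival = 1.0
        for t in range(1, months + 1):
            logit_hazard = (coef["intercept"] + coef["month"] * t
                            + coef["arm"] * arm
                            + coef["arm_x_month"] * arm * t
                            + coef["age"] * age)
            # Probability of surviving month t given survival through t - 1.
            survival *= 1.0 - expit(logit_hazard)
        total += survival
    return total / len(baseline_ages)

# Hypothetical coefficients and centered ages, for illustration only.
coef = {"intercept": -7.0, "month": 0.0, "arm": 0.2,
        "arm_x_month": 0.0, "age": 0.02}
ages = [-5.0, 0.0, 5.0]
surv_initiation = standardized_survival(ages, coef, arm=1, months=96)
surv_no_initiation = standardized_survival(ages, coef, arm=0, months=96)
```

Setting the arm indicator to the same value for every participant, including those randomized to the other arm, is what makes the resulting curve standardized to the baseline covariate distribution of the whole trial population.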

We then computed, from each standardized curve, the estimated absolute risk of breast cancer (cumulative incidence) as 1 minus the disease-free survival at 2, 6, and 8 years, and the corresponding risk differences for initiators versus noninitiators. The 95% confidence interval (CI) for the risk difference was estimated by 200 bootstrap samples (with replacement).
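A nonparametric bootstrap for a risk difference might look like the sketch below. Several pieces are assumptions made for illustration: the toy data, the crude (unstandardized) risk difference used as the estimator, and the percentile method for the interval, since the text does not say how the interval was derived from the 200 resamples. In the analysis described above, each resample would instead involve refitting the model and recomputing the standardized curves.

```python
import random

def risk_difference(rows):
    """Crude risk difference from (arm, event) pairs coded 0/1."""
    n1 = sum(1 for arm, _ in rows if arm == 1)
    n0 = len(rows) - n1
    e1 = sum(event for arm, event in rows if arm == 1)
    e0 = sum(event for arm, event in rows if arm == 0)
    return e1 / n1 - e0 / n0

def bootstrap_ci(rows, estimator, n_boot=200, seed=0):
    """Percentile 95% interval from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        estimator([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(0.975 * (n_boot - 1))]
    return lower, upper

# Toy data: 200 women per arm, 20 events under treatment, 10 under placebo.
rows = [(1, 1)] * 20 + [(1, 0)] * 180 + [(0, 1)] * 10 + [(0, 0)] * 190
point = risk_difference(rows)
lower, upper = bootstrap_ci(rows, risk_difference)
```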


Adherence-adjusted Analysis

We estimated the average hazard ratio for continuous hormone use versus no use (an adherence-adjusted effect) using inverse probability weighting.2–5 Informally, the method weights each woman at each time period by the inverse of the density of having received her actual treatment history through that time. The density was computed as the product of the probability of receiving any hormone therapy during each follow-up year and, for those with non-zero use during that year, the density of receiving the proportion of pills she actually took. We estimated these quantities by fitting, separately for each arm, (1) a logistic regression model to estimate the probability of receiving any hormone therapy, and (2) a linear regression model that assumed independent normal errors with constant variance to estimate the density of receiving the log-transformed proportion of pills taken among those with nonzero use during that year.15,16 A participant contributed as many observations to the models as years she was in the study, ie, from baseline to the occurrence of breast cancer, death, loss to follow-up, or end of study, whichever occurred first.
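As a rough sketch of the two-part density described above, the contribution of a single person-year could be computed as follows. The function and argument names are hypothetical, and the fitted values passed in (the probability of any use and the mean and standard deviation of the log proportion) would in practice come from the logistic and linear regression models.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def treatment_density(proportion, p_any_use, mean_log_prop, sd_log_prop):
    """Density of the observed dose for one person-year.

    proportion    : observed proportion of pills taken that year
    p_any_use     : fitted probability of any hormone use (logistic model)
    mean_log_prop : fitted mean of log(proportion) among users (linear model)
    sd_log_prop   : residual SD of the linear model (constant variance)
    """
    if proportion == 0:
        return 1.0 - p_any_use
    return p_any_use * normal_pdf(math.log(proportion),
                                  mean_log_prop, sd_log_prop)

no_use = treatment_density(0.0, 0.8, 0.0, 1.0)
full_use = treatment_density(1.0, 0.8, 0.0, 1.0)
```

The density of a woman's treatment history through a given year is then the product of these yearly terms, and its inverse is her unstabilized weight for that year.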

Both models included years since randomization (linear and quadratic) and covariates listed in Table 1 measured at baseline and, for time-varying covariates, at the most recent visit. Fitting a richer model with body pain (none, very mild, mild, moderate/severe) or use of statins, aspirin, selective estrogen receptor modulators, bisphosphonates, or multivitamin (yes, no) did not materially change the results (not shown). As noted above, we assumed a normal density for the error term in the linear regression model, with a constant variance across all combinations of the covariates. The results (not shown) were similar when we used alternative distributions (gamma, log-normal), when we used an arcsin-root transformation instead of log transformation, or when we allowed the error variance to depend on a subset of the covariates.



To improve statistical efficiency, the weights were stabilized2–5,17 by setting the numerator weight equal to the estimated probability (density) of received treatment history conditional on the baseline covariates included in the ITT model plus the following subset of the baseline covariates used in the model for the denominator of the weights: race/ethnicity, marital status, body mass index, physical activity, cigarette smoking, alcohol intake, parity, age at menarche, family history of breast cancer, mammography use, presence of vasomotor symptoms, use of oral contraceptive, prior hormone use, and years since menopause. Adding to the numerator weight model the covariates (measured at baseline) family history of fracture, region, education level, number of children breast-fed, age at first birth, personal history of benign breast diseases, bilateral oophorectomy, general health, fruit and vegetable intake, and number of first-degree relatives with breast cancer yielded similar results (not shown). The mean of the estimated stabilized inverse probability weights for adherence adjustment was 1.00 (standard deviation = 0.29). We did not attempt to adjust for selection bias arising from differential drop-out2–4 because of the low proportion of participants lost to follow-up (3.3%).
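Stabilization amounts to taking, at each time, the running product of the ratio between the numerator density (conditional on baseline covariates only) and the denominator density (conditional on baseline and time-varying covariates). A minimal sketch, with a hypothetical function name and made-up density values:

```python
def stabilized_weights(numerator_densities, denominator_densities):
    """Time-varying stabilized weights for one participant.

    Each list holds one density per follow-up year; the weight at year k
    is the product of the yearly ratios up to and including year k.
    """
    weights = []
    running = 1.0
    for num, den in zip(numerator_densities, denominator_densities):
        running *= num / den
        weights.append(running)
    return weights

# Made-up densities for a participant followed for 2 years.
example = stabilized_weights([0.5, 0.5], [0.5, 0.25])
```

A mean stabilized weight close to 1, as reported above, is commonly used as a check on the weight models.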

We then fitted a weighted model identical to the one used to estimate average ITT hazard ratios, except that (1) the treatment arm indicator was replaced by a time-varying covariate for cumulative use of hormone therapy, calculated as the sum of the annual proportion of pills taken since baseline, (2) the model included the additional baseline covariates used to estimate the numerator of the weights, and (3) the model included product terms between cumulative use and months since randomization. We then computed the subject-specific predicted survival probabilities under this structural model, first with the dose (proportion) set to 1 at all times and again with the dose set to 0 at all times, and then averaged these survival probabilities over all participants. We then estimated risks and risk differences as described above for the unweighted model used in the ITT analysis.
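The cumulative-use covariate and the two counterfactual dose histories can be illustrated as follows. This sketch is deliberately simplified and its details are assumptions: it uses yearly rather than monthly intervals, omits baseline covariates and product terms, and the coefficients `b0` and `b1` are hypothetical rather than estimates from the trial.

```python
import math
from itertools import accumulate

def cumulative_use(annual_proportions):
    """Running sum of the annual proportion of pills taken."""
    return list(accumulate(annual_proportions))

def survival_under_regimen(annual_proportions, b0, b1):
    """Survival when the logit of the discrete hazard is linear in cumulative use."""
    survival = 1.0
    for cum in cumulative_use(annual_proportions):
        hazard = 1.0 / (1.0 + math.exp(-(b0 + b1 * cum)))
        survival *= 1.0 - hazard
    return survival

observed = cumulative_use([1.0, 0.5, 0.0, 0.25])

years = 8
b0, b1 = -5.5, 0.07  # hypothetical coefficients
risk_always = 1.0 - survival_under_regimen([1.0] * years, b0, b1)
risk_never = 1.0 - survival_under_regimen([0.0] * years, b0, b1)
risk_diff = risk_always - risk_never
```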

To estimate the average hazard ratios commonly reported in longitudinal studies, we used the parameter estimates from the model to simulate a Monte Carlo sample of 100,000 women with no censoring. We used a nonparametric bootstrap estimator18 to calculate conservative 95% CIs for the average hazard ratio. We added product terms between cumulative use and indicators for prior hormone use (yes, no), age (<60, ≥60 years), and time since menopause (<10, ≥10 years) to obtain the corresponding stratum-specific estimates of the average hazard ratio. Results were similar under a number of sensitivity analyses described in Appendix 1.

Note that we specified a structural model with a dose-response function (cumulative use of hormone therapy), rather than using a “model-free” approach in which participants are censored when they become nonadherent, as in previous studies.19–21 The model-free approach could not be used because the Women's Health Initiative data did not include sufficient information to establish the temporal sequence between nonadherence to assigned treatment and breast cancer diagnosis within each 1-year observation period. On the other hand, the uncertainty surrounding the time sequence has a small effect on our estimates because (1) the uncertainty is restricted to a short interval during which only a relatively small amount of treatment can be taken, and (2) our dose-response function (cumulative treatment use) assumes that a small amount of treatment has little effect compared with a large amount of treatment. “Average treatment use” is an example of an alternative dose-response function that meets condition 2. “Current treatment use” is an example that does not.

This application extends previous implementations of inverse probability weighting for survival analysis5,22 to the case of time-varying treatments and dose-response functions for the effect of treatment on disease-free survival. Annotated SAS programs used for this analysis can be found in the eAppendix.


Secondary Adherence-adjusted Analysis

In the Women's Health Initiative study protocol, participants were required to permanently stop their assigned hormone therapy if they developed the following events: deep vein thrombosis, pulmonary embolus, endometrial hyperplasia with atypia, malignant melanoma, endometrial cancer, breast cancer, triglycerides above 1000 mg/dL, or starting anticoagulant medications, estrogen, progesterone, testosterone, tamoxifen, or other selective estrogen-receptor modulators. We performed a secondary analysis to estimate the survival curve that would have been observed if all women in the hormone therapy arm had fully complied with this protocol. More specifically, we estimated survival under the dynamic regimen “take hormone therapy until one of the above events occurs, then stop taking hormone therapy.” To do so, we artificially censored participants in the hormone arm at the time they deviated from the protocol (ie, did not stop taking their assigned study hormone after they had one of these events). This artificial censoring may result in selection bias because the distribution of risk factors of breast cancer may differ between the censored and the uncensored.
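The artificial censoring rule can be sketched as below. The yearly data layout and the function name are hypothetical. The sketch also assumes that an event recorded in one year requires stopping from the following year onward, because the within-year ordering of events and pill use is not known.

```python
def censoring_year(adverse_event_by_year, dose_by_year):
    """Index of the first year in which a woman deviates from the dynamic
    regimen (takes study pills after a stopping event), or None if she
    never deviates.

    adverse_event_by_year : booleans, True if a stopping event occurred that year
    dose_by_year          : proportion of study pills taken that year
    """
    event_seen = False
    for year, (event, dose) in enumerate(zip(adverse_event_by_year,
                                             dose_by_year)):
        if event_seen and dose > 0:
            return year  # still taking pills after a stopping event
        event_seen = event_seen or event
    return None

deviates = censoring_year([False, True, False, False], [1.0, 1.0, 1.0, 0.0])
complies = censoring_year([False, True, False], [1.0, 1.0, 0.0])
no_event = censoring_year([False, False], [1.0, 1.0])
```

Person-time from the returned year onward would be dropped for women in the hormone arm, with the inverse probability weights accounting for the selection this censoring may induce.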

To adjust for such potential selection bias, one would estimate time-varying, subject-specific inverse probability weights whose denominator is the subject's estimated probability of remaining uncensored at each time, conditional on past joint predictors of censoring and the outcome. Note, however, that the predictors of censoring at time t are in fact the predictors of hormone therapy continuation at t because those who continue to take their study pills are precisely those who are censored. Therefore, there is no need to estimate separate inverse probability weights to adjust for selection bias due to artificial censoring because the treatment weights estimated in the primary analysis already adjust for the potential time-varying selection bias due to artificial censoring. Also note that most protocol-mandated reasons for stopping treatment were not risk factors for breast cancer and thus need not be used to estimate the weights.

The specification of a marginal structural model for dynamic regimens is straightforward when only 2 regimens are compared23 or when the goal of the structural model is to smooth over a set of regimens that can be placed in 1-to-1 correspondence with an indexing continuous variable.8,24 The situation is not so straightforward when, as is required in this example, the dose-response function depends on a summary dose measure (such as cumulative use) that can take the same value for many different regimens. Further discussion and caveats are provided in Appendix 2. For simplicity, we used the same specification as in the primary analysis; specifically, we assumed the log discrete hazard is a linear function of cumulative dose.



ITT Estimates

We reproduced the ITT hazard ratio estimates published previously by the Women's Health Initiative investigators (Table 2). Compared with women assigned to placebo, women assigned to estrogen-plus-progestin were 25% more likely to be diagnosed with breast cancer during the first 8 years of follow-up, but 29% less likely to have the diagnosis during the first 2 years. Only 2 cases occurred after 8 years, and thus the hazard ratios over the entire study period (not shown) were virtually identical to those in Table 2.



Figure 1 shows breast cancer-free survival curves for initiators versus noninitiators among all women, as well as women without and with prior hormone use. Kaplan-Meier (unadjusted) and standardized (adjusted) curves were similar, which suggests that our model for the standardized curves was adequately specified. The P value from a log-rank test for the equality of the survival curves between initiators and noninitiators was 0.04 for all women, 0.45 for women without prior hormone use, and 0.01 for women with prior hormone use. The ITT risk differences at 2, 6, and 8 years are shown in Table 2. By the end of 8 years of follow-up, the estimated risk difference (cases per 100 women) of breast cancer was 0.83 (95% confidence interval [CI] = −0.03 to 1.69).




Adherence-adjusted Estimates

A selected list of potential predictors of hormone use is shown in Table 3 (with all covariates coded as categorical for simplicity). Women aged 70 years or more and nonwhites were less likely to use hormone therapy in both treatment arms. Examples of time-varying predictors of hormone use in the treatment arm include normal mammogram/breast examination (odds ratio of receiving hormone was 0.3 for participants with vs. without abnormal results); no new lumps, nipple discharge, or skin changes (odds ratio = 0.6); oophorectomy (odds ratio = 1.6); weight loss (odds ratio = 0.7 for women whose body mass index decreased by more than 0.5 kg/m2 from baseline); and changes in physical activity (odds ratio = 0.7 for women who reduced their physical activity level by more than 0.5 metabolic equivalent units/week).



In general, the adherence-adjusted hazard ratios (Table 4) were further away from the null than the ITT estimates, but the corresponding 95% CIs were also wider. The adherence-adjusted survival curves (Fig. 2) crossed at about 4 years in women without prior hormone use before randomization, and diverged at approximately 1 year after initiation in women with prior hormone use. The P value from a log-rank test for the equality of the survival curves was <0.0001 for all women, <0.01 for women without prior hormone use, and <0.001 for women with prior hormone use. By the end of 8 years of follow-up, the estimated risk difference (cases per 100 women) of breast cancer was 1.44 (95% CI = 0.52 to 2.37) (Table 4). Risk differences obtained from different dose-response models (Table 5) were qualitatively similar.







In our secondary adherence-adjusted analysis, which considered a dynamic treatment regimen, the hazard ratio for 8-year continuous hormone use was 1.64 (95% CI = 1.24 to 2.18) for all women, 1.59 (1.13 to 2.24) for women without prior hormone use, and 2.04 (1.34 to 3.10) for those with prior hormone use. The corresponding 8-year risk differences (cases per 100 women) were 1.56 (95% CI = 0.68 to 2.44), 1.45 (0.30 to 2.60), and 1.61 (0.60 to 2.62).



We have presented an application of inverse probability weighting to adjust for incomplete adherence to the assigned treatment in randomized trials. Our analysis estimated that, relative to no hormone therapy, the incidence of breast cancer under 8-year continuous estrogen-plus-progestin therapy was 1.7 times greater (the ITT estimate was 1.3). In absolute terms, we estimated that continuous use of estrogen-plus-progestin therapy for 8 years caused an excess of 1.4 breast cancer cases per 100 women (the ITT estimate was 0.8). Interestingly, women on estrogen-plus-progestin therapy had a lower incidence of breast cancer during the first 2 years of use, which is consistent with the hypothesis that estrogen-plus-progestin use delays breast cancer diagnosis, possibly by compromising the diagnostic performance of mammogram and breast biopsies.12,25 Thus, as in previous Women's Health Initiative analyses,12,13 it may be more appropriate to say that our analysis estimated the effect of continuous therapy on the diagnosis, rather than the true incidence, of breast cancer.

Inverse probability weighting adjusts for the joint time-varying predictors of adherence and breast cancer that were measured in the study, but cannot adjust for unmeasured predictors of adherence.8 G-estimation (a general form of instrumental variable analysis) is another approach that can adjust for measured and unmeasured predictors6–8 and that has been applied in several randomized trials.4,26–31 We have previously compared the assumptions for the validity of adherence-adjusted estimates when using adjusted inverse probability weighting and g-estimation in randomized trials in which the outcome of interest is a continuous variable.4

In contrast with inverse probability weighting and g-estimation, other adjustment methods may not appropriately adjust for measured predictors of adherence and survival.32 For example, a previously used approach applied to the Women's Health Initiative data censored participants 6 months after they took <80% of their study pills or started receiving non-study hormone. The hazard ratio of breast cancer was 1.49 (95% CI = 1.13 to 1.96).12 Such an approach requires the choice of an arbitrary censoring period, as well as the assumption that reasons for stopping are not related to time-varying risk factors for breast cancer and that the measured time-varying risk factors that also predict future adherence and survival are not themselves affected by prior adherence. This assumption may be violated if, for example, women in the hormone-treatment arm were less likely to adhere to their assigned treatment. In this study, 42% of the women in the hormone arm stopped taking their study pills some time during the follow-up, compared with 38% in the placebo arm; 6% and 11%, respectively, initiated non-study hormone therapy.9 Further, women with joint determinants of hormone use and breast cancer that are also affected by previous hormone therapy, such as abnormalities in mammogram and breast examination, were less likely to adhere to their assigned treatment.

Besides appropriately adjusting for the measured time-varying determinants of adherence, our analytic approach allows estimation of the absolute risk under continuous therapy and the exploration of the sensitivity of the estimates to the functional form of the dose-response function. Had the available Women's Health Initiative data been sufficient to establish the temporal sequence between nonadherence and breast cancer diagnosis, we could have done away with the requirement to specify a dose-response function, at the expense of wider confidence intervals for our estimates.4 In previous studies19,20 we did not have to specify a dose-response function because the temporal sequence could be inferred from the data.

Both inverse probability weighting and g-estimation can be extended to the estimation of the effect of dynamic treatment regimens.8,23 This extension is crucial because in some cases estimating the effect of continuous treatment (a nondynamic or static regimen) may be of little interest. For example, if many participants stop taking the treatment because it causes serious adverse effects, one would not want to estimate the effect under the static regimen “always adhere to the baseline treatment” (the effect of continuous treatment) but rather under the dynamic regimen “adhere to the baseline treatment unless adverse effects become apparent.” Further, when certain types of participants will always discontinue treatment given certain adverse events, then estimating the effect under static regimens such as “always adhere to the baseline treatment” is problematic because the positivity assumption is violated.17,33

The Women's Health Initiative protocol required participants who experienced certain adverse events to permanently stop their assigned hormone treatment. A treatment regimen that changes with patients' prognosis and response to previous therapy may increase the effectiveness of the treatment or may reduce the adverse effects associated with the treatment. In our breast cancer analysis, the effect on breast cancer risk of the dynamic regimen “adhere to the baseline hormone treatment unless adverse effects occurred” was similar to the effect of continuous hormone use.

Loss to follow-up is another common problem in randomized trials. In the presence of loss to follow-up, an analysis under the ITT principle is not feasible because some participants' outcomes are unknown. As a result, some studies use a “pseudo-ITT” analysis4 that does not preserve the desirable properties of the ITT analysis. Inverse probability weighting can also be used to adjust for selection bias due to differential loss to follow-up,2–4 but that was not necessary in this study because few participants dropped out.

In conclusion, we described the application of inverse probability weighting of a dose-response marginal structural model to estimate both absolute and relative measures of effect on a failure-time outcome. Although we focused on randomized trials for simplicity, the methods described here are also applicable to observational studies with time-varying treatments.



Appendix 1

Primary Analysis: Sensitivity Analyses

Missing Data on Adherence

For women with missing proportion of study pills taken as estimated by weighing of returned bottles (28% of the total person-time), we estimated this proportion based on their self-reported frequency of use (none, <1, 1–2, 3–4, 5–6, 7 days/week). When the self-reported use was also missing (71% of the person-time in women with missing estimated proportion of study pills), we randomly assigned a dose for that year using a uniform distribution. In addition, because the proportion of pills taken is unknown for women who initiated non-study hormone therapy, we randomly assigned a dose for such use. We repeated our analysis under various assumptions for the imputation of missing study and non-study hormone use (using all, half, or none of the usual proportion of pills, or using the same proportion as the previous year). Results (not shown) were similar under all these assumptions. We were also not able to accurately identify the dose a participant received during the year she had the outcome. A number of sensitivity analyses were performed to test the robustness of our results. These included assuming (1) all participants stopped taking the study pills at the time of outcome; (2) all participants kept taking the study pills until the end of the year; and (3) participants stopped taking the study pills at a random month. Our results were similar under these different scenarios. The third approach was used as the main analysis.


Density Estimation

We evaluated the sensitivity of our estimates to the method of density estimation by varying several components of the analysis separately. First, we used a gamma, rather than a normal, distribution of the log-transformed proportion of dose received. Second, we considered a log-normal distribution for the original, nontransformed, proportion of dose received. Third, we used an arcsin-root, rather than a log, transformation. For women with both study and non-study hormone use whose combined proportion of hormone use was greater than 1, we recoded their proportion as 1. Fourth, we relaxed the assumption of constant variance across treatment and covariate histories. We estimated the conditional variances by regressing the squared residuals from the linear model for the (log-transformed) dose on the covariates. Results were similar under all these models although, when estimating conditional variances, we had to restrict the analysis to a subset of the covariates to avoid extreme stabilized weights.
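The fourth variation, in which the error variance depends on covariates, can be sketched with a single covariate and closed-form least squares. The data and function names here are made up for illustration; the analysis above used several covariates.

```python
def ols_line(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def conditional_variance_model(covariate, log_dose):
    """Fit the mean model for log dose, then regress the squared
    residuals on the covariate to model the conditional variance."""
    intercept, slope = ols_line(covariate, log_dose)
    squared_residuals = [(y - (intercept + slope * x)) ** 2
                         for x, y in zip(covariate, log_dose)]
    return ols_line(covariate, squared_residuals)

# Toy data in which the spread of log dose is larger when the covariate is 1.
covariate = [0.0, 0.0, 1.0, 1.0]
log_dose = [-1.0, 1.0, -2.0, 2.0]
var_intercept, var_slope = conditional_variance_model(covariate, log_dose)
```

In this toy example the fitted variance is 1 when the covariate is 0 and 4 when it is 1. Such fitted variances would replace the constant variance in the normal density used for the weights.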


Dose-Response Function

To assess the robustness of our estimates to the dose-response function, we also considered models that included a quadratic term of cumulative use, average linear cumulative use, or average linear and quadratic cumulative use. Results were qualitatively similar under all these models (Table 5).



Appendix 2

Secondary Analysis: Model Specification and Dynamic Treatment Regimens

Given a static treatment regimen $\bar{a} = \{a_t;\ t \ge 0\}$, let the dynamic treatment regimen $d(\bar{a})$ be the regimen “follow the static regimen $\bar{a}$ until the occurrence of an adverse event; then take no more hormone therapy.” The marginal structural model used in our secondary adherence-adjusted analysis is formally a model for the logit of the discrete hazard of the counterfactual time to breast cancer diagnosis $T_{d(\bar{a})}$ under regimen $d(\bar{a})$ given the baseline covariates $V$. Specifically, the simplest version of the model assumes that $\mathrm{logit}\,\Pr(T_{d(\bar{a})} \le m+1 \mid T_{d(\bar{a})} > m, V) = \beta_0(m) + \beta_1\,\mathrm{cum}(\bar{a}_m) + \beta_2 V$, where $\mathrm{cum}(\bar{a}_m) = \sum_{j=0}^{m} a_j$ is the cumulative dose under regimen $d(\bar{a})$ up to month $m$ if no adverse event occurs. The corresponding marginal structural model used in our primary analysis was the static regimen model $\mathrm{logit}\,\Pr(T_{\bar{a}} \le m+1 \mid T_{\bar{a}} > m, V) = \alpha_0(m) + \alpha_1\,\mathrm{cum}(\bar{a}_m) + \alpha_2 V$.

We now argue by example that, even if (1) the primary analysis marginal structural model was correct with $\alpha_1 > 0$ (so increasing cumulative dose increases the risk of breast cancer for static regimens), and (2) the nondynamic counterfactual times to an adverse event and to breast cancer are independently distributed given $V$, we have no guarantee that our secondary analysis model is correctly specified (even qualitatively). To show this, let $\bar{a} = c$ denote the treatment history equal to the constant $c$ at all times, so $\mathrm{cum}(\bar{a}_m) = cm$ when $\bar{a} = c$. Also note that, by the definition of the regimen $d(\bar{a})$, $T_{\bar{a}=0} = T_{d(\bar{a}=0)}$. Suppose that, for all subjects, an adverse event will never occur if the daily dose rate is maintained below 1/2, but will occur immediately if a dose of 1/2 or greater is ever taken. Then one can see that, to a close approximation, $T_{\bar{a}=0} = T_{d(\bar{a}=c)}$ for $c \ge 1/2$ (since the cumulative dose actually taken will be essentially zero as the adverse events will dictate no further treatment). In contrast, $T_{d(\bar{a}=c)} = T_{\bar{a}=c}$ for $c < 1/2$ because no adverse events will occur. Thus, we conclude that regimen $d(\bar{a}=c)$ increases the risk of breast cancer as $c$ increases to 1/2, and then abruptly, beginning at $c = 1/2$, has no adverse effect on breast cancer. This dose response is a highly nonmonotone function of $c$ and thus of the cumulative dose $cm$ up to any time $m$, showing that the model used in the secondary analysis is badly misspecified in this example.

Furthermore, 2 different dynamic regimens $d(\bar{a}^{(1)})$ and $d(\bar{a}^{(2)})$ that have essentially the same cumulative exposure in the absence of an adverse event can have very different effects on breast cancer. Specifically, in the context of the previous example, consider the regimens $\bar{a}^{(1)}$ and $\bar{a}^{(2)}$ that specify a dose of 3/4 every other day and a dose of 3/8 every day, respectively. Then the regimens $d(\bar{a}^{(1)})$ and $d(\bar{a}^{(2)})$ will have essentially identical cumulative exposures in the absence of an adverse event. However, $T^{d(\bar{a}^{(1)})}$ is essentially equal to $T^{\bar{a}=0}$, whereas $T^{d(\bar{a}^{(2)})}$ is equal to $T^{\bar{a}=c}$ for $c = 3/8$, so the second regimen causes much more breast cancer than the first. We conclude that there is no function of cumulative exposure, even a nonmonotonic one, that can be used to correctly model the discrete hazard of $T^{d(\bar{a})}$. Although, for pedagogic purposes, this example was purposely chosen to be extreme, it shows that we must be very concerned about the specification of the secondary analysis model for the Women's Health Initiative.
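The comparison of the two regimens can be sketched in a few lines. This is an illustrative sketch under the assumptions of the example only: the function names are hypothetical, time is counted in days, and the adverse event is taken to occur on the first day a dose of 1/2 or more is taken, after which no further treatment is taken.

```python
THRESHOLD = 0.5  # dose at or above which the adverse event occurs immediately


def planned_doses_1(days):
    """Static regimen a(1): a dose of 3/4 every other day, starting on day 0."""
    return [0.75 if day % 2 == 0 else 0.0 for day in range(days)]


def planned_doses_2(days):
    """Static regimen a(2): a dose of 3/8 every day."""
    return [0.375] * days


def doses_taken(planned):
    """Doses actually taken under d(a): follow the plan until an adverse event.

    In the example, the adverse event occurs as soon as a dose at or above
    THRESHOLD is taken; every later dose is then zero.
    """
    taken = []
    stopped = False
    for dose in planned:
        if stopped:
            taken.append(0.0)
            continue
        taken.append(dose)
        if dose >= THRESHOLD:
            stopped = True
    return taken


days = 60
for name, plan in (("a(1)", planned_doses_1(days)), ("a(2)", planned_doses_2(days))):
    print(name, "planned cumulative dose:", sum(plan),
          "cumulative dose taken under d:", sum(doses_taken(plan)))
```

The two plans have the same planned cumulative dose over any even number of days, which is the quantity the secondary analysis model conditions on, yet the doses actually taken under the dynamic regimens differ greatly.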

The Women's Health Initiative: Estrogen plus Progestin Trial is conducted and supported by the NHLBI in collaboration with the WHI Study Investigators. This manuscript was prepared using data obtained from the NHLBI and does not necessarily reflect the opinions or views of the WHI or the NHLBI.

© 2010 Lippincott Williams & Wilkins, Inc.