Commentary: Structural Nested Models, G-Estimation, and the Healthy Worker Effect: The Promise (Mostly Unrealized) and the Pitfalls

Joffe, Marshall M.

doi: 10.1097/EDE.0b013e318245f798

From the Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA.

Supported by National Institutes of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (5R01DK90385). The authors reported no financial interests related to this research.

Correspondence: Marshall M. Joffe, Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, 602 Blockley Hall, 423 Guardian Dr, Philadelphia, PA 19104. E-mail:

In this issue of EPIDEMIOLOGY, Chevrier et al1 apply G-estimation to estimate the effect of exposure to metalworking fluids when a healthy-worker survivor effect is present. This is the first application of G-estimation to this problem in the more than 20 years since G-estimation was first proposed.2 Like G-computation (or the G-formula), G-estimation has been much less popular in practice than the more recently proposed marginal structural models. In this commentary, I discuss possible reasons for the relative unpopularity of G-estimation in practice, as well as some little recognized advantages of the approach compared with competing methods.

Pitfalls of G-estimation: Unfamiliarity and Artificial Censoring

G-estimation is a semiparametric way to estimate parameters in structural nested models. These models include structural nested mean models,3,4 structural nested distribution models,4 structural nested failure time models,2,5,6 and the newly developed structural nested cumulative failure time models.7 These models differ from the more familiar standard regression models and marginal structural models in important ways. In the more familiar models, covariates may be used in the structural model as main (statistical) effects themselves and also as modifiers of the causal effects of the treatments of interest. Only baseline covariates may be included in the models if one wishes to retain causal interpretation for the joint effects of treatments provided at different times. In contrast, structural nested models directly parameterize only the effect of a possibly sequential series of interventions on the outcome of interest. Thus, nontreatment covariates enter the structural model only as modifiers of the effect of the treatment, not as main effects. Additionally, time-varying covariates that modify the effect of subsequent treatments may enter the model as effect modifiers.
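
As a rough illustration of the contrast drawn above, one common way to write a structural nested mean model is shown below. The notation is an assumption introduced here, not taken from the commentary: $A_k$ and $L_k$ are treatment and covariates at time $k$, overbars denote history, $Y^{\bar a_k, 0}$ is the outcome under treatment history $\bar a_k$ followed by no further treatment, and $\psi_0, \psi_1$ are the causal parameters. The linear form of the effect is only one possible choice.

```latex
% Structural nested mean model: only the effect of a final "blip" of
% treatment a_k is parameterized, here modified by the time-varying l_k.
\gamma_k(\bar a_k, \bar l_k; \psi)
  = E\!\left[ Y^{\bar a_k, 0} - Y^{\bar a_{k-1}, 0}
      \,\middle|\, \bar L_k = \bar l_k,\ \bar A_k = \bar a_k \right]
  = a_k \left( \psi_0 + \psi_1 l_k \right)

% A standard regression model, by contrast, also carries a main effect
% of the covariate (the \beta_1 l term):
E\!\left[ Y \mid A = a,\ L = l \right]
  = \beta_0 + \beta_1 l + a \left( \beta_2 + \beta_3 l \right)
```

In the first display the covariate $l_k$ appears only inside the treatment effect, and the distribution of the treatment-free outcome given covariates is left unspecified.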

For outcomes measured at the end of a fixed period or for repeated measures outcomes, the form of these models can resemble that of standard regression models. However, structural nested failure time models do not directly estimate hazard ratios or other relative risk parameters that are ubiquitous in epidemiology and applied survival analysis. The accelerated failure time model with time-varying covariates is essentially the only structural nested failure time model that has been used in practice. Conversion between the parameters in the accelerated failure time model used most often in conjunction with G-estimation and parameters in the proportional hazards model may be done in 1 of 2 ways. If the treatment-free survival times follow a Weibull distribution, there is a direct correspondence between the parameters in the accelerated failure time model and the hazard ratio.6 The Weibull model is restrictive; for example, it cannot accommodate hazards or rates that start at a nonzero value at the beginning of follow-up and increase over time. When failure times do not follow a Weibull model, there will be a separate causal hazard ratio at each follow-up time. One can then characterize that hazard ratio as a function of time, which is unwieldy and difficult to summarize, or estimate the hazard ratio at one or a few specific times, as done by Chevrier et al.1 This approach is of greatest value if the hazard ratio does not vary too much over time. It is not clear in the paper how representative the reported hazard ratio is of the set of hazard ratios over the time of follow-up, and it would be useful to report a hazard ratio at several times if the hazard ratio varies substantially with time.
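
The Weibull correspondence mentioned above can be sketched for the simplest case of a treatment held constant at level $a$. The symbols are assumptions made for this illustration: $T_0$ is the treatment-free failure time with Weibull scale $\lambda$ and shape $\kappa$, and the sign convention is $T_a = T_0 e^{-\psi a}$, so that $\psi > 0$ shortens survival.

```latex
% Treatment-free survival and hazard (Weibull)
S_0(t) = \exp\{-(\lambda t)^{\kappa}\}, \qquad
h_0(t) = \kappa \lambda^{\kappa} t^{\kappa - 1}

% Survival under constant treatment a, using T_a = T_0 e^{-\psi a}
S_a(t) = P\!\left(T_0 e^{-\psi a} > t\right)
       = S_0\!\left(t e^{\psi a}\right)
       = \exp\{-(\lambda t)^{\kappa} e^{\kappa \psi a}\}

% Hazard under treatment and the resulting hazard ratio
h_a(t) = \kappa \lambda^{\kappa} t^{\kappa - 1} e^{\kappa \psi a}
       = h_0(t)\, e^{\kappa \psi a}, \qquad
\frac{h_a(t)}{h_0(t)} = e^{\kappa \psi a}
```

Under these assumptions the hazard ratio is constant over time and depends on the accelerated failure time parameter only through the product $\kappa \psi$. With any other treatment-free distribution the ratio $h_a(t)/h_0(t)$ generally varies with $t$.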

G-estimation of structural nested failure time models suffers further from the consequences of artificial censoring.8 To test a hypothesis about causal effects without making parametric assumptions beyond those in the structural nested failure time models, some subjects whose failure is observed must be treated as if they were censored; subjects artificially censored for testing one hypothesis about causal effects may not be censored for testing other hypotheses.5,6 This results in estimating functions that are not smooth functions of the parameters being tested or estimated. This has several consequences. First, common algorithms (eg, Newton-Raphson) for finding solutions to the estimating functions may not work. Second, estimates of the variance of the estimator become problematic in practice. The problems with artificial censoring may be even deeper when, as is nearly inevitable, the models for causal effects are misspecified because the optimization criterion may no longer be appropriate.

The first 2 problems are not, in principle, serious limitations in attempting to estimate a causal model with a single parameter. One can use line search methods to obtain point estimates and confidence intervals; an implementation of this is available in Stata.9 However, if one is trying to estimate a complex causal model where the effect of treatment may be modified by covariates and prior treatment or where there are dose-response functions for nonbinary treatments, grid searches quickly become infeasible. Sometimes, one can obtain estimates for individual parameters without a high-dimensional grid search. In particular, if there are separate parameters for the effects of earlier and later interventions given in sequence, one can estimate the parameters in reverse chronological sequence.10 The problems with high-dimensional optimization can be mitigated to some degree by using estimating equations that are smoother functions of the parameters being estimated, but this cannot always be relied on.8 In the application presented by Chevrier et al,1 only one parameter is estimated. This will often be a reasonable approximation to the truth in occupational health, where an exposure may be uniformly harmful and the effects may not vary too much with measured covariates. In contrast, when sorting out complex questions about clinical management of patients, where the treatment may be beneficial or harmful in different subgroups and dose-response functions are nonlinear, such 1-parameter models may not answer useful clinical questions. G-estimation may not be useful in this setting. Artificial censoring is not necessary for G-estimation in the other classes of structural nested models; thus, the resulting problems do not complicate estimation of those models. G-estimation of structural nested cumulative failure time models may thus be able to substitute for noncumulative models in some settings with failure times as outcomes.7 Alternatively, parametric methods, despite their pitfalls, might be used to estimate the noncumulative models. Some SAS macros are available for G-estimation of structural nested distribution models in the context of mediation analyses.11
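
A minimal sketch of a one-parameter line search is given below. It is not the Stata implementation cited above, and it is deliberately simplified: a single binary treatment, one binary confounder, simulated data, and no censoring, so no artificial censoring is needed and the estimating function is smooth. All names (`simulate`, `estimating_function`, `line_search`, `psi_true`) are hypothetical, and the model assumed is $T = T_0 e^{-\psi A}$.

```python
import numpy as np


def simulate(n=20000, psi_true=0.5, seed=0):
    """Simulate a point-treatment accelerated failure time model (assumed setup)."""
    rng = np.random.default_rng(seed)
    L = rng.binomial(1, 0.5, n)                       # binary confounder
    T0 = rng.exponential(1.0 / (1.0 + L))             # treatment-free time depends on L
    A = rng.binomial(1, np.where(L == 1, 0.7, 0.3))   # treatment depends on L only
    T = T0 * np.exp(-psi_true * A)                    # observed time under the AFT model
    return L, A, T


def estimating_function(psi, L, A, T):
    """Sum of (A - e(L)) * log H(psi), where H(psi) = T * exp(psi * A)."""
    log_h = np.log(T) + psi * A                       # candidate treatment-free log time
    # Propensity score estimated by the treated fraction within each level of L
    e = np.where(L == 1, A[L == 1].mean(), A[L == 0].mean())
    return np.sum((A - e) * log_h)


def line_search(L, A, T, grid):
    """Return the grid value at which the estimating function is closest to zero."""
    values = np.array([abs(estimating_function(p, L, A, T)) for p in grid])
    return grid[np.argmin(values)]


L, A, T = simulate()
grid = np.linspace(-1.0, 2.0, 301)                    # candidate values of psi
psi_hat = line_search(L, A, T, grid)
print(f"g-estimate of psi: {psi_hat:.2f}")
```

At the true parameter, the back-transformed time equals the treatment-free time, which is independent of treatment given the confounder, so the estimating function has mean zero there. With censoring and artificial censoring the function becomes a step function of the parameter, which is why a search over candidate values, rather than Newton-Raphson, is the natural tool. The cost of a grid grows exponentially with the number of parameters, which illustrates the infeasibility noted above.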

Advantages of G-estimation: Flexibility in Modeling and Assumptions

The authors justify the use of G-estimation by comparing the approach with variants of standard methods, and cite confounding by variables affected by treatment as the reason for using G-estimation. Further, they demonstrate nicely that appropriate methods give different and more reasonable estimates in their application of those methods. As the first applied paper on the healthy-worker survivor effect, this demonstration is appropriate. In general, after many papers on the subject of time-varying confounders, lengthy justification for using appropriate estimation approaches in this setting should no longer be necessary in the epidemiology literature (although it may still be necessary for papers in various clinical specialties).

Marginal structural models have become the de facto standard for dealing with time-varying confounding. An explanation of the reasons for using G-estimation instead of marginal structural models would often be worthwhile. The authors have not provided such an explanation. In occupational health settings like those considered here, estimation of standard marginal structural models typically runs afoul of the required positivity assumption because people who leave work have no subsequent exposure12; thus, G-estimation or G-computation is preferable here. General marginal structural models7 might also be used productively here; for example, one might use them to contrast outcomes under regimes of the type: “continue to be exposed for t years unless one leaves work.”

Structural nested models and G-estimation do have some important but sometimes little recognized advantages over marginal structural models and G-computation. (Further discussion of the drawbacks of structural nested models and G-estimation is available elsewhere.7,13) Robins and Hernan7 note that structural nested models allow one to model the modification of the effects of treatments provided after baseline by using postbaseline time-varying covariates. (History-adjusted marginal structural models somewhat mitigate this restriction of marginal structural models even though they are a modification of marginal structural models.) G-estimation of structural nested models is not adversely affected by the large weights that can characterize estimation of marginal structural models, especially when there is an extended period of follow-up. G-estimation may be used even when treatment assignment is confounded, provided that (1) there is a baseline instrument (eg, randomization in a randomized trial),3,14,15 (2) there is a time-varying instrument,7 or (3) sufficient confounders are measured to render treatment assignment unconfounded in an identifiable subset of the person-time experience in a study, but not in other subsets.16–18 Thus, the assumptions used in estimation of structural nested models can be tailored to fit the specific conditions of the study without having to resort to speculation about the degree of departure of treatment assignment from exchangeability or ignorability, which is required with marginal structural models but difficult in practice. G-estimation may be used to reduce sensitivity of inference to error in measurement of treatment in identifiable subsets of the data.18
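
The point about large weights under extended follow-up can be illustrated with a small simulation. This is a sketch under stated assumptions: the per-period probabilities of the treatment actually received are drawn at random between 0.2 and 0.9 rather than estimated from a fitted treatment model, the weights are unstabilized, and the function name `cumulative_weights` is hypothetical.

```python
import numpy as np


def cumulative_weights(n_subjects, n_periods, seed=0):
    """Unstabilized inverse probability weights accumulated over follow-up."""
    rng = np.random.default_rng(seed)
    # Hypothetical probability of the treatment each subject actually received,
    # one value per subject and period (simulated, not estimated from data)
    p = rng.uniform(0.2, 0.9, size=(n_subjects, n_periods))
    # Weight through period k is the product of 1/p over periods 1..k
    return np.cumprod(1.0 / p, axis=1)


w = cumulative_weights(n_subjects=1000, n_periods=20)
for k in (1, 5, 10, 20):
    col = w[:, k - 1]
    print(f"periods={k:2d}  median weight={np.median(col):.3g}  max weight={col.max():.3g}")
```

Because each period multiplies the weight by a factor greater than 1, every subject's weight can only grow as follow-up lengthens, and a few subjects can come to dominate a weighted analysis. G-estimation does not weight by the inverse of these probabilities, so it is not exposed to this particular source of instability.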

Finally, structural nested distribution models and a variant of G-estimation can sometimes be used to deal with situations in which sufficient information on confounders is lacking in all subsets of the data.19 When there are repeated measures of the outcome, future outcomes may sometimes serve as a proxy for the confounding variables used in making treatment decisions but recorded incompletely at best. This issue may arise when treatment decisions are made more frequently than their determinants are recorded for research—a common situation in epidemiology.

In conclusion, structural nested models and G-estimation have substantial promise that, so far, has remained incompletely realized in practice. Recent work has found additional theoretical advantages of the approach. Nonetheless, there is a need for good software to make the method more accessible, and for convincing applications of the methods to practical problems, especially where other approaches fail. The paper by Chevrier et al is a useful step in this direction.


1. Chevrier J, Picciotto S, Eisen EA. The healthy-worker survivor effect in autoworkers exposed to metalworking fluids: a comparison of standard methods with g-estimation of accelerated failure-time models. Epidemiology. 2012;23:212–219.
2. Robins JM. The analysis of randomized and non-randomized AIDS treatment trials using a new approach to causal inference in longitudinal studies. In: Sechrest LA, ed. Health Service Research Methodology: A Focus on AIDS. Rockville, MD: NCHSR, US Public Health Service; 1989:113–159.
3. Robins JM. Correcting for non-compliance in randomized trials using structural nested mean models. Commun Stat Theory Methods. 1994;23:2379–2412.
4. Robins JM, Rotnitzky A, Scharfstein DO. Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In: Halloran E, Berry D, eds. Statistical Models in Epidemiology. New York: Springer-Verlag; 2000:1–99.
5. Robins J. Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika. 1992;79:321–334.
6. Robins JM, Blevins D, Ritter G, Wulfsohn M. G-estimation of the effect of prophylaxis therapy for pneumocystic carinii pneumonia on the survival of AIDS patients. Epidemiology. 1992;3:319–336.
7. Robins JM, Hernan MA. Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, eds. Longitudinal Data Analysis. Boca Raton, FL: CRC Press; 2009:553–599.
8. Joffe MM, Yang WP, Feldman HI. G-estimation and artificial censoring: problems, challenges, and applications. Biometrics. 2011. doi: 10.1111/j.1541-0420.2011.01656.x. [Epub ahead of print.]
9. Sterne JA, Tilling K. G-estimation of causal effects, allowing for time-varying confounding. Stata J. 2002;2:164–182.
10. Robins JM, Greenland S. Adjusting for differential rates of prophylaxis therapy for PCP in high- versus low-dose AZT treatment arms in an AIDS randomized trial. J Am Stat Assoc. 1994;89:737–749.
11. Ten Have TR, Joffe MM, Lynch KG, Brown GK, Maisto SA, Beck AT. Causal mediation analyses with rank preserving models. Biometrics. 2007;63:926–934.
12. Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560.
13. Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Halloran ME, Berry D, eds. Statistical Models in Epidemiology: The Environment and Clinical Trials. New York: Springer-Verlag; 1999:95–134.
14. Mark SD, Robins JM. A method for the analysis of randomized trials with compliance information: an application to the multiple risk factor intervention trial. Control Clin Trials. 1993;14:79–97.
15. Robins JM, Tsiatis AA. Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Commun Stat Theory Methods. 1991;20:2609–2631.
16. Joffe MM, Hoover DR, Jacobson LP, et al. Estimating the effect of zidovudine on Kaposi's sarcoma from observational data using a rank preserving structural failure-time model. Stat Med. 1998;17:1073–1102.
17. Robins JM. Causal models for estimating the effects of weight gain on mortality. Int J Obesity. 2008;32:S15–S41.
18. Joffe MM, Yang WP, Feldman HI. Selective ignorability assumptions in causal inference. Int J Biostat. 2010;6:11.
19. Zhang M, Joffe MM, Small DS. Causal inference in continuous process with covariates observed at discrete times. Ann Statist. 2010;39:131–173.

MARSHALL JOFFE is Associate Professor of Biostatistics at the University of Pennsylvania. His current research concentrates on inference for the effects of time-varying treatments when the usual assumptions justifying causal inference do not apply. He has applied this work in nephrology and other fields.

© 2012 Lippincott Williams & Wilkins, Inc.