Difference-in-differences (DID) analyses are widely used in a variety of research areas including economics, public policy, and public health.^{1} The approach offers a strategy for estimating the causal effect of a policy, program, intervention, or environmental hazard (hereafter, treatment) on an outcome of interest outside of a randomized trial design. In observational settings, a DID analysis can sometimes be used to obtain unbiased comparisons of outcomes between treatment groups even when those groups are not balanced with respect to unmeasured determinants of the outcome. Specifically, to identify a causal effect in such settings, a DID analysis relies on an assumption that confounding of the treatment effect in the pretreatment period is equivalent to confounding of the treatment effect in the posttreatment period. This condition is sometimes referred to as the “parallel trends assumption” and remains a challenging, but necessary, condition for valid inference in DID analyses.^{2,3}

Here, we propose an alternative approach that can yield the identification of causal effects under different identifying conditions than those usually required for DID. Observations in the pretreatment period provide information on covariate-outcome associations in a setting where the treatment is set to 0, that is, to its control value; we use that information to repurpose a measured confounder of the association of interest as a “bespoke” instrumental variable,^{4} yielding a consistent estimator of the treatment effect in the posttreatment period. We focus on a setting in which a DID analysis might typically be undertaken, where outcomes on each study unit have been measured both before and after treatment. While the assumptions necessary for identification of causal effects in observational studies often may not hold perfectly, access to alternative approaches that can yield identification of causal effects under different identifying conditions can help investigators to triangulate evidence and undertake potentially informative comparative sensitivity analyses.

## METHODS

Let *A*(*i*) denote treatment status, where *A*(*i*)=1 if individual *i* is treated, *A*(*i*)=0 otherwise. Let *Y*(*i*,*t*) be the outcome of interest for individual *i* at time *t*, where a population is observed in two periods: a pre-treatment period, *t*=*t*_{0}; and a post-treatment period, *t*=*t*_{1}. Let ***Z***(*i*,*t*) and ***U***(*i*,*t*) denote measured and unmeasured variables, respectively, that may confound associations between *A*(*i*) and *Y*(*i*,*t*). ***Z*** may denote a vector of measured variables, *Z*_{1},…,*Z*_{P}, and similarly ***U*** may denote a vector of unmeasured variables, *U*_{1},…,*U*_{Q}. For convenience, we use the term “individual” to refer to observed study units; however, the methods discussed apply equally to observations on aggregated units (such as employers, counties, or census tracts).

We define the causal effect of interest in terms of potential outcomes. Let *Y*^{a}(*i*,*t*) denote individual *i*’s potential outcome at time *t* if *A*(*i*) were set, possibly contrary to fact, to *a*. The effect of the treatment on the outcome for individual *i* at time *t* is then *Y*^{1}(*i*,*t*) − *Y*^{0}(*i*,*t*). One cannot observe both potential outcomes *Y*^{1}(*i*,*t*) and *Y*^{0}(*i*,*t*) for a given individual *i* at time *t*, and therefore one cannot compute individual causal effects of treatment. Here, focusing on the post-treatment period, *t*_{1}, we are interested in estimating an average effect of treatment on the treated (ATT), defined as E[*Y*^{1}(*i*,*t*_{1}) − *Y*^{0}(*i*,*t*_{1})|*A*(*i*)=1]. Additional assumptions are needed to identify the average effect of treatment on the total population.

Given our focus on average causal effects, we drop the individual argument *i* to simplify notation. In subsequent discussion we assume the following conditions hold:

- (1) Consistency for the treated, $Y({t}_{1})={Y}^{a}({t}_{1})$ if *A*=*a*;
- (2) Positivity (i.e., there exists a small constant *c*>0 such that, for any *z* with Pr(*Z*=*z*|*A*=1)>*c*, it must be that Pr(*Z*=*z*|*A*=0)>*c*); and
- (3) No anticipation of future treatment (i.e., at *t*_{0} individuals do not anticipate the treatment received at *t*_{1}), such that $E[Y({t}_{0})|Z]=E[{Y}^{a}({t}_{0})|Z]$ for all *a*.

We first describe a standard difference-in-differences approach to identify the ATT. Then, we describe our proposed generalized difference-in-differences approach to identify the ATT.

### Standard Difference-In-Differences

In a standard DID analysis, the pretreatment outcome is subtracted from the posttreatment outcome for every individual. Specifically, the pretreatment outcome among the treated is subtracted from the posttreatment outcome among the treated; in addition, the pretreatment outcome among the untreated is subtracted from the posttreatment outcome among the untreated. The difference of these differences between the treated and the untreated identifies the ATT,

$$ATT = E[Y(t_1)-Y(t_0)\mid A=1]-E[Y(t_1)-Y(t_0)\mid A=0],$$

which is typically justified by conditions 1–3 above as well as the parallel trends assumption,

$$E[Y^{0}(t_1)-Y^{0}(t_0)\mid A=1]=E[Y^{0}(t_1)-Y^{0}(t_0)\mid A=0].$$

The parallel trends assumption, which we ordinarily require for the DID estimand to identify ATT, implies no unmeasured time-varying confounders (i.e., any factor that causes a trend in the outcome over time is independent of treatment, *A*). The standard DID approach allows that there may be changes in the outcome between the pre- and post-treatment periods for reasons other than treatment. Among the untreated, the outcome may vary from the pre-treatment period to the post-treatment period despite there being no treatment applied. By subtracting the temporal change in the outcome among the untreated, E[*Y*(*t*_{1}) - *Y*(*t*_{0})|*A*=0], from the temporal change in the outcome among the treated, E[*Y*(*t*_{1}) - *Y*(*t*_{0})|*A*=1], the DID estimand accounts for change in the outcome over time that is independent of treatment, *A*.
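
The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration on made-up data, not the authors' code: the function name `did_estimate`, the toy data-generating values, and the variable names are all assumptions introduced here. The toy data include a time-constant confounder, so the naive posttreatment contrast is biased while the difference of differences is not.

```python
import random
from statistics import mean


def did_estimate(y_pre, y_post, treated):
    """Mean change among the treated minus mean change among the untreated."""
    change_treated = [post - pre
                      for pre, post, a in zip(y_pre, y_post, treated) if a == 1]
    change_untreated = [post - pre
                        for pre, post, a in zip(y_pre, y_post, treated) if a == 0]
    return mean(change_treated) - mean(change_untreated)


# Toy data (hypothetical): a time-constant confounder u raises both the
# chance of treatment and the outcome level in each period.
rng = random.Random(0)
n = 4000
u = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
a = [1 if rng.random() < (0.7 if ui else 0.3) else 0 for ui in u]
y_pre = [1.0 + 2.0 * ui + rng.gauss(0, 1) for ui in u]
# Secular change of 0.5 for everyone, plus a true treatment effect of 1.0.
y_post = [1.5 + 2.0 * ui + 1.0 * ai + rng.gauss(0, 1) for ui, ai in zip(u, a)]

did = did_estimate(y_pre, y_post, a)
# Naive posttreatment contrast between groups, which is confounded by u.
naive = (mean([y for y, ai in zip(y_post, a) if ai == 1])
         - mean([y for y, ai in zip(y_post, a) if ai == 0]))
print(f"DID estimate: {did:.2f}; naive post-period contrast: {naive:.2f}")
```

In this toy setup the secular change of 0.5 and the constant effect of `u` both cancel in the difference of differences, which is the behavior the paragraph above describes.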

The standard DID approach also allows that, within each period, there may be confounding of the association between treatment and outcome by measured or unmeasured subject-specific characteristics (Figure A). In the pretreatment period, an association between treatment and outcome may be observed despite there being no causal effect of treatment on outcomes in the pretreatment period; such an association would arise due to confounding. For DID to yield a consistent estimator of the causal effect of treatment on posttreatment outcomes, we ordinarily assume that any confounding bias that occurs in the posttreatment period is exactly equal to the association between the treatment and the pre-treatment outcome mean. Sofer et al. noted that the parallel trends assumption is equivalent to the “additive equi-confounding” assumption described in the literature on negative controls, noting that the pre-exposure outcome is a negative control outcome that cannot be influenced by subsequent exposure.^{5} Differencing post- and pretreatment outcomes induces mean independence between any confounder whose effect on the outcome is constant over time and the outcome, *Y*(*t*_{1})−*Y*(*t*_{0}). Therefore, the standard DID estimand is unconfounded by any factor whose effect on the outcome is constant (given everything else being used for adjustment) but is susceptible to confounding by a factor whose effect on the outcome varies over time.
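
As a hedged illustration of why differencing removes confounding by a factor with a time-constant effect, consider a simple linear model for the treatment-free outcome. The model and the symbols $\alpha_t$, $\gamma_t$, and $\varepsilon_t$ are assumptions introduced only for this sketch, with $U$ a confounder and $\varepsilon_t$ mean-zero noise that is mean independent of $A$:

$$Y^{0}(t)=\alpha_t+\gamma_t U+\varepsilon_t,\qquad Y^{0}(t_1)-Y^{0}(t_0)=(\alpha_{t_1}-\alpha_{t_0})+(\gamma_{t_1}-\gamma_{t_0})\,U+(\varepsilon_{t_1}-\varepsilon_{t_0}).$$

Under this model,

$$E[Y^{0}(t_1)-Y^{0}(t_0)\mid A=1]-E[Y^{0}(t_1)-Y^{0}(t_0)\mid A=0]=(\gamma_{t_1}-\gamma_{t_0})\,\{E[U\mid A=1]-E[U\mid A=0]\}.$$

If $\gamma_{t_1}=\gamma_{t_0}$ the right-hand side is zero and parallel trends holds, however strongly $U$ is associated with $A$. If the effect of $U$ changes over time, the right-hand side is nonzero whenever $U$ is associated with $A$, and the DID estimand is biased by exactly that amount.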

### Generalized Difference-In-Differences

Suppose that some unmeasured confounder of the association between treatment and pre-treatment outcomes differs in the post-treatment period (Figure B) such that the parallel trends assumption does not hold because there is a time-varying pretreatment cause of the outcome that is associated with treatment. We may also allow that some measured confounders violate the parallel trends assumption. We now introduce an alternative identifying condition of the average effect of treatment on the treated without invoking the usual DID assumption of parallel trends.

To ground ideas, suppose that in addition to treatment and pre- and post-treatment outcomes, one has observed a confounder *Z*. Our proposed generalized DID repurposes a measured confounder, *Z*, of the association of interest as a bespoke instrumental variable,^{4} replacing the standard DID estimand with an estimand, denoted *ATT*_{GDID}, that identifies the ATT under formal counterfactual assumptions; the GDID approach is justified by conditions formalized below.

A two-stage least squares approach can be used to estimate *ATT*_{GDID}.

*Stage 1*: We first obtain the predicted value of *A* given *Z*, denoted $\hat{A}$, by fitting a linear regression of *A* on *Z*.

*Stage 2*: Then, via OLS, we fit a linear regression of *Y*(*t*_{1})−*Y*(*t*_{0}) on $\hat{A}$.

In the eAppendix, https://links.lww.com/EDE/B983, we provide illustrative SAS and R code to implement our proposed approach for the estimation of ATT and associated 95% confidence intervals derived using a recommended robust variance estimator. The estimated coefficient on $\hat{A}$ in the second-stage model is the GDID estimate of the ATT.
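
The two stages can be sketched as follows. This is a minimal point-estimate illustration in plain Python, not the SAS or R code from the eAppendix: the function names `ols_simple` and `gdid_two_stage` and the eight made-up observations are assumptions introduced here, and a single covariate is used as the bespoke instrument. Standard errors read off a manually run second stage are generally not valid, so this sketch deliberately returns only the coefficient.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) from least squares of y on one regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def gdid_two_stage(z, a, y_pre, y_post):
    """Point estimate from the two-stage procedure with a single covariate z."""
    # Stage 1: linear regression of treatment on z, then predicted treatment.
    g0, g1 = ols_simple(z, a)
    a_hat = [g0 + g1 * zi for zi in z]
    # Stage 2: linear regression of the outcome change on predicted treatment.
    change = [post - pre for pre, post in zip(y_pre, y_post)]
    _, beta = ols_simple(a_hat, change)
    return beta


# Tiny hand-checkable example (made-up numbers): the outcome change is
# 0.5 + 2 * a exactly, and z shifts the treated share from 0.25 to 0.75.
z = [0, 0, 0, 0, 1, 1, 1, 1]
a = [0, 0, 0, 1, 0, 1, 1, 1]
y_pre = [1.0] * 8
y_post = [pre + 0.5 + 2.0 * ai for pre, ai in zip(y_pre, a)]
estimate = gdid_two_stage(z, a, y_pre, y_post)
print(round(estimate, 6))
```

With a single binary covariate, the second-stage slope equals the ratio of the difference in mean outcome change across levels of `z` to the difference in treated share across levels of `z`; in the toy data that ratio is (2.0 − 1.0)/(0.75 − 0.25).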

### Identification

Suppose that *Z* = (*Z*_{1}, *Z*_{2}) and instead of taking all of *Z* as a candidate bespoke instrumental variable, we take *Z*_{1} only as a bespoke instrumental variable, while *Z*_{2} are additional measured covariates that we adjust for.

In this section, we establish formal identification of ATT under conditions 1–3 above, as well as the following conditions:

- (4) *Z*_{1} is relevant for predicting treatment: $E[A|{Z}_{1},{Z}_{2}]$ depends on ${Z}_{1}$;
- (5) No interaction between *A* and *Z*_{1} in causing ${Y}^{a}({t}_{1})$, conditional on ${Z}_{2}$ and $A=1$; and
- (6) The additive association between *Z*_{1} and pre-treatment outcomes is equal to the additive association between *Z*_{1} and posttreatment outcomes (in the absence of treatment).
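
One way to write conditions 5 and 6 formally, consistent with their verbal statements, is sketched here. The reference level $Z_1=0$ and the exact conditioning sets are assumptions of this sketch rather than a quotation of the original displays:

$$\text{(5)}\quad E[Y^{1}(t_1)-Y^{0}(t_1)\mid A=1,Z_1=z_1,Z_2]=E[Y^{1}(t_1)-Y^{0}(t_1)\mid A=1,Z_1=0,Z_2],$$

$$\text{(6)}\quad E[Y^{0}(t_1)\mid Z_1=z_1,Z_2]-E[Y^{0}(t_1)\mid Z_1=0,Z_2]=E[Y^{0}(t_0)\mid Z_1=z_1,Z_2]-E[Y^{0}(t_0)\mid Z_1=0,Z_2],$$

for all $z_1$. Rearranging (6) shows that $E[Y^{0}(t_1)-Y^{0}(t_0)\mid Z_1=z_1,Z_2]$ does not depend on $z_1$, which is the sense in which the treatment-free outcome change is mean independent of $Z_1$.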

*Result:* Under conditions 1–6 we have that for all

Proof of the result is given in eAppendix 1, https://links.lww.com/EDE/B983. We refer to the identifying formula obtained in the result as GDID. Under linear model specifications

in eAppendix 1, https://links.lww.com/EDE/B983, we show that the standard two-stage least squares approach described in the previous section, further adjusted for *Z*_{2} in both stages, obtains a consistent estimator of the ATT.

Condition 4, which states that *Z*_{1} is associated with treatment, *A*, holds by definition when *Z*_{1} is a confounder of the association of interest. Condition 5 is analogous to a no-interaction assumption routinely made in the instrumental variable setting.^{6}^{,}^{7} This holds by definition under the null hypothesis of no conditional effect of treatment in the treated; it follows that the proposed approach produces a valid test of the sharp causal null hypothesis provided that the remaining assumptions hold. Condition 6 can essentially be interpreted as a *Z*_{1} parallel trend assumption conditional on *Z*_{2}; it states that the additive association between *Z*_{1} and *Y*^{0}(*t*_{0}) is equal to the additive association between *Z*_{1} and *Y*^{0}(*t*_{1}). Standard DID fails if there is an unmeasured time-varying confounder (i.e., a violation of the parallel trends assumption). Condition 6 is a modest alternative identifying condition: it holds if we can identify a measured covariate, *Z*_{1}, whose association with the treatment-free outcome does not vary over times *t*_{0}, *t*_{1}. When condition 6 holds, *Y*^{0}(*t*_{1})−*Y*^{0}(*t*_{0}), by definition, is mean independent of *Z*_{1}; this permits our bespoke instrumental variable^{4} approach to the estimation of ATT.

An intuition for our identification result (suppressing *Z*_{2} here) may follow by noting that the change in observed outcomes between *t*_{0} and *t*_{1} conditional on *Z*_{1} = *z*_{1} is the result of (1) the change in untreated potential outcomes between *t*_{0} and *t*_{1}; (2) the causal effect of the treatment at *t*_{1} on the treated; and (3) the proportion of treated units, P(*A* = 1|*Z*_{1} = *z*_{1}). Under our identifying conditions, the change in untreated potential outcomes conditional on *Z*_{1} = *z*_{1} is equal to the change in untreated potential outcomes conditional on *Z*_{1} = 0. Therefore, to recover the causal effect of the treatment at *t*_{1} for treated units at *Z*_{1} = *z*_{1}, we just need to subtract off the change in observed outcomes among untreated units conditional on *Z*_{1} = 0 and then account for the proportion of treated units, P(*A* = 1|*Z*_{1} = *z*_{1}).
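
This intuition can be traced in a short derivation, sketched here with $Z_2$ suppressed and $Z_1$ binary; the symbols $c$ and $\beta$ are introduced only for the sketch. By consistency and no anticipation,

$$E[Y(t_1)-Y(t_0)\mid Z_1=z_1]=E[Y^{0}(t_1)-Y^{0}(t_0)\mid Z_1=z_1]+E[Y^{1}(t_1)-Y^{0}(t_1)\mid A=1,Z_1=z_1]\;P(A=1\mid Z_1=z_1).$$

If the first term on the right equals a constant $c$ for every $z_1$ (condition 6) and the treatment effect among the treated equals a constant $\beta$ for every $z_1$ (condition 5), then $E[Y(t_1)-Y(t_0)\mid Z_1=z_1]=c+\beta\,P(A=1\mid Z_1=z_1)$. Taking the difference between $z_1=1$ and $z_1=0$ removes $c$:

$$\beta=\frac{E[Y(t_1)-Y(t_0)\mid Z_1=1]-E[Y(t_1)-Y(t_0)\mid Z_1=0]}{P(A=1\mid Z_1=1)-P(A=1\mid Z_1=0)}.$$

The denominator is nonzero when $Z_1$ is relevant for treatment (condition 4). Because $\beta$ does not vary with $z_1$, it equals the ATT, and the ratio is the quantity that two-stage least squares with a single binary instrument estimates.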

### Simulation

We simulated data for 1,000 studies, with 5,000 people in each study sample, with people observed in pre- (*t*=0) and posttreatment (*t*=1) periods. We generated simulations for two scenarios. In the first scenario, which conformed to the “parallel trends” assumption (Figure A), we generated a measured covariate, denoted *Z*_{1}, and an unmeasured covariate, denoted *U*_{1}. *Z*_{1} and *U*_{1} were random binary variables, both taking values of 1 with a probability of 0.5. We assigned *A* as a random binary variable that took a value of 1 with probability 1/(1+exp(−(−0.1 −0.5×*U*_{1} + *Z*_{1}))). We considered cases in which *Z*_{1} is strongly, moderately, or weakly associated with *A*. The pretreatment outcome variable, *Y*(*t*=0), took a value of (1 + 1×*U*_{1} +1×*Z*_{1} +*ε*), where *ε*~*N*(0,1); and the posttreatment outcome variable, *Y*(*t*=1), took a value of (1 + 1×*U*_{1} +1×*Z*_{1} +1×*A*+*ε*). In the second scenario, which violated the parallel trends assumption (Figure B), we generated an additional covariate, denoted *U*_{2}, that was a continuous variable sampled from a standard normal distribution, *N*(0,1). We assigned *A* as a random binary variable that took a value of 1 with probability 1/(1+exp(−(−0.1 −0.5×*U*_{1} −0.5×*U*_{2} + *Z*_{1}))). The pretreatment outcome, *Y*(*t*=0), took a value of (1 + 1×*U*_{1} +1×*U*_{2} +1×*Z*_{1} +*ε*); and the posttreatment outcome, *Y*(*t*=1), took a value of (1 + 1×*U*_{1} +1×*Z*_{1} +1×*A*+*ε*).
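
The two data-generating scenarios can be sketched in Python as below. This is an illustrative reading of the description above rather than the authors' R code. Two points are assumptions of the sketch: the noise term *ε* is drawn independently in each period, and the strength of the *Z*_{1} to *A* association is varied through a hypothetical `z_coef` argument whose default of 1.0 is the coefficient printed in the treatment model (the values used for the strong, moderate, and weak cases are not stated in the text).

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_study(n, scenario, rng, z_coef=1.0):
    """Return a list of (z1, a, y_pre, y_post) tuples for one simulated study.

    scenario 1 conforms to parallel trends; scenario 2 adds u2, which affects
    treatment and the pretreatment outcome but not the posttreatment outcome.
    """
    if scenario not in (1, 2):
        raise ValueError("scenario must be 1 or 2")
    rows = []
    for _ in range(n):
        z1 = 1 if rng.random() < 0.5 else 0
        u1 = 1 if rng.random() < 0.5 else 0
        u2 = rng.gauss(0, 1) if scenario == 2 else 0.0
        p_treat = expit(-0.1 - 0.5 * u1 - 0.5 * u2 + z_coef * z1)
        a = 1 if rng.random() < p_treat else 0
        # Independent noise in each period (an assumption of this sketch).
        y_pre = 1 + u1 + u2 + z1 + rng.gauss(0, 1)
        y_post = 1 + u1 + z1 + a + rng.gauss(0, 1)
        rows.append((z1, a, y_pre, y_post))
    return rows


study = simulate_study(n=5000, scenario=2, rng=random.Random(2023))
treated_share = sum(row[1] for row in study) / len(study)
print(f"treated share in one simulated study: {treated_share:.2f}")
```

In scenario 2, `u2` enters only the pretreatment outcome, so the outcome change depends on a variable that also drives treatment, which is how the parallel trends assumption is broken in this design.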

We used the GDID method described in this paper to obtain an estimate of the average change in *Y*(*t*=1) with *A* by a two-stage regression. In all simulations, in the first-stage model, *Z*_{1} was used to predict *A* in a linear regression; then, in the second stage, we fitted a linear regression model for *Y*(*t*=1)−*Y*(*t*=0) as a function of the predicted value of *A* given *Z*_{1}.^{8} We use the rule of thumb that the first-stage F-statistic needs to be larger than 10 for the usual asymptotic inference to be reliable.^{9} For comparison, we fitted a difference-in-differences model; a linear regression model was fitted to each simulated cohort for *Y*(*t*=1)−*Y*(*t*=0) as a function of *A*. We summarized results from the simulated studies by computing the Monte Carlo mean and Monte Carlo standard deviation (SD) of the estimates, the square root of the mean squared difference between the estimated associations and the specified true effect of *A* on *Y* (the root mean squared error, RMSE), the average of the standard errors (SEs), and the coverage probability (CP) of 95% confidence intervals based on a normal approximation. We also reported the Monte Carlo mean of the first-stage F-statistic. The eAppendix, https://links.lww.com/EDE/B983, reports results of additional simulations in which: i) *U*_{1} was associated with *Z*_{1}; and ii) *U*_{2} was associated with *Z*_{1} (i.e., violating the condition of *Z*_{1} additive equi-confounding). R code for the GDID method and for reproducing numerical examples can be found in the supplementary material, https://links.lww.com/EDE/B983.
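
The summary measures described above can be computed as in the following sketch. The function names `summarize` and `first_stage_f`, and the small made-up inputs, are assumptions introduced for illustration. The F-statistic shown is the classical (homoskedastic) one for a single regressor, which may differ from the version used in the authors' code.

```python
import math
from statistics import mean, stdev


def summarize(estimates, std_errors, truth):
    """Monte Carlo mean, SD, RMSE, average SE, and 95% CI coverage (in %)."""
    rmse = math.sqrt(mean([(e - truth) ** 2 for e in estimates]))
    # A normal-approximation interval covers the truth if |estimate - truth|
    # is at most 1.96 standard errors.
    covered = [1 if abs(e - truth) <= 1.96 * se else 0
               for e, se in zip(estimates, std_errors)]
    return {
        "mean": mean(estimates),
        "sd": stdev(estimates),
        "rmse": rmse,
        "avg_se": mean(std_errors),
        "cp": 100.0 * sum(covered) / len(covered),
    }


def first_stage_f(z, a):
    """Classical F-statistic for the slope in a regression of a on a single z."""
    n = len(z)
    z_bar, a_bar = mean(z), mean(a)
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (ai - a_bar) for zi, ai in zip(z, a)) / s_zz
    intercept = a_bar - slope * z_bar
    rss = sum((ai - intercept - slope * zi) ** 2 for zi, ai in zip(z, a))
    slope_var = (rss / (n - 2)) / s_zz
    # With one regressor, the F-statistic is the squared t-statistic.
    return slope ** 2 / slope_var


# Made-up inputs, chosen only so the results are easy to check by hand.
summary = summarize([0.9, 1.1, 1.0, 1.3], [0.1, 0.1, 0.1, 0.1], truth=1.0)
f_stat = first_stage_f([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 0, 1, 1, 1])
print(summary, f_stat)
```

In a full simulation, `estimates` and `std_errors` would hold one entry per simulated study, and `first_stage_f` would be averaged across studies to give the mean F-statistic.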

### Empirical example 1: Card and Krueger

This empirical example is based on Card and Krueger’s landmark study^{10} of the impact on employment of an increase in minimum wage in New Jersey (NJ). In early 1990 the NJ legislature increased the state minimum wage to $5.05 per hour effective April 1, 1992. Card and Krueger surveyed fast-food restaurants (Burger King, KFC, Wendy’s, and Roy Rogers chains) in NJ and eastern Pennsylvania before the increase in the minimum wage (February 15–March 4, 1992, denoted *t*=0) and after the increase in the minimum wage (November 5–December 31, 1992, denoted *t*=1). The survey was conducted primarily by telephone and included questions on employment, starting wages, the price of a full meal (medium drink, small fries, and an entree), and other store characteristics. The outcome variable of primary interest, denoted *Y*(*t*), is employment per store measured in full-time equivalents and calculated as the number of full-time workers (including managers) plus 0.5 times the number of part-time workers. The treatment variable of primary interest is *state*, coded 1 for NJ, else 0. First, we fitted a standard difference-in-differences model; a linear regression model was fitted for *Y*(*t*=1)−*Y*(*t*=0) as a function of *state*. Next, we fitted our proposed GDID. The “bespoke IV” was the price of a full meal. The variable was chosen both because it was relevant for predicting treatment (i.e., *state*), and because it is reasonable to posit that the association between full meal price and employment levels before the increase in the NJ minimum wage is likely to be approximately equal to the association between full meal price and employment levels in the posttreatment period (had the NJ minimum wage not changed), as the two associations are ascertained within a relatively short period (approximately 9 months).
In a first-stage model, indicators for quintiles of the price of a full meal were used to predict *state* (NJ=1, else 0) in a linear regression; we note that the price of a full meal is associated with *state* (being higher in NJ than PA), and changed little between pre-and post-increase in minimum wage periods (i.e., the average reported price of a full meal differed by 1 cent in PA over the survey periods, and differed by 6 cents in NJ over the survey periods). In the second stage, a linear regression model was fitted for *Y*(*t*=1)−*Y*(*t*=0) as a function of the predicted value of *state* given the price of a full meal.
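
A first stage built on quintile indicators can be sketched as follows. This is an illustration on made-up numbers, not the survey data, and the function names are assumptions introduced here. It relies on the fact that a linear regression of treatment on a full set of group indicators returns the within-group treated proportion as the fitted value.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def quintile_groups(x):
    """Assign each value to a quintile group labeled 0 through 4."""
    cuts = quantiles(x, n=5)  # four cut points
    return [bisect_right(cuts, xi) for xi in x]


def fitted_from_indicators(groups, a):
    """Fitted values from regressing a on group indicators (group means)."""
    by_group = {}
    for g, ai in zip(groups, a):
        by_group.setdefault(g, []).append(ai)
    group_mean = {g: mean(values) for g, values in by_group.items()}
    return [group_mean[g] for g in groups]


def ols_slope(x, y):
    """Slope from least squares of y on one regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx


# Hypothetical toy data: 20 units, a covariate taking values 1..20, a treated
# share that rises across quintiles, and an outcome change of 0.5 + 2 * a.
covariate = list(range(1, 21))
a = [0, 0, 0, 0,  0, 0, 0, 1,  0, 0, 1, 1,  0, 1, 1, 1,  1, 1, 1, 1]
change = [0.5 + 2.0 * ai for ai in a]

groups = quintile_groups(covariate)
a_hat = fitted_from_indicators(groups, a)   # stage 1
estimate = ols_slope(a_hat, change)         # stage 2
print(round(estimate, 6))
```

Using several indicators instead of one linear term lets the first stage capture a nonlinear relation between the covariate and treatment, at the cost of estimating more first-stage parameters.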

### Empirical example 2: Health Insurance Subsidy Program

We use data from the Health Insurance Subsidy Program (HISP), a case example developed by the World Bank and modeled on real-world impact evaluations.^{11} One of the primary objectives of HISP is to reduce the burden of health-related out-of-pocket expenditures for low-income households. The data are at the level of household and period and include a baseline (*t*=0) and follow-up survey (*t*=1). The outcome variable of primary interest, denoted *Y*(*t*), is out-of-pocket health expenditure (per capita per year); the intervention of primary interest is a binary indicator of whether the household enrolled in HISP (0=no, 1=yes). Here we restrict to data from localities where the program has been offered. First, a standard DID estimate is obtained by linear regression in which *Y*(*t*=1)−*Y*(*t*=0) is regressed on HISP. However, one might question whether the parallel trends assumption required for standard DID holds. Next, we fitted our proposed GDID. The bespoke IV was age of the head of the household (in years). The variable was chosen both because it was relevant for predicting treatment (i.e., whether the household enrolled in HISP), and because it is plausible that the association between age of head of household and health expenditure before enrolling in HISP is equal to the association between age of head of household and health expenditure in the posttreatment period (in the absence of HISP). In a first-stage model, indicators for quintiles of age of the head of the household were used to predict HISP in a linear regression. In the second stage, a linear regression model was fitted for *Y*(*t*=1)−*Y*(*t*=0) as a function of the predicted value of HISP.

## RESULTS

### Simulation Example

Under the first simulation scenario, which conformed to the parallel trends assumption, our GDID two-stage regression estimator of the exposure–outcome association suffered no bias due to confounding by *U*, even though *U* is an unmeasured variable. Similarly, when using a difference-in-differences approach, the estimator of the exposure–outcome association suffered no bias due to confounding by *U*. The statistical efficiency of the GDID estimator diminished with diminishing magnitude of association between *Z*_{1} and *A* (Table).

Under the second simulation scenario, which violated the “parallel trends” assumption, the GDID estimator of the exposure–outcome association suffered no bias. In contrast, the DID estimator suffered bias; in addition, the estimated RMSE was larger for the standard DID estimator than for the GDID estimator when the association between *Z*_{1} and *A* was strong or moderate (Table).

The eTable, https://links.lww.com/EDE/B983, in eAppendix 3, https://links.lww.com/EDE/B983, provides results of additional simulation scenarios. Simulation A1 conforms to the “parallel trends” assumption except that *U*_{1} affects *Z*_{1}. Neither the standard DID nor the GDID estimator was biased. Simulation A2 violates the parallel trends assumption, *U*_{1} affects *Z*_{1}, and we included an additional covariate *U*_{2}. The standard DID estimator was biased while GDID suffered no bias. Simulation A3 violates the parallel trends assumption and *U*_{2} affects *Z*_{1} (violating the condition of *Z*_{1} additive equi-confounding). The DID estimator suffered bias and the GDID estimator suffered bias. In simulations A1 and A3, and in simulation A2 when the association between *Z*_{1} and *A* was weak, the estimated RMSE was smaller for the DID estimator than for the GDID estimator. In contrast, in simulation A2 when the association between *Z*_{1} and *A* is strong or moderate, the estimated RMSE was larger for the DID estimator than for GDID.

### Empirical Example 1

In the pre-period, average employment was 23.33 full-time equivalent workers per store in Pennsylvania and 20.4 full-time equivalent workers per store in New Jersey. In the post-period, full-time equivalent employment increased in New Jersey relative to Pennsylvania. A standard DID estimator yields an estimate of the relative gain in employment of 2.33 (s.e.=1.19) full-time equivalent employees. Our proposed GDID yielded a slightly larger magnitude of estimate of the relative gain in employment albeit with less precision (estimate=3.12, s.e.=3.69), with the first-stage F-statistic being 10.29.

### Empirical Example 2

The estimated mean household health expenditure (in dollars) for enrolled households was 14.49 before HISP and 7.84 after the introduction of HISP. For non-enrolled households, the estimated mean was 20.79 before HISP and 22.30 after the introduction of HISP. A standard difference-in-differences estimate of the causal effect of HISP on household health expenditures was −8.16 (s.e.=0.32). The GDID approach yielded an estimate of −8.50 (s.e.=0.77), with the first-stage F-statistic being 168.13.
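
The standard DID estimate can be checked directly from the four group means reported above, taking the change among enrolled households minus the change among non-enrolled households:

$$(7.84-14.49)-(22.30-20.79)=-6.65-1.51=-8.16.$$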

## DISCUSSION

We propose a novel approach to the analysis of data that conform to the DID design. A standard DID estimator allows identification of the average causal effect of treatment on the treated under the parallel trends assumption (Figure A). However, it may yield a biased estimate of the effect of treatment on the treated if an unmeasured confounder varies between pre- and posttreatment periods (Figure B). The GDID approach allows identification of the average causal effect of treatment on the treated under the causal structures illustrated in both parts of the Figure.

We exchange the usual DID parallel trends assumption for a different set of assumptions. Specifically, the GDID approach requires assumptions about a measured covariate, *Z*_{1}, that we can select (from among those measured variables): *Z*_{1} predicts treatment, *A*; no statistical interaction between *Z*_{1} and *A* in causing *Y*(*t*_{1}); in addition, the additive association of *Z*_{1} with *Y*^{0}(*t*_{0}) and *Y*^{0}(*t*_{1}) is equivalent. This set of identifying conditions may be an appealing alternative to the usual DID parallel trends assumption. Several prior publications have described methods for researchers to draw causal inferences if parallel trends is violated, including an inverse probability weighting method for DID,^{12} an outcome regression modeling approach,^{13} and a doubly robust approach.^{14} Those methods accommodate violations of parallel trends by modeling measured covariates, so that parallel trends are assumed to hold upon conditioning on measured covariates. Other proposed methods relax the parallel trends assumption by placing bounds on how much unmeasured confounders may affect the untreated potential outcomes.^{15}^{,}^{16} In contrast, our approach further accommodates violations of the parallel trends assumption by unmeasured covariates. Our proposed GDID method is a new alternative in this broader suite of methods that seek to relax the parallel trends assumption.

While we have described the approach, and illustrated it, with examples where we identify a single covariate, *Z*_{1}, whose effect on *Y*^{0}(*t*) is invariant over *t*=0,1, the GDID approach readily extends to incorporating several covariates; leveraging several covariates may be appealing because it may offer a way to strengthen their relevance for predicting treatment. Our empirical examples leverage publicly available data sets to illustrate the method; we recognize that the assumptions necessary for the identification of causal effects in these empirical examples may not hold perfectly (e.g., a condition may be violated for a selected variable, *Z*_{1}, employed in a specific empirical setting), but this serves to underscore the value of sensitivity analyses and access to methods that leverage alternative identifying conditions. Similarly, the approach also may be extended to explicitly model the effects of other measured covariates, *Z*_{2}, in the second-stage regression. Although our primary focus is on analysis of nonexperimental data, the GDID method also may have utility for analyses of data derived from experimental designs, such as A/B tests, where a classical DID analysis is sometimes used to adjust for pre-test differences in covariates between the groups under comparison (i.e., unanticipated imbalance across treatment groups in covariates).

Connections between instrumental variables and DID have been discussed by prior authors. For example, Ye et al. (2021) discuss an “instrumented” DID, which leverages exogenous random variation in treatment within a standard DID framework.^{17} The proposed GDID has some similarities to the instrumented DID proposed by Ye et al. (2021), but also has notable differences. First, the GDID considers the same setting as the standard DID, where data are observed for treated and control units before and after the treated units adopt the treatment while the control units are never treated. The GDID also considers the same parameter of interest as the standard DID, that is, the average treatment effect for the treated in the post-treatment period. In contrast, the instrumented DID considers the setting where units can be treated or untreated at either time point and takes the average treatment effect as the parameter of interest. Second, the GDID assumes that *Z*_{1} does not modify the average treatment effect for the treated in the post-treatment period, while the instrumented DID assumes that the instrumental variable for DID is independent of the treatment effect and that the treatment effect is time-invariant. The “*Z*_{1} parallel trends assumption” made by the GDID is essentially shared by the instrumented DID. Interestingly, the identification formulas of these two methods are the same when *Z*_{1} is binary and no units are treated in the pretreatment period, which provides the identification formula with two different interpretations; this is analogous to the familiar observation that the standard Wald ratio estimator used in a standard IV analysis can be interpreted as the average treatment effect under a no unmeasured common effect modifier assumption^{18} and can be interpreted as the average treatment effect for the treated under a no current value interaction assumption.^{7}

As our simulations illustrated, when the parallel trends assumption holds, the standard DID approach is more statistically efficient than our proposed two-stage least squares estimator of the GDID identifying estimand, although when the association between *Z*_{1} and *A* is moderate or strong, our proposed estimator has RMSE close to that of the standard DID. Usefully, nested within the GDID model is a reduced model that implies parallel trends, obtained by imposing a constraint on the GDID model. When the parallel trends assumption is violated, there is no interaction between *A* and *Z*_{1} in causing the outcome, and the association between *Z*_{1} and *A* is moderate or strong, our simulations indicate that the proposed approach has a smaller RMSE than the standard DID. While the GDID has a cost in terms of statistical precision reflecting the different identifying conditions and bespoke IV estimation approach, it will allow for avoiding bias in certain settings of unmeasured time-varying confounding. Of course, if the cost in terms of precision is high, the root mean squared error of the GDID approach may exceed that of the standard DID (as illustrated in some simulations); this is particularly true when the parallel trends assumption holds (or nearly holds).

The GDID approach has the potential to be used in routine policy evaluation across many disciplines, as it essentially combines two popular quasiexperimental designs, leveraging their strengths while relaxing their usual assumptions (which also provides overidentification specification tests). The GDID approach can be used with any available measured confounder without requiring unmeasured confounding, and the stronger the association of the measured confounder with the treatment, the stronger the resulting instrument. The GDID approach will tend to sacrifice some statistical efficiency to reduce potential bias due to time-varying unmeasured confounders. In non-experimental studies, this may often be a desirable trade-off.

**Figure.** Causal diagrams (panels A and B) relating treatment, *A*, measured covariate, *Z*, unmeasured covariate, *U*, and outcome, *Y*.

Scenario | Mean | SD | RMSE | SE | CP (%) |
---|---|---|---|---|---|
**Scenario 1 (conforms to “parallel trends”)** | | | | | |
*Strong bespoke IV (mean F-statistic 1171.8)* | | | | | |
GDID method | 1.00 | 0.09 | 0.09 | 0.10 | 95.2 |
Standard DID method | 1.00 | 0.04 | 0.04 | 0.04 | 96.1 |
*Moderate bespoke IV (mean F-statistic 308.5)* | | | | | |
GDID method | 1.00 | 0.16 | 0.16 | 0.17 | 95.3 |
Standard DID method | 1.00 | 0.04 | 0.04 | 0.04 | 96.1 |
*Weak bespoke IV (mean F-statistic 49.7)* | | | | | |
GDID method | 1.00 | 0.42 | 0.42 | 0.42 | 96.0 |
Standard DID method | 1.00 | 0.04 | 0.04 | 0.04 | 94.5 |
**Scenario 2 (violates “parallel trends”)** | | | | | |
*Strong bespoke IV (mean F-statistic 1067.2)* | | | | | |
GDID method | 1.00 | 0.12 | 0.12 | 0.12 | 95.7 |
Standard DID method | 1.39 | 0.05 | 0.39 | 0.05 | 0.0 |
*Moderate bespoke IV (mean F-statistic 277.6)* | | | | | |
GDID method | 1.01 | 0.21 | 0.21 | 0.22 | 95.7 |
Standard DID method | 1.44 | 0.05 | 0.44 | 0.05 | 0.0 |
*Weak bespoke IV (mean F-statistic 44.0)* | | | | | |
GDID method | 1.01 | 0.55 | 0.55 | 0.55 | 96.5 |
Standard DID method | 1.46 | 0.05 | 0.47 | 0.05 | 0.0 |

## ACKNOWLEDGMENTS

The authors thank Sander Greenland for his helpful comments on a draft of this manuscript.

**Keywords:** regression analysis; cohort studies; instrumental variables; unmeasured confounding