When estimating the effect of an exposure on an outcome in the time-varying setting, epidemiologists routinely target the average causal effect, which compares counterfactual outcomes had one intervened to expose all versus none of a sample across all time points. However, there is a growing recognition that the average causal effect is an unrealistic contrast because, even in time-fixed settings, it is difficult to imagine an intervention that would result in all individuals in a population being exposed or unexposed.^{1} The interventions implied by the average causal effect are even more challenging to imagine in the time-varying setting, where we have to assume we can set all individuals to being exposed at all time points. Furthermore, the longer we follow participants, the more unrealistic such an intervention becomes.

Fortunately, there exist alternative causal estimands to the average causal effect. One such estimand—the incremental effect—allows us to estimate the effect of shifting each individual’s probability of being exposed, instead of intervening on the exact, fixed value of the exposure as in the average causal effect.^{2–4} Estimation of incremental effects has several advantages. First, depending on the exposure of interest and target population, this approach may better reflect the impact of a realistic public health intervention. For example, realistic interventions to increase patient adherence to medication (e.g., daily cellphone notifications) might on average increase patients’ likelihood of taking their medication but are unlikely to result in all patients always adhering. Interventions that would lead to perfect adherence (e.g., daily nurse visits) are likely to be costly or unethical. Exposure interventions are also not typically applied uniformly in populations, so a fixed intervention like that imagined by the average causal effect may not be of practical policy interest.^{3} Second, incremental effects can be estimated using double-robust methods, which enable balancing the tradeoff between bias from the curse of dimensionality and bias from potential statistical model misspecification. Specifically, the approach we use here can achieve optimal statistical properties (root-n convergence) regardless of the number of timepoints—even when implemented with flexible machine learning tools.^{4}^{,}^{5} Third, identification of incremental effects does not require meeting the positivity assumption, which makes this an attractive approach in settings where either structural or random violations of positivity may be likely.^{6}

Prior work has described the theory and motivation behind the estimation of incremental effects.^{2}^{,}^{4} Naimi et al.^{3} demonstrated how to estimate incremental effects in time-fixed settings, using as an example the effect of increasing vegetable intake on the risk of preeclampsia. Here, we demonstrate how to estimate these effects in longitudinal data with a time-varying exposure, time-varying confounding, and drop-out. We build on the applied example in Kim et al.^{4} by describing in depth how to estimate the effect of taking preconception low-dose aspirin on the incidence of pregnancy in the Effects of Aspirin in Gestation and Reproduction (EAGeR) trial. This work was motivated by the challenge of analyzing the effects of a time-varying exposure (adherence to aspirin) that suffered from nonpositivity as follow-up accrued.

## METHODS

### Study Sample

The EAGeR trial was a double-blind trial, designed to investigate whether taking preconception low-dose aspirin had an impact on pregnancy outcomes.^{7}^{,}^{8} The study enrolled 1228 women at high risk for pregnancy loss and randomized women 1:1 to receive 81 mg of aspirin or placebo; all women additionally received 400 mcg of folic acid. Participants were followed up for 6 menstrual cycles if they did not become pregnant and, if they did become pregnant, throughout pregnancy. Women were allowed to leave the study at any point during follow-up. The trial’s primary outcome was live birth, with additional outcomes of interest including pregnancy and preterm birth. EAGeR participants provided written informed consent to participate in the trial. Our secondary analysis fell under the approval of the Institutional Review Board of the University of Pittsburgh, which deemed the work not human subjects research.

Here, we focus on the incidence of pregnancy by 26 weeks of follow-up (approximately six menstrual cycles). Pregnancy was determined by either a positive result on a “real-time” urine pregnancy test carried out at home or at a study visit or from urine testing conducted on stored samples after study completion.^{7} During this time period, 116 (9.5%) of the 1226 women included in our analysis dropped out of the study (two enrolled women were excluded because all of their data were missing).

The intention-to-treat analysis of the EAGeR trial reported a small increase in rates of pregnancy among those assigned to aspirin relative to placebo.^{8} However, there was notable noncompliance with assigned treatment, which increased as follow-up accrued (Figure 1A). Thus, a per-protocol analysis was conducted that assessed the effect on pregnancy outcomes of being assigned to aspirin and complying in each week of follow-up versus being assigned to placebo and always complying.^{9} Compliance with randomized treatment was determined based on bottle weight measurements and was defined as taking an assigned pill 5 out of 7 days in a given week. This per-protocol analysis reported a small increase in the incidence of pregnancy among those compliant with aspirin, relative to those compliant with placebo; specifically, the estimated risk difference was 7.8% (95% CI = 4.6%, 11%).

### Defining the Causal Effect of Interest

We often define the per-protocol effect as the effect of assigning everyone to a given treatment and intervening to ensure they always comply with a specified protocol versus assigning everyone to a comparator treatment and intervening to ensure they always comply:^{10}

$$\psi = \Pr\left(Y_t^{\bar{a}_t = 1} = 1\right) - \Pr\left(Y_t^{\bar{a}_t = 0} = 1\right)$$

where $Y_t^{\bar{a}_t}$ is the counterfactual outcome at time $t$ had exposure (here, compliance) been set to $\bar{a}_t$ through time $t$.^{11}^{,}^{12}

However, suppose we thought it unrealistic to model an intervention that would force all women to comply in all weeks of follow-up, but we thought we could instead model an intervention that would increase women’s probability of complying. If this were the case, we might be interested in targeting an incremental effect, rather than the average causal effect. Incremental effects in the longitudinal setting have been described in detail elsewhere;^{4} here, we provide an overview of the method in the context of our application.

In the EAGeR per-protocol analysis, we could estimate the risk of the outcome through a certain time $t$ by modeling the propensity score

$$\pi_t(H_t) = \Pr(A_t = 1 \mid H_t)$$

where $A_t$ is the exposure (compliance) in week $t$ and $H_t$ is the history of exposure and covariates through week $t$.

To shift each woman's probability of complying, we intervene on the propensity score itself, multiplying the odds of complying by a user-specified increment $\delta$, which yields the shifted propensity score:

$$q_t(H_t; \delta) = \frac{\delta \pi_t(H_t)}{\delta \pi_t(H_t) + 1 - \pi_t(H_t)}$$

Rewriting this equation, we can see that $\delta$ is an odds ratio, comparing the odds of exposure under the shifted propensity score with the odds of exposure under the observed propensity score:

$$\delta = \frac{q_t(H_t;\delta) / \{1 - q_t(H_t;\delta)\}}{\pi_t(H_t) / \{1 - \pi_t(H_t)\}}$$

We are then interested in estimating what the average counterfactual outcomes (i.e., counterfactual risk of the outcome) would be under exposure to the shifted propensity score:

$$\psi_t(\delta) = \Pr\left(Y_t^{\bar{q}_t(\delta)} = 1\right)$$

where $Y_t^{\bar{q}_t(\delta)}$ is the counterfactual outcome had exposure at each time $s = 1, \ldots, t$ been drawn from the shifted propensity score $q_s(H_s; \delta)$.
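To make the shift concrete, the mapping from observed to shifted propensity score can be checked numerically. A minimal sketch (illustrative values only, not trial data):

```python
def shifted_ps(pi: float, delta: float) -> float:
    """Shifted propensity score: multiply the odds of exposure by delta."""
    return delta * pi / (delta * pi + 1 - pi)

# example: a woman with a 40% probability of complying, under delta = 2
pi, delta = 0.4, 2.0
q = shifted_ps(pi, delta)  # 0.8 / 1.4, roughly 0.571

# delta is recovered as the odds ratio comparing shifted with observed odds
odds_ratio = (q / (1 - q)) / (pi / (1 - pi))  # equals delta
```

Note that the shift acts multiplicatively on the odds scale, so a woman's propensity moves toward (but never to) 0 or 1 as $\delta$ shrinks or grows.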

The double-robust estimator for longitudinal incremental effects combines information on the shifted propensity score $q_t$, the observed propensity score $\pi_t$, the probability of not dropping out $\omega_t$, and a set of sequential “pseudo” regression functions $m_t$, which we define in the estimation steps below. For simplicity, we have removed the subscript indexing individuals. Although different in form, this estimator is algebraically equivalent to the one presented in Kim et al.^{4}

### Identifying Incremental Effects

To interpret an estimate obtained in the observed data as the targeted causal effect, meeting certain identification conditions is required. For the standard g-methods, one sufficient set of conditions includes exchangeability, causal consistency, and positivity.^{13} One complication of estimating average causal effects in data with time-varying exposures and long follow-up periods, though, is that violations of the positivity condition are common, particularly random violations due to data sparsity.^{4}^{,}^{6}

In the time-varying setting, positivity requires a nonzero probability of following a given treatment regime (conditional on those variables necessary to achieve exchangeability) across all follow-up times. For example, when estimating the per-protocol effect in EAGeR, we have to assume that the cumulative probability of remaining compliant with assigned treatment (conditional on the baseline and time-varying confounders) in each successive week of follow-up is bounded away from zero (and one). Specifically, we define the positivity condition for the per-protocol effect as:

$$\prod_{s=1}^{t} \Pr\left(A_s = 1 \mid \bar{A}_{s-1} = 1, H_s = h_s\right) > 0$$

for all histories $h_s$ that have nonzero probability under the regime “always comply.”

When we estimated these probabilities using logistic regression, we saw that they approached zero as follow-up accrued, indicating a random positivity violation (Figure 1B).^{6}
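To see why long follow-up erodes positivity, note that even a high per-week compliance probability compounds multiplicatively. A minimal sketch with a hypothetical constant weekly probability (not an estimate from the trial):

```python
# hypothetical: 90% probability of complying in each week, 26 weeks of follow-up
weekly_prob = 0.9
weeks = 26

# cumulative probability of having always complied through each week
cumulative = [weekly_prob ** t for t in range(1, weeks + 1)]

# by week 26 the cumulative probability is about 0.065, close to a
# practical (random) positivity violation
print(round(cumulative[-1], 3))
```

The same arithmetic explains why the problem worsens the longer participants are followed, even when weekly compliance stays high.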

When positivity is violated, several solutions can be pursued.^{6}^{,}^{14} If the violations are random, one can use parametric models to smooth over the sparse data, at the cost of strong model specification assumptions. This was the solution pursued by the original EAGeR per-protocol analysis.^{9} Another solution, which works regardless of whether the violations are random or structural, is changing the target estimand to one that will not be affected by the positivity violation (e.g., by estimating the average causal effect in the subset of participants with exposure opportunity) or one that does not require the positivity condition.^{6} Incremental effects are one example of the latter type of causal estimand.^{2–4}

Previous articles on the estimation of incremental effects have demonstrated why this approach does not require positivity for identification.^{3}^{,}^{4} The core idea is that, for individuals and times with $\pi_t(H_t) = 0$, the shifted propensity score $q_t(H_t;\delta)$ is also 0 (and, likewise, $\pi_t(H_t) = 1$ implies $q_t(H_t;\delta) = 1$); the incremental intervention leaves such individuals at their observed exposure propensity and thus never requires an exposure that had zero probability of occurring.^{2}

When controlling for right censoring as we do here, we must also meet the positivity condition for drop out, which is often more likely to hold relative to positivity assumptions for time-dependent treatments.^{4} Finally, it is critically important to note that we must still meet the exchangeability (by controlling for confounding and selection bias due to informative drop out) and consistency conditions to identify the incremental effect.

### Estimating Incremental Effects

To estimate incremental effects, we can use the following steps:^{4}

1. *Sample splitting*. Split the full sample into $k$ sample splits (here, $k = 2$).^{5}^{,}^{15–17} For a given split $k$, define a testing data set (including all individuals selected into split $k$) and a training data set (including all individuals not selected into split $k$). Sample splitting not only allows us to avoid any restrictions on the complexity of the nuisance estimators, so that we can use arbitrarily complex modern machine learning methods, but also makes our algorithm easily parallelizable.^{2}

2. *Estimate nuisance parameters*. Regress the exposure on historical variables ($A_t \sim H_t$) and the indicator for not dropping out on the exposure and historical variables ($D_t \sim A_t + H_t$) within the training data, and use the output from these models to predict $\widehat{\pi}_t$ and $\widehat{\omega}_t$ in the full sample. Then, use these predicted values and the observed exposure to build cumulative weights:

   $$W_t = \prod_{s=1}^{t} \frac{\delta A_s + 1 - A_s}{\delta \widehat{\pi}_s + 1 - \widehat{\pi}_s} \times \frac{1}{\widehat{\omega}_s}$$

3. *Estimate the pseudo-regression functions*. Starting at the last time point, let $M_{t+1} = Y_t$ and regress $M_{t+1}$ on exposure and historical variables ($M_{t+1} \sim A_t + H_t$) within the training data, and use the model output to predict the outcomes $m_t(H_t, a_t)$ for $a_t = \{0, 1\}$ among the individuals who have not dropped out. Then use these predicted outcomes, $A_t$, $\widehat{\pi}_t$, $D_t$, $\widehat{\omega}_t$, and $\delta$ to compute the pseudo-outcome at the previous time point, $M_t$. Repeat the process of regression in the training set, prediction in the full sample, and computing $M_s$, for $s = t-1, \ldots, 1$.

4. *Estimate risk*. Within the testing data set, combine the results from the steps above to estimate the risk of the outcome for this sample split:

   $$\widehat{\psi}_{t,k}(\delta) = \frac{1}{n_k} \sum_{i \in k} \widehat{\varphi}_i(\delta)$$

   where $\widehat{\varphi}_i(\delta)$ is the estimated influence function value for individual $i$, built from the cumulative weights, predicted outcomes, and pseudo-outcomes above.^{4}

We then repeat these steps for the other sample splits, and the overall estimated risk of the outcome is the average of the estimates obtained within each sample split:

$$\widehat{\psi}_t(\delta) = \frac{1}{k} \sum_{j=1}^{k} \widehat{\psi}_{t,j}(\delta)$$

We should note that all steps are only carried out among those who were “observable” at a given time $t$ (i.e., those who had not yet experienced the outcome or dropped out by $t$). Because of sample splitting and the double-robust form of the estimator, we can use flexible, data-adaptive regression methods such as SuperLearner^{18} even when our regression problems are high-dimensional. To obtain 95% confidence intervals (CI), we can estimate the variance of $\widehat{\psi}_t(\delta)$ using the sample variance of the estimated influence function values.^{4} When comparing risks under different values of $\delta$ (e.g., via risk differences), we can likewise base the variance on the estimated influence functions.^{19}
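The weighting portion of the estimator can be illustrated on simulated data. The sketch below is a deliberate simplification of the full double-robust procedure (our assumptions: known propensity scores, no censoring, no sample splitting), so it reduces to an inverse-probability-style estimate of risk under the shifted regime:

```python
import numpy as np

rng = np.random.default_rng(2023)
n, T, delta = 5000, 3, 2.0

# simulate weekly compliance from known propensity scores (illustration only)
pi = rng.uniform(0.3, 0.8, size=(n, T))
A = rng.binomial(1, pi)
Y = rng.binomial(1, 0.6, size=n)  # outcome unrelated to exposure, for simplicity

# cumulative incremental weight: product over weeks of
# (delta*A_t + 1 - A_t) / (delta*pi_t + 1 - pi_t)
W = np.prod((delta * A + 1 - A) / (delta * pi + 1 - pi), axis=1)

# weighted mean estimates risk under the shifted propensity scores;
# the weights average to 1 in expectation
risk_shifted = np.mean(W * Y)
```

In the actual analysis these weights are combined with the pseudo-regression functions in the double-robust estimator, and the censoring term $1/\widehat{\omega}_s$ enters the product for each week a woman remains in the study.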

### Application to EAGeR

As mentioned above, our outcome of interest in the EAGeR application was the incidence (risk) of pregnancy by 26 weeks of follow-up. Our exposure of interest was compliance with aspirin. We examined values of $\delta$ ranging from 0.3 to 3.0, and we adjusted for the same baseline and time-varying covariates as the previous per-protocol analysis.^{9} We ran all regressions using the SuperLearner R package to combine generalized linear models, random forest, and k-nearest neighbors (using default hyperparameters).^{18}
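Our analysis used the SuperLearner R package; for readers working in Python, a rough analogue of such a stacked ensemble (our assumption: scikit-learn's `StackingClassifier` with hypothetical learners standing in for the trial's actual specification, on synthetic data) might look like:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                     # stand-in covariate history H_t
y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # stand-in exposure A_t

# ensemble of GLM, random forest, and k-nearest neighbors, combined by a
# cross-validated meta-learner (analogous in spirit to SuperLearner)
ensemble = StackingClassifier(
    estimators=[
        ("glm", LogisticRegression()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
ensemble.fit(X, y)
pi_hat = ensemble.predict_proba(X)[:, 1]  # estimated propensity scores
```

The stacking approach differs from SuperLearner in details (e.g., the meta-learner and loss function), but both combine candidate learners via cross-validation rather than committing to a single parametric model.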

Because our motivating example was the per-protocol analysis of the EAGeR trial, we repeated the above steps to estimate the incremental effect for an exposure defined as “complied with placebo” (i.e., compliance among women assigned to placebo). This allowed us to compare shifting the probability of complying with aspirin against shifting the probability of complying with placebo.

We carried out all analyses using R version 4.1.0 (The R Foundation, Vienna, Austria); code can be found on GitHub (https://github.com/jerudolph13/inc_effect_eager).

## RESULTS

As illustrated in Figure 1A, the overall proportion of EAGeR participants who complied with assigned treatment in a given week dropped consistently across follow-up, from a high of 96% in week 1 to a low of 45% in week 26. Compliance was relatively similar for each treatment arm, although there were weeks in which the proportion of women who complied with aspirin was notably lower than the proportion of women who complied with placebo. For example, in week 23, 69% of women assigned to placebo complied, compared with 59% of women assigned to aspirin (a difference in proportions of 10%). By 26 weeks of follow-up, 773 women became pregnant, with 403 and 370 of those pregnancies occurring among women assigned to aspirin and placebo, respectively. The observed incidence of pregnancy by 26 weeks (not controlling for informative censoring) was 67% among all participants, 70% among those assigned to aspirin, and 64% among those assigned to placebo.

We summarized how the incidence of pregnancy by 26 weeks changed as we shifted women’s probability of complying with aspirin in Figure 2A. When we increased the odds of complying ($\delta > 1$), the incidence of pregnancy increased; for example, at $\delta = 2.0$, the estimated risk was 83% (95% CI = 79%, 87%), compared with 77% (95% CI = 74%, 80%) at $\delta = 1.0$ (i.e., under the observed compliance), for a risk difference of 6.4% (95% CI = 3.8%, 9.0%) (Table). When we decreased the odds of complying ($\delta < 1$), the incidence of pregnancy changed little.

Table. Risk (%) of pregnancy by 26 weeks and risk difference (RD; %, relative to $\delta = 1.0$) under shifted probabilities of compliance with aspirin and with placebo

| $\delta$ | Aspirin: Risk | 95% CI | Aspirin: RD | 95% CI | Placebo: Risk | 95% CI | Placebo: RD | 95% CI | Aspirin vs. placebo: RD | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 78 | 71, 85 | 1.4 | −4.4, 7.1 | 81 | 76, 86 | 4.4 | 0.67, 8.1 | −3.0 | −10, 4.1 |
| 0.5 | 76 | 72, 81 | −0.29 | −3.2, 2.6 | 78 | 74, 83 | 1.7 | −0.75, 4.2 | −2.0 | −6.2, 2.1 |
| 0.7 | 76 | 72, 79 | −1.0 | −2.5, 0.42 | 77 | 73, 80 | −0.07 | −1.4, 1.2 | −0.93 | −3.3, 1.4 |
| 1.0 | 77 | 74, 80 | Ref. | Ref. | 77 | 74, 80 | Ref. | Ref. | 0.01 | −0.62, 0.64 |
| 1.5 | 80 | 77, 83 | 3.2 | 1.7, 4.8 | 80 | 77, 84 | 3.6 | 2.1, 5.1 | −0.34 | −2.4, 1.7 |
| 2.0 | 83 | 79, 87 | 6.4 | 3.8, 9.0 | 85 | 81, 89 | 8.0 | 5.4, 11 | −1.5 | −5.2, 2.1 |
| 2.5 | 86 | 82, 91 | 9.4 | 6.0, 13 | 89 | 84, 93 | 12 | 8.7, 15 | −2.6 | −7.4, 2.2 |
| 3.0 | 89 | 84, 93 | 12 | 7.8, 16 | 92 | 87, 97 | 16 | 12, 20 | −3.7 | −9.4, 2.0 |

We saw a similar pattern in our results when we shifted women’s probability of complying with placebo, although the increase in pregnancy incidence at the largest increments was somewhat greater than under compliance with aspirin (e.g., at $\delta = 3.0$: RD = 16%; 95% CI = 12%, 20% for placebo vs. RD = 12%; 95% CI = 7.8%, 16% for aspirin). Accordingly, the risk differences directly comparing compliance with aspirin against compliance with placebo were near null at every value of $\delta$ (Table).

## DISCUSSION

In this study, we estimated longitudinal incremental effects in the EAGeR trial to assess how an intervention to shift probabilities of complying with aspirin and complying with placebo impacted the incidence of pregnancy by 26 weeks of follow-up. In doing so, we sought to provide information that would answer a similar question as a standard per-protocol analysis but in a manner that would not be vulnerable to the random nonpositivity we observed in our data. We estimated that the incidence of pregnancy steadily increased as we increased women’s probability of complying and changed little if we decreased the probability of complying; however, the results were nearly identical regardless of whether women were complying with aspirin or placebo.

The similarity in results seen for the aspirin and placebo exposures seems to indicate that aspirin (even when taken regularly) has little estimated impact on the incidence of pregnancy by 26 weeks in the EAGeR sample. Both the original intention-to-treat and per-protocol analyses reported small increases in incidence of pregnancy for the aspirin arm relative to placebo arm.^{8}^{,}^{9} Potential reasons for the differing results include that here we simply targeted a different estimand than the original per-protocol analysis and that we used flexible machine learning methods, rather than parametric models. Nonetheless, we saw that incidence of pregnancy steadily increased as we increased women’s probability of complying with either treatment. This finding suggests that the act of complying with and staying involved in the EAGeR trial mattered more for pregnancy incidence than the treatment being taken. The results seen here for both treatment arms demonstrate why the per-protocol effect usually has as its comparator “always complied with placebo.” The goal is to isolate the effect of the drug’s active ingredient. Having a comparator group with the same, perfect level of compliance as the active treatment arm controls for the impact of behaviors related to complying with the trial protocol.^{10}

Thinking beyond this particular application in EAGeR, there are several important advantages to estimating incremental effects in epidemiologic analyses—many of which we have already mentioned. First, the intervention proposed by incremental effects is interpretable and realistic. The average causal effect imagines one could intervene to make everyone in a given sample exposed or unexposed across all time points, which in some contexts would be infeasible or impossible. In contrast, the intervention implied by incremental effects instead simply increases or decreases everyone’s probability of being exposed, which mirrors the sort of impact one could achieve via many public health interventions. Second, this approach does not require the positivity assumption to interpret the risk obtained in observed data as the counterfactual risk of the outcome that would be seen if we intervened to shift everyone’s probability of being exposed. This can make estimation of incremental effects an attractive method to use in analyses where nonpositivity due to either structural or random violations is likely.^{6} Third, incremental effects can be estimated using a double-robust approach implemented with machine learning algorithms, which means it can be less vulnerable to statistical model misspecification bias than other approaches that rely on parametric models (e.g., g-computation, inverse probability weighting, or even double robust estimation of marginal structural models).^{5}

There are, however, some caveats in the use of the incremental effect as a causal estimand. In particular, the incremental effect will not answer every research question. Causal estimands and the estimators used to target them should only be chosen if they appropriately answer the scientific question of interest. In particular, incremental effects capture natural increases and decreases in treatment propensity relative to the observational setting; if personalized optimal treatment regimens are of interest, or if treatment assignments really can be finely controlled (e.g., if all in a population can feasibly be treated), then incremental effects may not be the most useful estimand. Furthermore, while we do not need to meet the positivity condition to identify incremental effects, exchangeability and consistency conditions must still be met to interpret one’s estimate as causal. As in any other analysis, these assumptions cannot be tested in the data and must be justified based solely on background knowledge. Finally, if everyone in the population has zero or one probability of treatment, then no method—even estimation of incremental effects—will be able to estimate a meaningful contrast without extrapolation. Our results could also be sensitive to the number of sample splits used. However, this limitation is not unique to our analysis but may be mitigated in larger sample sizes or by making assumptions about the empirical process conditions, which would impose strong restrictions on the complexity of our nuisance estimators.^{17}

Despite these limitations, the estimation of incremental effects is a novel approach that provides results that are both highly interpretable and robust. The method described here is thus likely to be attractive for many analyses of epidemiologic studies.

## REFERENCES

**Keywords:**

Aspirin; Causal inference; Incremental effect; Positivity; Pregnancy