Couples are heterogeneous in their fertility. About 30% of those discontinuing effective contraception conceive in the first menstrual cycle, about 20% of those remaining conceive in the next, and a smaller fraction of those remaining conceive in the third.^{1} That rapid decline in conception rates is not a time effect (we are all aging, but not that fast) but reflects selective removal of the relatively fertile couples, who conceive quickly and are no longer among the couples still trying. We would like to know what exposures—other than age and the timing and the frequency of sexual intercourse—underlie that heterogeneity.

Time-to-pregnancy (TTP) studies offer a convenient way to assess the effects of exposures on fertility^{2} but must be analyzed in a way that can both account for and fully exploit that heterogeneity. We here consider four designs that are used for TTP studies and reconsider a previously proposed analytic method based on modeling the heterogeneity in fecundability via a beta distribution.^{3} Although more informative models have been developed for designs that also collect detailed data on ovulation and intercourse,^{4–7} we here only consider TTP studies that do not collect that somewhat intrusive level of detail.

The article is organized as follows. We shall assume that each enrolled couple participates just once, and that the exposure to be studied is fixed during the attempt time to be analyzed and does not have an effect that changes during that interval. In the first section, we review the background of designs and analytic approaches for TTP studies. We then describe how to apply an inverse-link model for an exposure that may influence fecundability (the cycle-specific probability of conception). The ratio of mean fecundabilities serves as a convenient parameter for summarizing the effect of a dichotomous exposure. We derive an expression for the standard error of that ratio, with adjustment for covariates. We will describe the simulation scenarios we use to assess findings under three approaches to analysis and then provide results from simulations. We conclude with some extensions and cautions.

## BACKGROUND

Four different designs are used to study time to pregnancy. With the retrospective design, women carrying an intended pregnancy are enrolled and asked how long it took to conceive, based on when they discontinued contraception, as was done in a Norwegian study (MoBA).^{8} A logistically more challenging approach, the “incident cohort” design, enrolls women when they are just about to discontinue contraception and studies them prospectively.^{9}^{,}^{10} Alternatively, in a “prevalent cohort” design^{11}^{,}^{12} women who have recently discontinued contraception are also eligible and are recruited and then followed, with entry into the cohort treated as the left-censoring event that initiates follow up. A fourth design, described by Weinberg and Gladen^{3} and further developed and implemented by Keiding and others,^{13–15} is a cross-sectional study where women are randomly surveyed once and those currently not contracepting are asked to recall how long ago they discontinued contraception, that is, their “ongoing attempt” time. This approach, also known as the “current-duration” design, does not under-represent infertile couples (as does the retrospective approach), and if fecundabilities are beta distributed, the inverse-link model we will describe can be applied to current-duration data, where the cycle at which the couple is sampled is treated as the event-time outcome, but with different beta parameters.^{3}

A random sampling of couples and accurate reconstruction of attempt times present challenges for all four designs.^{11}^{,}^{15}^{,}^{16} To improve data quality, analyses usually censor follow-up at some maximum number of cycles. Couples who have been trying for more than a year can probably tell you that fact reliably, but many may have sought medical help once they were clinically considered infertile. Consequently, follow-up typically censors at 12 (or fewer) cycles.

Prevalent cohort approaches often further restrict prior attempt time, to increase the accuracy of recall. Time to Conceive^{11} only permitted up to three previous attempt cycles, and PRESTO^{16} permitted up to six. Such studies are analyzed as prospective time to pregnancy studies, with left censoring at the cycle-time of recruitment^{12}^{,}^{17} and then right-censored again at some maximum, often cycle 12.

Several valid analytic approaches have been developed for TTP studies. The unit of time is unavoidably discrete, as each menstrual cycle provides a single opportunity for conception. Although more complex models have been proposed for broader or more detailed applications,^{18–20} the simplest approaches for TTP, which can be fit using standard software like SAS, allow for heterogeneity across couples in their probability of conception per menstrual cycle by including a baseline probability of conception for each cycle number, which can freely decline over cycle time.^{21}^{,}^{22} Outcomes are simply Bernoulli 0/1 for each couple-cycle; hence, because a mixture of Bernoullis is again Bernoulli, the number of conceptions in each cycle is binomially distributed within each exposure group. Fecundability varies both across couples and across cycles within a couple, but all available analytic approaches presume within-couple cycle-to-cycle independence. To assess the effects of exposures, a generalized linear model can be fitted with either a log or a logit “link,” with conception as the outcome. The log-additive formulation, also known as the proportional probabilities model, presumes a constant (across cycle time) fecundability ratio (FR),^{21} while the logit-additive formulation presumes a constant fecundability odds ratio (FOR). Both are discrete-time Cox models. Both presume heterogeneous fecundability, and they constrain the moments of the exposure-specific fecundability distributions via the FR and the FOR parameters. The approach recommended by a recently updated textbook on epidemiology in the chapter by Weinberg et al.^{21} is the proportional probabilities (log link) model. There are, however, sometimes convergence issues with that model, and the logit link can be used instead. It is impossible for both the FR and the FOR to be constant across attempt time, as we show algebraically in eAppendix A (https://links.lww.com/EDE/B992), so the choice between the two models matters. Our simulations will further explore the consequences of getting that choice wrong.
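A small numeric illustration may make that incompatibility concrete. The sketch below is not taken from the eAppendix; the baseline conception rates and the FR of 0.7 are made-up values chosen only for illustration. It holds the FR fixed while the baseline rate declines and computes the FOR implied at each cycle.

```python
# Hypothetical declining baseline (unexposed) conception rates, cycles 1-4.
baseline = [0.30, 0.20, 0.15, 0.10]
FR = 0.7  # fecundability ratio held constant across cycles (assumed value)


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Fecundability odds ratio implied at each cycle when the FR is constant.
implied_for = [odds(FR * p0) / odds(p0) for p0 in baseline]

for cycle, value in enumerate(implied_for, start=1):
    print(f"cycle {cycle}: FR = {FR:.2f}, implied FOR = {value:.3f}")
```

In this example the implied FOR drifts toward the FR as the baseline rate falls, so it cannot stay constant while the baseline rate is declining.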

## THE BETA MODEL

A third approach, which is a parametric model originally proposed by Weinberg and Gladen in 1986,^{3} assumes couple-specific fecundability is beta distributed in the population. If the mean parameter is μ and the shape parameter is θ, the density function for fecundability, p, is B[μ/θ, (1 − μ)/θ]^{(−1)} p^{(μ−θ)/θ} (1 − p)^{(1−μ−θ)/θ}. After j cycles without conception, the fecundabilities for couples still trying are reduced but remain beta distributed, with mean and shape parameters μ/(1 + jθ) and θ/(1 + jθ).
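The updating rule for couples still trying can be checked numerically. The following is a minimal sketch, assuming the standard beta parameters α = μ/θ and β = (1 − μ)/θ; the function name and the values of μ and θ are illustrative, not from the article.

```python
def beta_after_failures(mu, theta, j):
    """Mean and shape of the fecundability distribution among couples
    still trying after j unsuccessful cycles."""
    alpha = mu / theta            # standard beta parameters
    beta = (1.0 - mu) / theta
    beta_j = beta + j             # each failed cycle adds 1 to the second parameter
    mean_j = alpha / (alpha + beta_j)
    shape_j = 1.0 / (alpha + beta_j)
    return mean_j, shape_j


mu, theta = 0.30, 0.20  # illustrative values only
for j in range(4):
    mean_j, shape_j = beta_after_failures(mu, theta, j)
    # Compare with the closed forms given in the text.
    assert abs(mean_j - mu / (1 + j * theta)) < 1e-12
    assert abs(shape_j - theta / (1 + j * theta)) < 1e-12
    print(f"after {j} failed cycles: mean = {mean_j:.4f}, shape = {shape_j:.4f}")
```

The declining means reflect selective removal of the more fertile couples, not any change within a couple.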

The beta parameters underlying TTP can be estimated^{3} by applying a generalized linear model, where the specified “link” is the inverse function, f(x) = 1/x (in SAS or Stata this is a “power” link with exponent −1). The base linear predictor is simply c + d(j − 1), where c > 1, d > 0 and j is the ongoing cycle number of the attempt, with 1 being the first cycle off contraception. Under that model, 1/Pr(conception | still trying at cycle j) = c + d(j − 1). Based on this model, the population mean fecundability μ is 1/c (the probability of conception at cycle 1) and the beta shape parameter θ is d/c. The shape parameter governs the decline in the conception rate over cycle time. (For those preferring the α and β parameterization for the beta, α = 1/d and β = (c − 1)/d.)
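The mapping between the inverse-link coefficients and the beta parameters is simple enough to sketch in a few lines. The function names and the coefficient values below are illustrative assumptions, not estimates from the article.

```python
def beta_from_inverse_link(c, d):
    """Convert inverse-link coefficients (c, d) to beta-distribution parameters."""
    if not (c > 1 and d > 0):
        raise ValueError("the model requires c > 1 and d > 0")
    return {
        "mu": 1.0 / c,          # mean fecundability = Pr(conception in cycle 1)
        "theta": d / c,         # shape parameter
        "alpha": 1.0 / d,       # standard beta parameters
        "beta": (c - 1.0) / d,
    }


def conception_probability(c, d, j):
    """Pr(conception in cycle j | still trying at cycle j)."""
    return 1.0 / (c + d * (j - 1))


c, d = 2.5, 0.3  # illustrative coefficients only
params = beta_from_inverse_link(c, d)
print(params)
for j in (1, 2, 3, 6, 12):
    print(j, round(conception_probability(c, d, j), 4))
```

The two parameterizations agree: α/(α + β) reproduces the mean 1/c.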

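Consider a dichotomous exposure E, coded as 0/1. To now include possible effects of E, the conception rate for cycle j is modeled as:

1/Pr(conception | still trying at cycle j, E) = c_{0} + d_{0}(j − 1) + E[(c_{1} − c_{0}) + (d_{1} − d_{0})(j − 1)],

so that the intercept and slope are (c_{0}, d_{0}) for the unexposed and (c_{1}, d_{1}) for the exposed. Thus, exposed and unexposed couples have two different beta distributions. Let RMF denote the ratio of mean fecundabilities, exposed relative to unexposed; under this model, RMF = (1/c_{1})/(1/c_{0}) = c_{0}/c_{1}.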

When potential confounders need to be considered, one can summarize the effect of an exposure on fertility by first recoding continuous covariates by centering them at their mean (or median). The adjusted RMF (still

Now consider an ongoing-attempt time study. If the fecundability distribution is beta, then the distribution of times to pregnancy is beta-geometric. After taking length-biased sampling into account, the observed ongoing-attempt times (measured as the number of menstrual cycles since contraception was discontinued) are also distributed beta-geometric,^{3} with parameters algebraically related to the fecundability parameters μ and θ for the source population. For a random sample of ongoing-attempt times, the beta mean is the probability of being sampled during cycle 1, and is (μ − θ)/(1 − θ) and the shape parameter is θ/(1 − θ). Thus, the inverse-link generalized linear model can also be applied to an ongoing-attempt design, except that the event time being modeled is now the cycle number at enrollment in the survey. If c_{0}, d_{0}, c_{1}, and d_{1} correspond to the beta parameters for fecundability for the unexposed and the exposed, respectively, then with a model for the ongoing-attempt time where one again computes the ratio of the corresponding estimated means, one will instead be estimating the ratio

(c_{0} − d_{0})(1 − d_{1})/[(1 − d_{0})(c_{1} − d_{1})]. One can show (eAppendix C; https://links.lww.com/EDE/B992) that this is now algebraically a meaningful fecundability parameter: the average time to pregnancy for the unexposed divided by that for the exposed.
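That identity can be checked numerically. The sketch below is not the derivation in eAppendix C; it assumes the beta-geometric result that the mean TTP equals (c − d)/(1 − d), which follows from E[1/p] = (α + β − 1)/(α − 1) and is finite only when d < 1. The coefficient values and function names are illustrative.

```python
def mean_ttp_closed_form(c, d):
    """Mean TTP under the beta model; finite only when d < 1."""
    if not (c > 1 and 0 < d < 1):
        raise ValueError("need c > 1 and 0 < d < 1 for a finite mean TTP")
    return (c - d) / (1.0 - d)


def mean_ttp_by_summation(c, d, max_cycles=200_000):
    """Approximate E[TTP] as the sum over j >= 0 of Pr(TTP > j)."""
    total, still_trying = 0.0, 1.0
    for j in range(1, max_cycles + 1):
        total += still_trying                        # adds Pr(TTP > j - 1)
        still_trying *= 1.0 - 1.0 / (c + d * (j - 1))
    return total


def ongoing_attempt_ratio(c0, d0, c1, d1):
    """Ratio estimated from ongoing-attempt data, as given in the text."""
    return (c0 - d0) * (1 - d1) / ((1 - d0) * (c1 - d1))


c0, d0, c1, d1 = 2.5, 0.3, 3.5, 0.4  # illustrative values only
ratio = ongoing_attempt_ratio(c0, d0, c1, d1)
ttp_ratio = mean_ttp_closed_form(c0, d0) / mean_ttp_closed_form(c1, d1)
print(f"ratio from ongoing-attempt parameters: {ratio:.4f}")
print(f"mean TTP unexposed / mean TTP exposed: {ttp_ratio:.4f}")
```

The brute-force summation is included only as an independent check on the closed form.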

In an actual study, some fraction of the population recruited will be sterile, that is, having 0 fecundability (and infinite TTP), and the inclusion of those couples would bias parameter estimation. The retrospective design excludes such couples by construction. For the other three designs, a large study of Europeans estimated that only about 1% were sterile, so this would not be a large source of error.^{23} Methods of analysis to account for sterility have been published.^{3} In practice, for a prospective study one could do a sensitivity analysis by removing from the non-conceivers a number of couples equal to 1%–2% of the original sample and refitting the model.

As a data example, we reconsider the retrospective TTP data shown in the Table and originally provided by Baird and Wilcox.^{24} Comparing smokers with nonsmokers using the beta model, the estimated RMF is 2.45/3.63 = 0.67, that is, a 33% reduction in mean fecundability associated with smoking. The corresponding population-based FOR, the odds ratio computed from the same two mean fecundabilities, would be (2.45 − 1)/(3.63 − 1) = 0.55. The goodness-of-fit test for the beta model gave *P* = 0.15. There seems to be some digit preference in these retrospective data, with TTP of 12 being favored, presumably based on “it took me about a year.” When we instead truncated at 11 cycles, the *P*-value for the beta fit became 0.26 (χ^{2} of 21.3, on 22 − 4 = 18 df). The digit preference in these retrospective data has been pointed out by Ridout and Morgan,^{25} who then provided a way to adjust for it under the beta model.

Cycle | Smokers | Non-smokers |
---|---|---|
1 | 29 | 198 |
2 | 16 | 107 |
3 | 17 | 55 |
4 | 4 | 38 |
5 | 3 | 18 |
6 | 9 | 22 |
7 | 4 | 7 |
8 | 5 | 9 |
9 | 1 | 5 |
10 | 1 | 3 |
11 | 1 | 6 |
12 | 3 | 6 |
>12 | 7 | 12 |
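To show how the inverse-link model might be fit to data like those in the Table, here is a minimal sketch using Fisher scoring in plain Python. It is not the authors' code. It assumes the table entries are numbers of couples conceiving in each cycle, fits each smoking group separately, and censors after cycle 12; because details such as the handling of censoring may differ from the published analysis, the estimates need not match the reported values of 2.45 and 3.63 exactly. All function names are hypothetical.

```python
# Conceptions in cycles 1-12, and couples not pregnant after 12 cycles,
# transcribed from the Table.
DATA = {
    "smokers": ([29, 16, 17, 4, 3, 9, 4, 5, 1, 1, 1, 3], 7),
    "non-smokers": ([198, 107, 55, 38, 18, 22, 7, 9, 5, 3, 6, 6], 12),
}


def risk_sets(conceptions, not_pregnant):
    """Couples still trying at the start of each cycle."""
    remaining = sum(conceptions) + not_pregnant
    at_risk = []
    for y in conceptions:
        at_risk.append(remaining)
        remaining -= y
    return at_risk


def fit_inverse_link(conceptions, at_risk, start=(2.0, 0.1), tol=1e-10, max_iter=200):
    """Fisher scoring for the binomial model 1/p_j = c + d*(j - 1)."""
    c, d = start
    last = len(conceptions) - 1
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, (y, n) in enumerate(zip(conceptions, at_risk)):  # x = j - 1
            p = 1.0 / (c + d * x)
            score = -(y - n * p) * p / (1.0 - p)  # d(log-likelihood)/d(linear predictor)
            weight = n * p ** 3 / (1.0 - p)       # expected information
            u0 += score
            u1 += score * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        det = i00 * i11 - i01 * i01
        step_c = (i11 * u0 - i01 * u1) / det
        step_d = (i00 * u1 - i01 * u0) / det
        # Halve the step while it would push a fitted probability outside (0, 1).
        while min(c + step_c, c + step_c + (d + step_d) * last) <= 1.0:
            step_c /= 2.0
            step_d /= 2.0
        c += step_c
        d += step_d
        if max(abs(step_c), abs(step_d)) < tol:
            break
    return c, d


fits = {}
for group, (conceptions, not_pregnant) in DATA.items():
    fits[group] = fit_inverse_link(conceptions, risk_sets(conceptions, not_pregnant))
    c_hat, d_hat = fits[group]
    print(f"{group}: c = {c_hat:.2f}, d = {d_hat:.2f}, mean fecundability = {1 / c_hat:.2f}")

rmf = fits["non-smokers"][0] / fits["smokers"][0]  # c0 / c1, smokers as the exposed group
print(f"RMF (smokers relative to non-smokers) = {rmf:.2f}")
```

In practice one would use a generalized linear model routine with a power link of exponent −1, as the text describes; the hand-written scoring loop is used here only to keep the sketch free of third-party dependencies.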

## APPROACHES USED FOR SIMULATIONS

For each of the 1000 simulated populations, we simulated studies that included 300, 600, 1000, or 2000 couples, with 15% of the population assumed to be exposed and 0% assumed to be sterile. (Note that this exposure could affect either the male or the female.) Details are provided in eAppendix D; https://links.lww.com/EDE/B992. In sampling couples for a prevalent cohort study, we allowed for the length bias in the identification of eligible couples and each simulated prevalent cohort study then followed them forward (as in PRESTO) after first ascertaining exposure status and the current duration of trying. Couples destined to require more than 100 cycles were set at TTP = 101, a restriction with little effect on the analysis, which is censored at cycle 12.

To avoid imposing distributional assumptions in generating the simulated data, we empirically assigned cycle-specific unexposed fecundabilities by fitting a spline to observed cycle-specific conception rates taken from a large prospective study.^{26} We then used either a constant FR model or a constant FOR model, treating those smoothed cycle-specific rates as baseline. Next, we applied the three analytic approaches to both sets of simulated data, using standard R software and censoring analyses after cycle 12. For these simulations, the commonly used semi-parametric models (FR and FOR) have an inherent advantage over the inverse-link model, because by construction one or the other of them is correct. We also carried out simulations where neither the FR nor the FOR was constant over time.
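The data-generating step for the constant-FR scenario can be sketched as follows. This is a simplified illustration for an incident cohort only, not the procedure in eAppendix D: the baseline rates are made-up numbers standing in for the spline-smoothed rates, and the FR of 0.7, the seed, and the function names are assumptions.

```python
import random

# Hypothetical cycle-specific conception rates for unexposed couples, cycles 1-12.
BASELINE = [0.30, 0.22, 0.18, 0.15, 0.13, 0.12, 0.11, 0.10, 0.09, 0.09, 0.08, 0.08]


def simulate_ttp(baseline, fr, exposed, rng, max_cycle=12):
    """Return the cycle of conception, or None if censored after max_cycle."""
    for j, p0 in enumerate(baseline[:max_cycle], start=1):
        p = fr * p0 if exposed else p0  # constant-FR model
        if rng.random() < p:
            return j
    return None


def first_cycle_rate(study, exposed):
    """Observed proportion conceiving in cycle 1 for one exposure group."""
    group = [ttp for e, ttp in study if e == exposed]
    return sum(ttp == 1 for ttp in group) / len(group) if group else float("nan")


rng = random.Random(2024)
n_couples, p_exposed, fr = 1000, 0.15, 0.7
study = []
for _ in range(n_couples):
    exposed = rng.random() < p_exposed
    study.append((exposed, simulate_ttp(BASELINE, fr, exposed, rng)))

print("cycle-1 rate, unexposed:", round(first_cycle_rate(study, False), 3))
print("cycle-1 rate, exposed:  ", round(first_cycle_rate(study, True), 3))
```

A constant-FOR scenario would replace the line computing `p` with the corresponding odds-scale calculation.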

We addressed several questions about the method and design performance using simulations based on all four designs. We also assessed the performance of a commonly used test for constancy of the FR or the FOR over cycle time, where one includes an interaction (product) between the exposure and cycle time. The inverse-link approach does not require constancy of either the FR or the FOR, but instead estimates both based on the population mean fecundabilities. We assessed the coverage of nominally 95% confidence intervals for those two parameters under all three analytic approaches. Power for the inverse-link approach was based on a 2 df likelihood ratio test. Power for the log and logit link analyses was based on Wald test Z statistics.

## RESULTS

Figure 1A–L shows confidence interval coverage for all four designs, both when the constancy of the FR or the FOR over cycle time is satisfied and when it is not. For the two prospective designs, the inverse-link model generally achieved nominal coverage, whereas coverage for the log-link and logit-link models fell below 95% under nonconstancy. As expected, coverage tended to fail for the retrospective design, which undersamples infertile couples, but it was more often achieved by the inverse-link model. For the ongoing-attempt design, coverage is meaningful only for the inverse-link model because, with that cross-sectional design, there is no definable parameter estimated by the FR or the FOR approaches. Figure 2 shows that there was, however, a price in power for the improved robustness gained by using the inverse-link approach. The residual mean squared errors were also typically larger for the inverse-link approach (eTable 1; https://links.lww.com/EDE/B992).

If one wishes to use either the FR or the FOR approach, constancy of one or the other (not both) needs to be assumed. The popular Wald test of constancy, based on the coefficient of the product of exposure and time, showed power never exceeding 0.12 (eTable 2; https://links.lww.com/EDE/B992). Thus, that test of constancy is weak. Although FR and FOR are typically quite different, the data evidently can offer little guidance on which, if either, should be used.

## DISCUSSION

TTP studies offer a powerful approach for identifying exposures that can influence human fertility, but they have to be carefully analyzed. Two semi-parametric models that are typically used are based on either the log link or the logit link generalized linear model, and both require their effect parameters to be constant across attempt time. We showed (eAppendix A; https://links.lww.com/EDE/B992) that it is impossible for both the FR and the FOR to be constant across attempt time, so at most one of those two models can be correct. Our simulations showed that the wrong choice of model will invalidate the confidence interval coverage. Unfortunately, our simulations also showed that the goodness-of-fit test usually applied for assessing constancy has almost no power, so the decision about what model to use is important but largely arbitrary. The inverse-link approach we considered in this article enables the estimation of both the FR and the FOR based on population averages, with no constancy assumption required. Its use implies a small loss of power but offers improved generality and validity. Under an ongoing-attempt design, unlike the other generalized linear models, the inverse-link model allows the estimation of a meaningful parameter, namely the average time to pregnancy for the unexposed divided by the average for the exposed. This is particularly remarkable given that under that study design no couple’s actual time to pregnancy is ascertained.

Generalizations can be readily developed. While we have focused on a dichotomous exposure for simplicity, an extension to continuous exposures is straightforward. The possible existence of a sterile subpopulation can also be accounted for in a prospective study (but not in a retrospective study, which enrolls pregnant women and excludes infertile couples by design) by means of the Expectation/Maximization algorithm, provided the software can handle fractional denominators and outcomes.^{27} After each cycle, we still have the beta, but now mixed with a mass at 0, where the relative fraction of sterile increases over time, ultimately dominating what’s left. However, in a large European study, the proportion of sterile was estimated to only be about 1%,^{23} which should have a very small biasing effect on inference. If this issue is of concern, one can carry out a sensitivity analysis where about 1% of the unsuccessful couples are removed from the data and the analysis repeated.

The reduction in power for the inverse-link model was small (Figure 2), and confidence interval coverage was quite reliable. However, there is a price to pay in the precision of estimation. In simulations, the residual mean squared errors were larger for the inverse-link-based estimates (eTable 1; https://links.lww.com/EDE/B992) than for either of the FR or FOR models, provided their corresponding required constancy assumption was satisfied. We also have not (other than the usual right and left censoring) allowed for interval censoring in the assessment of TTP, as that is beyond our current scope.

Time to pregnancy has been used as a biomarker possibly associated with other pregnancy outcomes such as preterm birth as well as long-term outcomes like cardiovascular disease.^{28}^{,}^{29} For such applications, fertility is sometimes summarized by categorizing time to pregnancy. The beta-based inverse-link model offers a convenient alternative. If a couple conceived one planned pregnancy and that was in cycle k, their estimated mean fecundability based on the beta parameters (c, d) for their population would be (1 + d)/(c + kd). Multiple pregnancy history can also be taken into account: If a couple conceived r planned pregnancies in a total of k attempt cycles, their estimated fecundability would be (1 + rd)/(c + kd).
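The couple-level estimate is a one-line calculation. The sketch below uses illustrative values of c and d (not estimates from the article) and a hypothetical function name, and cross-checks the formula against the equivalent beta update (α + r)/(α + β + k).

```python
def estimated_fecundability(c, d, pregnancies, cycles):
    """Estimated mean fecundability for a couple with `pregnancies` planned
    conceptions in `cycles` total attempt cycles, given population (c, d)."""
    if not 0 <= pregnancies <= cycles:
        raise ValueError("need 0 <= pregnancies <= cycles")
    return (1.0 + pregnancies * d) / (c + cycles * d)


c, d = 2.5, 0.3  # illustrative population parameters only
alpha, beta = 1.0 / d, (c - 1.0) / d

for k in (1, 3, 6, 12):
    est = estimated_fecundability(c, d, pregnancies=1, cycles=k)
    # Same quantity from the beta update: one success, k - 1 failures.
    assert abs(est - (alpha + 1) / (alpha + beta + k)) < 1e-12
    print(f"conceived in cycle {k:2d}: estimated fecundability = {est:.3f}")

# Two planned pregnancies in a total of 8 attempt cycles.
print(round(estimated_fecundability(c, d, pregnancies=2, cycles=8), 3))
```

With no pregnancy history (r = 0, k = 0) the formula returns the population mean 1/c, as it should.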

Results for the ongoing-attempt time design showed surprisingly good power (Figure 2). However, both the implementation and the interpretation of such a study can be fraught with issues^{30} related to the quality of data on how long ago couples discontinued contraception, and possibly exposure- and TTP-differential recruitability. Such studies also do not fully account for medical interventions, which often begin after a year. Also, exposures reported after a long delay in conceiving will be subject to reverse causality to the extent that people change their habits out of concern about their fertility.

In summary, for the four designs currently in use, the inverse-link model offers a flexible and powerful approach to assessing the effects of exposure on fertility. It can easily be implemented using standard software, remains valid even if the usual ratio constancy assumptions are violated, and allows the estimation of a meaningful parameter under an ongoing-attempt design.

## ACKNOWLEDGMENTS

We thank Anne Marie Jukic for asking good questions and Olga Basso, Shanshan Zhao, and Donna Baird for their comments on an earlier draft.

**Keywords:** Beta distribution; Beta-geometric; Fecundability; Fertility; Time-to-pregnancy