One of the potential risks of labor induction is failure, defined as unsuccessful induction leading to cesarean rather than vaginal delivery. When compared week by week, the likelihood of cesarean delivery has been reported to be significantly higher when labor is induced than when it is spontaneous.1–3 Caughey et al4 have argued, however, that women laboring during the same week as women induced are not the appropriate comparison group when studying the success of labor induction: induction can be scheduled for a chosen week, whereas spontaneous labor, by definition, happens whenever it happens, not necessarily in the week that labor might be induced. The tenet is that the appropriate comparison group is instead all women delivering after the index week of induction. It has been reported that, in this cumulative light, labor induction overall is no more likely to fail than is spontaneous labor, and actually is less likely to fail in nulliparous women.4
The 2006 study by Caughey et al4 drew on a population at one institution from 1986 to 2001, when the national cesarean delivery rate was much lower than it is now. It is unclear whether the findings apply now or can be generalized. In addition, it is possible that their expectant group may not have been properly defined, because it included only women at higher gestational age than those induced. Because the risk of cesarean delivery increases with increasing gestational age, this potentially biases their results in favor of induction.
The objective of this study was to determine how changing the definition of the group to which induction is being compared changes the association of labor induction and increased cesarean risk. The hypotheses, tested using a regional perinatal database, were that 1) an expectant group that includes women at gestational ages at or beyond that of the women induced will have a lower cesarean delivery risk than an expectant group that starts the week after the induction group’s gestational age, and 2) regardless of the definition of the expectant group, the likelihood of cesarean delivery will be higher in the induction group.
The Statewide Perinatal Data System is a validated electronic birth-certificate database available for analysis through the New York State Department of Health.5 Trained birth-certificate coders enter data on a Web-based application using standardized definitions, largely in check-box format. The Finger Lakes region data set comprises a nine-county region within Upstate New York that includes 13 hospitals (two level III, one level II, and 10 level I), for a total of approximately 14,500 deliveries per year. The data are completely deidentified in accordance with Health Insurance Portability and Accountability Act regulations, and this study used the data set from January 2004 to March 2008. The study was considered exempt by the University of Rochester Research Subjects Review Board.
The database was subset to include only women with singleton, vertex presentations who labored and delivered between 37 0/7 and 42 6/7 weeks (rounded down). Women with scheduled cesarean deliveries, with one or more previous cesarean deliveries, or presenting for artificial rupture of membranes were excluded. Induction was defined as initiation of uterine contractions to promote delivery before spontaneous onset of labor. Induction was by medical means (oxytocin or prostaglandin); spontaneous labor was labor in the absence of pharmacologic or mechanical initiation. Women who presented with contractions and required augmentation were not classified as inductions. Preinduction cervical ripening is common, but neither it nor Bishop score is recorded per national birth-certificate standards; thus, this potentially important information was not available.
Analysis was performed in three ways: cesarean delivery after induction compared with after spontaneous labor by week (week-to-week comparison group), induction at a given gestational age compared with expectant management of all other women after that gestational age (all-above comparison group), and induction at a given gestational age compared with expectant management of all other women at or after that gestational age (at-or-above comparison group). Univariable comparisons were by χ² testing, followed by logistic regression in which induction was entered in the first block, followed by sequential adjusting for possible confounders. Medical confounders (entered in the second block) included nulliparity and high-risk factors (defined as hypertension, preeclampsia, eclampsia, diabetes, or small for gestational age). Although a gestational age at delivery of 41 weeks or greater is associated with increased risk of cesarean delivery, it was not included as a high-risk factor because, by definition, no 41-week–pregnant women would be included in the groups undergoing induction before 41 weeks, but they would be included in all expectant groups, thus biasing against induction. Demographic confounders were entered in backward stepwise fashion in the third block and included maternal age, body mass index (BMI), race, insurance status, employment and educational status, and year of delivery. Year of delivery was included in the regression to control for period effects.
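The three comparison-group definitions can be made concrete with a minimal Python sketch. The record layout (`ga_week`, `induced`, `cesarean`), the function names, and the toy records are assumptions made purely for illustration; they are not the study data or the SPSS analysis, and only an unadjusted odds ratio is computed, with no adjustment for confounders.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: completed gestational week at delivery,
# induced (1) or spontaneous (0), cesarean (1) or vaginal (0).
Record = Dict[str, int]


def split_groups(records: List[Record], index_week: int,
                 definition: str) -> Tuple[List[Record], List[Record]]:
    """Return (induction group, comparison group) for one index week."""
    induced = [r for r in records
               if r["ga_week"] == index_week and r["induced"] == 1]
    if definition == "week-to-week":
        # Spontaneous labor during the index week only.
        comparison = [r for r in records
                      if r["ga_week"] == index_week and r["induced"] == 0]
    elif definition == "all-above":
        # Everyone delivering after the index week, induced or spontaneous.
        comparison = [r for r in records if r["ga_week"] > index_week]
    elif definition == "at-or-above":
        # All-above plus spontaneous labor during the index week.
        comparison = [r for r in records
                      if r["ga_week"] > index_week
                      or (r["ga_week"] == index_week and r["induced"] == 0)]
    else:
        raise ValueError(f"unknown definition: {definition}")
    return induced, comparison


def odds_ratio(exposed: List[Record], unexposed: List[Record]) -> float:
    """Unadjusted odds ratio for cesarean delivery from a 2x2 table."""
    a = sum(r["cesarean"] for r in exposed)    # exposed, cesarean
    b = len(exposed) - a                       # exposed, vaginal
    c = sum(r["cesarean"] for r in unexposed)  # unexposed, cesarean
    d = len(unexposed) - c                     # unexposed, vaginal
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined for this table")
    return (a * d) / (b * c)


def make(week: int, induced: int, n: int, n_cesarean: int) -> List[Record]:
    """Build n toy records, the first n_cesarean of them cesarean."""
    return [{"ga_week": week, "induced": induced,
             "cesarean": 1 if i < n_cesarean else 0} for i in range(n)]


# Toy records, invented for illustration only.
toy = (make(38, 0, 3, 0) + make(39, 1, 4, 2) + make(39, 0, 6, 1)
       + make(40, 1, 2, 1) + make(40, 0, 4, 1))

for name in ("week-to-week", "all-above", "at-or-above"):
    induced_group, comparison_group = split_groups(toy, 39, name)
    print(name, len(comparison_group),
          odds_ratio(induced_group, comparison_group))
```

In the toy records, the same four induced women are compared against three different denominators, which is the only thing that changes between the three analyses; women delivering before the index week are excluded from every comparison group.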
The primary outcome of interest was cesarean delivery. Secondary outcomes included 5-minute Apgar score less than 7 and admission to the neonatal intensive care unit (NICU). Data were analyzed using SPSS 16 for Mac (SPSS Inc., Chicago, IL).
There were 61,705 women in the total data set. Planned exclusions left 40,014. There were missing data (mostly BMI) for 1,867 women, so that the remaining 38,147 women with complete data were available for analysis. Of these, 18,136 (47.5%) were nulliparous and 20,011 (52.5%) were primiparous or multiparous (for simplicity, collectively referred to hereafter as multiparous). Seventy-six percent of the women were Caucasian, 14.6% were African American, and 38.1% of the population used Medicaid as their primary payor. The induction rate was 29.8% (33.8% for nulliparous women and 26.3% for multiparous women), and 19.2% of the women had high-risk factors as defined in the Methods section (22.1% for nulliparous women and 16.6% for multiparous women). The primary cesarean delivery rate in this population was 13.3% (23% for nulliparous women and 4.5% for multiparous women).
Univariable comparisons are shown in Table 1. Although there were small differences in gestational age, birth weight, maternal age, and racial composition between the induction and spontaneous labor groups (all P<.001), they were not great enough to be of obvious clinical importance. Women in the induction group had higher BMIs, more often were nulliparous and had medical risk factors, and were slightly less likely to use Medicaid for insurance.
Table 2 shows the cesarean rates by weeks, total, and parity. Using logistic regression with contrasts (a technique allowing comparison of adjacent levels of a categorical variable, in this case, weeks), the primary cesarean delivery rate for all women in aggregate increased significantly each week from 38 to 39, from 39 to 40, and from 40 to 41 weeks. The decrease from 37 to 38 weeks was not statistically significant, nor was the increase from 41 to 42 weeks. This pattern persisted after controlling for induction of labor, parity, high-risk status, and demographic variables (P<.002).
Table 3 shows the number of women per week and the percentages induced, subset by parity. By week, the percentage of cesarean deliveries after induction was always greater than that after spontaneous labor in nulliparous women, and until 41 weeks in multiparous women. Except for multiparous women at 37 weeks (P=.08), this pattern persisted after controlling for high-risk status and demographic variables (individual significant odds ratios not shown, but ranged from 1.41 to 3.12 depending on week, parity, and degree of adjustment; P<.03).
Comparisons between women induced at a given week and all women followed expectantly from the next week onward (all above, whether eventually induced or presenting in spontaneous labor) are shown in Table 4. A series of logistic regressions was used to calculate unadjusted odds ratios relative to spontaneous labor for the risk of cesarean delivery after induction, including the entire population (all parity) from each week onward in the data set (total). Each odds ratio was recalculated adjusting for parity and high-risk status, followed by further adjustment for demographic factors. To further define the role of parity, the adjustment sequence then was repeated, first only for nulliparous women and then only for multiparous women. With the exception of nulliparous women at 37 weeks, all unadjusted odds ratios were significantly greater than 1.00 at 37 through 39 weeks, although they were not significantly different from 1.00 at 40 and 41 weeks. Adjusting for high-risk factors did not change the pattern except that the odds ratio for multiparous women at 39 weeks no longer was significant. After adjusting for demographics, however, the differences no longer were significant at any gestational age except for a 1.17 odds ratio in the 38-week total group.
For the variation using a comparison group consisting of all women at or above the gestational age of the index induction group, the results are shown in Table 5. The unadjusted odds ratios for cesarean delivery after induction were all significantly elevated relative to after spontaneous labor, with the exception of multiparous women at 41 weeks. Adjusting for high-risk status and demographic variables did not change the patterns except that the difference for nulliparous women at 37 weeks no longer was significant. In conjunction with the data in Table 3, the unadjusted odds ratios in Table 5 translate into between 5 and 10 extra cesarean deliveries per 100 labor inductions, depending on week of gestation (percent cesarean delivery in the index induction week minus percent cesarean delivery in the expectant group). Because the odds ratios decreased by approximately 15% after adjustment, the adjusted attributable effect is estimated as between 4 and 8 extra cesarean deliveries per 100 inductions, ie, approximately 1 to 2 additional cesarean deliveries per 25 inductions as compared with expectant management. Given approximately one million inductions per year in the United States, if 50% are nonurgent, this could translate to as many as 40,000 potentially avoidable cesarean deliveries per year, many of which would become repeat cesarean deliveries with subsequent pregnancies.
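The arithmetic behind the per-25 and national figures can be retraced with a short Python sketch. The function names are hypothetical. The inputs (4 to 8 extra cesarean deliveries per 100 inductions, one million inductions per year, 50% nonurgent) are the approximate values stated in the text. The lower national figure is simply the same calculation applied to the lower bound and is not a result reported in the study.

```python
def per_n_inductions(extra_per_100: float, n: int) -> float:
    """Rescale an excess expressed per 100 inductions to per n inductions."""
    return extra_per_100 * n / 100


def yearly_excess(inductions_per_year: int, nonurgent_fraction: float,
                  extra_per_100: float) -> float:
    """Excess cesarean deliveries per year among nonurgent inductions."""
    return inductions_per_year * nonurgent_fraction * extra_per_100 / 100


# Adjusted attributable effect, per 100 inductions (values from the text).
adjusted_low, adjusted_high = 4, 8

# Excess per 25 inductions at each bound.
print(per_n_inductions(adjusted_low, 25), per_n_inductions(adjusted_high, 25))

# National upper and lower estimates under the stated assumptions.
print(yearly_excess(1_000_000, 0.5, adjusted_high))
print(yearly_excess(1_000_000, 0.5, adjusted_low))
```

The upper bound reproduces the figure of 40,000 per year. The estimate scales linearly with both the assumed nonurgent fraction and the assumed excess per 100 inductions, so it is only as reliable as those two inputs.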
When the same process was performed for 5-minute Apgar scores less than 7 in the aggregate groups (ie, all gestational ages together), there was a significant association between labor induction and low 5-minute Apgar score when unadjusted or when adjusted for parity, high-risk status, and gestational age (P<.03). The association became nonsignificant when adjusted for demographics (P=.12). Using the all-above comparison group, there was no significant association with induction from weeks 37 to 40, although there was a marginally significant (P=.054) decrease in the likelihood of 5-minute Apgar score less than 7 associated with labor induction at 41 weeks compared with 42 weeks. Using the at-or-above comparison group, there were no significant adjusted associations between induction (compared with expectant management) and low 5-minute Apgar score.
In contrast to Apgar score, induction was associated with higher rates of NICU admission compared with spontaneous labor, before and after adjustment for gestational age, parity, high-risk status, and demographic variables (odds ratios 1.46–1.68, P<.001). Using the all-above or the at-or-above comparison group, the pattern persisted until 41 weeks, at which time it became nonsignificant regardless of comparison group.
The degree of cesarean delivery risk associated with labor induction depends on how the control group is defined. The traditional comparison of women induced in a given week and those laboring spontaneously that same week may bias against induction. For those women neither induced nor spontaneously laboring in a given week, some will labor spontaneously and some will be induced in upcoming weeks. Some will deliver vaginally and some will deliver by cesarean delivery. Caughey et al4 reported a lower risk of induction-related cesarean in nulliparous women when defining the expectant group as women delivering the week after the induction group, but this excludes the women who did labor spontaneously the same week as those induced, and thus biases against spontaneous labor. Several retrospective studies of induction based on identification of women at risk of cephalopelvic disproportion or uteroplacental insufficiency reported that greater use of prophylactic inductions was associated with lower cesarean delivery rates,6,7 but in a randomized study by the same investigators, the difference was not statistically significant.8
In this retrospective database analysis of term, singleton, vertex women without prior cesarean deliveries, unadjusted cesarean delivery rates were higher after medically induced labor than after spontaneous labor, regardless of the comparison group (same week, or either of two variations of expectant management). The association was strongest comparing spontaneous to induced labors in a given week, and persisted even with multiple adjustments for possible confounding factors when the comparison group was redefined to include all remaining women not induced during the index week (the at-or-above group). When the comparison group was defined as expectantly managed women beginning the week after the induction group (all-above group, the comparison group in the study of Caughey et al), the unadjusted odds ratios for cesarean delivery associated with induction were significantly above 1.00 at 37 to 39 weeks, although in this group the association was diminished and eventually became nonsignificant at all weeks with sequential adjustment for possible confounding factors. Induction never was significantly protective. Overall, the association was strongest for nulliparous women and when risk factors were absent, but also was observed in multiparous women between 37 and 40 weeks in the week-to-week and at-or-above groups.
These findings differ from those of Caughey et al,4 possibly reflecting a different time period (2004 to early 2008 instead of 1986 to 2001), a shorter study duration (4 years instead of 15 years), a larger population (38,147 instead of 19,377), multiple hospitals (13 instead of 1), and a different racial composition. In that sense, the findings of the present study may be more generalizable to current obstetric practice. The findings do not invalidate the concept of using expectant management as the appropriate comparison group when examining outcomes of induced labor rather than using spontaneous labors occurring the same week as the index induction. This reasoning has been used convincingly in studies on perinatal mortality by gestational age, in which the optimal method of comparing stillbirth rates may not be week-to-week rates, but the cumulative rates after each week.9 Although comparisons of cumulative perinatal mortality would, by definition, have to begin the week after the index week, this logic would not necessarily apply to inductions, because some women will begin labor spontaneously during the index induction week, and thus that week should be included in the expectant comparison group.
Regarding neonatal outcomes in the current study, the data in the birth certificates are limited, but there were no consistent associations between induction and low 5-minute Apgar scores in either of the two expectant groups, although after induction there was an increased rate of NICU admission. The reasons for this latter finding are unclear, and it persisted after adjusting for presence of the most common maternal risk factors (hypertension, diabetes, and small for gestational age). A disadvantage of using the live birth–certificate database is that fetal deaths are not accessible; both of the expectant groups would be expected to have slightly higher numbers of stillbirths compared with the induction group.
Strengths of this study include the large size, use of trained coders using standardized definitions, consistency of results across various definitions of comparison groups, and potential generalizability to broader populations and current practices. Weaknesses include the retrospective nature of birth-certificate studies, possible errors in coding, limited ability to assess and control for all confounding factors, application only to women with vertex singletons at term without previous cesarean deliveries, possible loss of precision from flooring gestational ages, and lack of consistent information about such variables as cervical ripening, Bishop scores, specific reasons for induction, stillbirths, and long-term outcomes. Bishop scoring and cervical ripening are important factors in labor induction. Although specific information was not available about either of these in the birth-certificate database, anecdotal evidence suggests that use of prostaglandins for cervical ripening is widespread in the Finger Lakes region. Because the Finger Lakes regional rates of labor induction and cesarean delivery closely follow national rates (23.6% compared with 22.5% induction, 31.0% compared with 31.1% cesarean in 2006),10 it is unlikely that use of cervical ripening is less common in upstate New York than elsewhere in the country.
Associations found in epidemiologic studies are not always borne out by randomized trials. In large studies, small differences may appear statistically significant when they are clinically unimportant. Most odds ratios in this study were less than 2.0, and thus the absolute differences in cesarean delivery rates between groups were not large and the increase in risk to an individual woman is fairly small. Applied to the yearly number of pregnant women in the United States, however, small differences in risks add up to large numbers of women undergoing cesarean deliveries that may not have been necessary. Given the known risks of cesarean delivery, these numbers are of significant potential consequence. Cesarean delivery, albeit common, is a major surgical procedure and entails risks of hemorrhage, infection, thrombosis, and injury to adjacent organs. As of 2006, more than 90% of women with a previous cesarean delivery will deliver subsequent babies by cesarean delivery,10 so it often is not just a matter of one major surgery if an induction is unsuccessful; it may lead to repeated surgeries, each with attendant risks. It is especially difficult to justify these risks in cases of purely elective induction (ie, for nonmedical indications) in which there is little or no demonstrable medical benefit to outweigh surgical risks should delivery ultimately be by cesarean delivery.
An unexpected finding in this study was the association between induction and cesarean in multiparous women. The association generally has been found to be stronger for first deliveries, although less frequently it has been reported in multiparous women as well. In a previous study of low-risk women in the Finger Lakes region of New York in 1998–1999, elective induction was associated with increased risk of cesarean delivery in nulliparous women but not in multiparous women.1 The current study included high-risk women (although with adjustment for high-risk factors) and was during a later period of time in which both the labor induction and cesarean delivery rates were higher locally and nationally. In addition, the threshold for cesarean delivery may have decreased with increasing medical–legal concerns and with the advent of “patient-choice” cesarean—if a cesarean can be considered without any medical indications, the threshold may be lower when at least a possible indication exists.
Based on a defined regional population, labor induction initially was associated with increased risk of cesarean delivery regardless of the definition of the comparison group. The association lessened with progressive adjustment and, in the all-above comparison group, eventually became nonsignificant. When studying outcomes of induction, if one chooses to use an expectant comparison group rather than women spontaneously laboring the same week as a cohort of women induced, the expectant group should include women laboring during the index induction week. Although it is not possible to predict when a woman will enter spontaneous labor, some will in fact labor during the index week, and to exclude them biases against expectant management. Although the association between labor induction and cesarean delivery applies most strongly to nulliparous women, multiparous women also were at risk in this study. Labor inductions should be performed for specific indications, and women should be fully informed of the possible risks, including failed induction leading to cesarean delivery.