OBJECTIVE: To compare induction of labor at gestational age 41 weeks with expectant management in regard to neonatal morbidity. Secondary aims were to assess the effect of these managements on mode of delivery and maternal complications.
METHODS: Between September 2002 and July 2004, postterm women with singleton cephalic presentation and no prelabor rupture of membranes were randomly assigned to induction of labor at 289 days or antenatal fetal surveillance every third day until spontaneous labor. Main outcome measures were neonatal morbidity, operative delivery rates, and maternal complications.
RESULTS: Five hundred eight women were randomly assigned, 254 in each group. No differences of clinical importance were observed in women in whom labor was induced compared with women who were expectantly managed with regard to the following outcomes: neonates whose 5-minute Apgar score was less than 7 (three neonates in the induction group compared with four in the monitoring group, P=.72); neonates whose umbilical cord pH was less than 7 (three compared with two, P=.69); prevalence of cesarean delivery (28 compared with 33, P=.50); or prevalence of operative vaginal delivery (32 compared with 27, P=.49). In the induction group more women had precipitate labors (33 compared with 12, P<.01; number needed to treat was 13), and the duration of second stage of labor was more often less than 15 minutes (94 compared with 56, P<.01; number needed to treat was 7).
CONCLUSION: No differences were found between the induced and monitored groups regarding neonatal morbidity or mode of delivery, and the outcomes were generally good.
CLINICAL TRIAL REGISTRATION: ClinicalTrials.gov, www.clinicaltrials.gov, NCT00385229
LEVEL OF EVIDENCE: I
There is no difference in neonatal or maternal outcome if postterm pregnancies are induced immediately at 41 weeks or monitored expectantly.
From the Departments of 1Obstetrics and Gynecology and 2Paediatrics, 3National Center for Fetal Medicine, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway; 4Unit for Applied Clinical Research, Faculty of Medicine and 5Department of Laboratory Medicine, Children’s and Women’s Health, Norwegian University of Science and Technology, Trondheim, Norway; and 6Department of Obstetrics and Gynecology, Sahlgrenska Academy, Gøteborg, Sweden.
The authors thank Dr. Beth Theting for valuable assistance in registering and examining the neonates and Dr. Jon Hyett for critical review of the manuscript.
Corresponding author: Runa Heimstad, Department of Obstetrics and Gynecology, St. Olavs Hospital, Trondheim University Hospital, 7006 Trondheim, Norway; e-mail: email@example.com.
Pregnancies continuing beyond term are associated with higher perinatal morbidity and mortality rates than those delivering at term.1,2 In a United Kingdom study, the rate of stillbirth increased from 0.35 in 1,000 to 2.12 in 1,000 ongoing pregnancies between 37 and 43 weeks gestation, respectively.3 Associated morbidity in postterm births includes an increased risk of fetal distress, shoulder dystocia, labor dysfunction, and obstetric trauma (relative risk 1.09–1.68) and an increase in perinatal complications, such as meconium aspiration, asphyxia, bone fracture, peripheral nerve injury, pneumonia and septicemia (adjusted odds ratio 1.4–2.0).1,4
Strategies that may decrease the risk of adverse outcome include antenatal surveillance and induction of labor. Although a Cochrane review concluded that routine induction of labor at 41 weeks gestation seemed to reduce perinatal mortality (odds ratio 0.20, 95% confidence interval 0.06–0.70),5 induction was associated with other obstetric complications.6,7 In a recent review of term and postterm pregnancies in Norway, we found that postterm pregnancy and induction of labor were independently associated with an adverse outcome.8 A randomized controlled trial comparing induction of labor to continued antenatal surveillance showed no difference in neonatal outcome, but demonstrated a reduction in the cesarean delivery rate when labor was induced at 41 weeks.9 Although these data have had a significant effect on obstetric practice, other groups have in fact shown an increase in the cesarean delivery rate when labor is induced.10,11 Thus, the observed increase in cesarean deliveries in many developed countries may partially be explained by a change in obstetric practice toward more aggressive induction policies of postterm pregnancies.
The aim of this study was to investigate whether induction of labor at gestational age 289 days (41 weeks plus 2 days) reduces neonatal morbidity compared with expectant management. A secondary aim was to assess the effect of induction of labor and expectant management on the mode of delivery and maternal complications.
MATERIALS AND METHODS
Women with a postterm pregnancy who agreed to participate in the study were randomly assigned to either immediate induction of labor (booked the following day) or to antenatal fetal monitoring (every 3 days) while awaiting spontaneous labor. The study took place at St. Olavs Hospital, Trondheim University Hospital, between September 2002 and July 2004. In Norway, pregnancies are dated by the 18-week ultrasound scan, and the duration of pregnancy is defined as 282 days (40 weeks plus 2 days). Women received information about the study with their appointment for the routine ultrasound scan, inviting those who went beyond their estimated date of delivery to attend for a postterm follow-up at 289 days of gestation. The clinics ran 5 days per week, so effectively, women were seen at 289±2 days.
The study included women with singleton pregnancies who had had their routine ultrasound scan and delivery at St. Olavs Hospital and who spoke fluent Norwegian. The study was confined to pregnancies with a cephalic presentation with no history of prelabor rupture of membranes. The study was approved by the Committee for Medical Research Ethics of Health Region IV, Norway and all participants gave written consent. Computerized randomization to either antenatal fetal monitoring or induction of labor was performed before postterm pregnancy assessment occurred and was done by the University Hospital Clinical Trials Office using blocks of 16 with no stratification.
Women assigned to both arms of the study had the same baseline assessment: an ultrasound scan (estimated fetal weight and amniotic fluid volume), a cardiotocogram, and a clinical vaginal examination. For women assigned to continued antenatal assessment, induction of labor was arranged if the cardiotocogram recordings were abnormal, the estimated fetal weight was more than 2 standard deviations below the mean, or oligohydramnios was found (amniotic fluid index less than 5 cm or single deepest pocket less than 2 cm). If these investigations were reassuring, the women were reassessed every third day until spontaneous delivery occurred or until labor was induced on day 300.
Women who had a favorable cervix (Bishop score 6 or more) were induced by amniotomy followed by oxytocin (Syntocinon, Novartis, East Hanover, NJ) infusion. Women with an unfavorable cervix (Bishop score less than 6) had cervical priming using misoprostol (prostaglandin E1 analog, Cytotec, Searle, Chicago, IL, 50 mcg pessary encased in a gelatin capsule) at 6-hour intervals in the posterior fornix. A maximum of four doses was given in a 24-hour period, and cervical priming was continued for a maximum of 2 days. Once the cervix was favorable, amniotomy and oxytocin infusion were used. Women with a uterine scar were induced with 0.5 mg dinoprostone (prostaglandin E2, Minprostin endocervical gel, Pfizer, New York, NY) given intracervically every 12 hours.
Women being induced, or being assessed on admission in labor, had a cardiotocogram. If this was abnormal, or if meconium was seen after rupture of membranes, then continuous electronic fetal monitoring was recommended. Antenatal, intrapartum, and postnatal data were collected on a single chart that accompanied the patient through this period and was completed by the staff contemporaneously. The charts were designed to allow automatic optical recognition and transmission to a computer database based in the University Clinical Trials Office. The process of data transmission was checked for error by two of the authors (R.H. and E.S.).
Neonatal morbidity was primarily defined by assessing a series of relevant outcomes (Table 1). Information about the presence of meconium, birth weight, crown-heel length, Apgar scores, umbilical cord pH, and base excess and the need for resuscitation were recorded immediately after delivery. Supplementary medical information was obtained from the routine pediatric examination on the first or second day, and if admitted, from the neonatal intensive care unit (NICU).
Neonatal morbidity was scored by evaluating the degree of deviation from the potential of a perfect outcome for each newborn. We defined a perfect outcome as being an infant with a birth weight of 3.8 kg and a Ponderal Index of 2.88 (a measure of weight relative to length obtained from 180 healthy newborns with gestational age 287 days or more acting as controls in a study of preeclampsia).12 Other optimal features for outcome were considered to be 1- and 5-minute Apgar scores of 10, umbilical cord pH 7.40 with base excess equal to 0 (zero), and no medical complications or need for treatment. To compare the study groups quantitatively we assigned a priori weights to the outcome variables (Table 1) based on clinical judgment and consensus among the researchers. The sum in each neonate constitutes a Neonatal Morbidity score (see the Box), which increases with increasing morbidity. One Neonatal Morbidity unit corresponds to, for example, the presence of meconium, a one-point decrease in the 5-minute Apgar score, or a pH decrease of 0.1 (an example of calculation is given, see the Box). We further calculated the Canadian Multicenter Post-term Pregnancy neonatal morbidity index developed by Hannah et al,9 in which morbidity was defined when the upper 2- percentile was exceeded.
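As a rough illustration of how such an additive score behaves, the sketch below sums only the three one-unit equivalences stated above (presence of meconium, 5-minute Apgar score, and umbilical cord pH). It is not the full Neonatal Morbidity score: the remaining items and their weights are given in Table 1, and the function name, the argument names, and the choice to ignore pH values above 7.40 are assumptions made for this example.

```python
def partial_morbidity_units(meconium: bool, apgar_5min: int, cord_ph: float) -> float:
    """Hypothetical helper: sum three example components of the score.

    One unit each for: meconium present, every point the 5-minute Apgar
    score falls below 10, and every 0.1 the cord pH falls below 7.40.
    """
    units = 1.0 if meconium else 0.0
    units += 10 - apgar_5min
    # Assumption: only a pH below the optimum of 7.40 adds morbidity units.
    units += max(0.0, (7.40 - cord_ph) / 0.1)
    return units


# Meconium present, 5-minute Apgar score of 9, cord pH 7.20.
example = partial_morbidity_units(meconium=True, apgar_5min=9, cord_ph=7.20)
print(round(example, 2))
```

Under these assumptions, the example neonate contributes 1 + 1 + 2 = 4 units from these three items alone.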
Severe perineal lacerations were defined as third- and fourth-degree perineal laceration during delivery. Maternal hemorrhage was defined as blood loss more than 500 mL at delivery. Uterine contraction abnormalities were defined as prolonged first stage of labor (less than 1 cm cervical dilatation per 1.5 hours), prolonged active second stage of labor (longer than 1 hour) and short active second stage of labor (less than 15 minutes). Precipitate labor was defined as a total length of labor less than 3 hours.
Continuous variables are presented as mean with standard deviation (SD), ordinal variables as median with range, and categorical variables as numbers and percentage. Unless otherwise stated, the study groups were compared according to intention to treat. Student t test was employed for continuous variables, the Mann-Whitney nonparametric test for ordinal variables with exact implementation in case of ties,13 and Fisher exact tests with mid P values for 2-by-2 tables. The number needed to treat was calculated as the inverse of the absolute risk reduction. SPSS 12 (SPSS Inc., Chicago, IL) was employed for the descriptive analyses, and the statistical software R14 for the remainder.
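For readers unfamiliar with mid P values, the following is a minimal sketch of one common convention for a 2-by-2 table: the one-sided exact (hypergeometric) tail probability, counting only half the probability of the observed table, doubled to give a two-sided value. The text does not state which two-sided convention was used, so the doubling rule, the function names, and the argument layout here are assumptions.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(a: int, n1: int, n2: int, k: int) -> Fraction:
    """Probability of a events in group 1, given k events in total."""
    return Fraction(comb(n1, a) * comb(n2, k - a), comb(n1 + n2, k))


def mid_p_two_sided(events1: int, n1: int, events2: int, n2: int) -> Fraction:
    """Doubled one-sided mid-P value for a 2-by-2 table (assumed convention)."""
    k = events1 + events2
    lowest, highest = max(0, k - n2), min(k, n1)
    pmf = {a: hypergeom_pmf(a, n1, n2, k) for a in range(lowest, highest + 1)}
    observed = pmf[events1]
    # Each tail counts the observed table with half weight.
    lower = sum(p for a, p in pmf.items() if a < events1) + observed / 2
    upper = sum(p for a, p in pmf.items() if a > events1) + observed / 2
    return min(Fraction(1), 2 * min(lower, upper))


# Cesarean deliveries: 28 of 254 (induction) compared with 33 of 254 (monitoring).
print(float(mid_p_two_sided(28, 254, 33, 254)))
```

Exact rational arithmetic is used so that no rounding error enters the tail sums; the result is converted to a float only for display.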
Missing observations were handled as follows: the lowest weight during the first week was set as equal to birth weight, whereas pH and base excess were imputed using the EM algorithm of the R package Norm.14 The R package Epitools was used to calculate relative risks with 95% confidence intervals. Two-tailed tests were used throughout, and a P value less than .05 was considered statistically significant, with no adjustment for multiple comparisons.
The project had a time frame of about 2 years, which made it realistic to include about 500 patients. This sample size was considered acceptable in light of the results from two (unpublished) pilot studies (retrospective, n=20, and prospective, n=29), undertaken before this study to evaluate the feasibility of the Neonatal Morbidity score. The prospective pilot was also undertaken to evaluate the single chart and the practical feasibility of the study as part of the methodologic quality assurance. Both pilot studies found a mean difference of about 1.5 (with SD approximately 4) on the Neonatal Morbidity scale in favor of induction, which we found to be clinically relevant. It corresponds to a standardized difference (mean difference/SD) of .38. Detecting a standardized difference of .3 with a power of 80% at a two-tailed significance level of 5% requires 176 subjects in each group. The final sample size of 254 in each group enables the detection of a standardized difference of .25 (ie, approximately 1.0 on the Neonatal Morbidity scale) and an absolute difference of 10% regarding operative deliveries.15
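The sample size figures above can be approximately reproduced with the usual normal-approximation formula for comparing two means, n = 2(z_alpha/2 + z_power)^2 / d^2 per group, plus a small-sample correction of z_alpha/2^2 / 4. The published figures were taken from sample size tables,15 so the formula, the correction term, and the function names below are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_sum_squared(alpha: float, power: float) -> float:
    """(z for two-tailed alpha + z for power) squared."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects per group needed to detect standardized difference d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(2 * z_sum_squared(alpha, power) / d ** 2 + z_alpha ** 2 / 4)


def detectable_difference(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Standardized difference detectable with n subjects per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return sqrt(2 * z_sum_squared(alpha, power) / (n - z_alpha ** 2 / 4))


print(n_per_group(0.3))                       # 176
print(round(detectable_difference(254), 2))   # 0.25
```

With an SD of approximately 4, a standardized difference of .25 corresponds to about 1.0 on the Neonatal Morbidity scale.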
RESULTS
A total of 614 women were assessed for the study; of these, 604 met eligibility criteria, and 508 agreed to participate (Fig. 1). Ten women were not eligible because they did not speak Norwegian fluently. Forty-seven women declined participation, and 49 were excluded because of logistic issues: could not be contacted (n=30), randomization office closed (n=6), no capacity for inclusion at the labor ward (n=8), long distance to hospital (n=3), and false contractions (n=2). A total of 254 women were assigned to induction of labor, 36 of whom went into spontaneous labor before admission for induction occurred. Of the 254 women assigned to continued antenatal monitoring, 59 were induced due to medical reasons (46 oligohydramnios, seven prelabor rupture of membranes, two preeclampsia, two social indications, two abnormal cardiotocogram), and 19 women were induced according to the protocol at gestational day 300. The gestational age at delivery for women assigned to continued antenatal monitoring is listed in Table 2.
Baseline demographic and clinical characteristics for the two groups are presented in Table 3. There were no differences between the groups with respect to ethnicity, maternal age, maternal body mass index, parity, gestational age, or estimated fetal weight. The medical interventions used to induce labor are shown in Table 4. In some cases more than one intervention was involved.
No participants were lost to follow-up. Although there were no stillbirths among women who consented to participate in the trial, there was one fetal death in an eligible woman who had declined to participate because she wanted a natural course of pregnancy. In this case, clinical review at 295 days of gestation was unremarkable, but she attended 4 days later with an intrauterine death. The postmortem examination suggested death was due to antenatal asphyxia. There was one neonatal death of a neonate assigned to antenatal monitoring who had had a normal postterm assessment both at the time of recruitment and 3 days later. Spontaneous labor led to admission at 294 days of gestation, and during labor the cardiotocogram that initially had been normal deteriorated, with a prolonged bradycardia. An emergency cesarean delivery was performed within 15 minutes, and Apgar scores at 1, 5, and 10 minutes were 1, 3, and 4; umbilical arterial pH was 7.14, and base excess was –5.5. Clinical examination postnatally together with appropriate investigations demonstrated very abnormal brain function and active care was withdrawn, resulting in the death of the neonate at 2 days of age. Postmortem confirmed that the cause of death was birth asphyxia secondary to a true knot in the umbilical cord.
Neonatal outcome was essentially similar in the study groups (Table 1). There was a trend toward more frequent meconium-stained amniotic fluid during labor in women who had antenatal surveillance (n=87 compared with 69, P=.08). Although there were no neonates less than 2,500 g in either group, mean birthweight tended to be higher in the monitored group (4,032 g and 3,964 g, respectively, P=.09), and the mean length was 0.4 cm longer (52.1 compared with 51.7, P<.01). A total of 30 neonates (6%) were admitted to the NICU, but there was no significant difference in the rate of admission between the two groups. Apgar scores, umbilical blood pH, and base excess were similar in the two groups, as were medical problems. The mean Neonatal Morbidity score in the induction group was 8.85, compared with 9.10 in the monitoring group (SD 4.8; P=.56; 95% confidence interval for the difference –1.10 to 0.59). The upper 2- percentile of the Canadian Multicenter Post-term Pregnancy morbidity index was exceeded by four patients in the induction group and seven patients in the monitored group (P=.38). No neonates had major congenital anomalies.
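The comparison of mean Neonatal Morbidity scores can be roughly checked from the summary statistics alone. The sketch below assumes the reported SD of 4.8 applies to both groups and uses a normal approximation rather than the Student t test used in the analysis, so it is expected to agree with the published values only approximately. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_summary(mean1, mean2, sd, n1, n2, level=0.95):
    """Difference in means with a normal-approximation CI and two-tailed P."""
    diff = mean1 - mean2
    se = sd * sqrt(1 / n1 + 1 / n2)            # assumes a common SD
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * NormalDist().cdf(-abs(diff) / se)
    return diff, diff - z * se, diff + z * se, p_value


# Induction group mean 8.85, monitoring group mean 9.10, SD 4.8, 254 per group.
diff, lower, upper, p_value = mean_difference_summary(8.85, 9.10, 4.8, 254, 254)
print(round(diff, 2), round(lower, 2), round(upper, 2), round(p_value, 2))
```

The difference is negative (induction minus monitoring), and the approximate interval runs from about –1.1 to about 0.6, spanning zero.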
There was no difference between the study groups in the rate of cesarean delivery (28 and 33 in the induction and monitoring group, respectively, P=.50) or operative vaginal delivery (32 compared with 27, P=.49) (Table 5). Eighteen (7.1%) of the women in the induction group had oligohydramnios at the time of randomization compared with 45 (17.7%) who developed this during serial monitoring (P<.01; number needed to treat was 10). The induced group included more precipitate labors (33 compared with 12 cases, respectively; P<.01; number needed to treat was 13), more short active second stage labors (94 compared with 56 cases, respectively; P<.01; number needed to treat was 7) and fewer long active second stage labors (45 compared with 71 cases respectively; P<.01; number needed to treat was 10) (Table 5). There were no other significant differences between the two groups.
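The numbers needed to treat quoted in this paragraph follow from the inverse of the absolute risk difference, rounded up to the next whole number. A minimal sketch, assuming 254 women per group as the denominator for every outcome (the function name is hypothetical):

```python
from math import ceil


def number_needed_to_treat(events_a: int, events_b: int, n_per_group: int = 254) -> int:
    """Inverse of the absolute risk difference, rounded up."""
    risk_difference = abs(events_a / n_per_group - events_b / n_per_group)
    return ceil(1 / risk_difference)


print(number_needed_to_treat(18, 45))   # oligohydramnios: 10
print(number_needed_to_treat(33, 12))   # precipitate labor: 13
print(number_needed_to_treat(94, 56))   # short active second stage: 7
print(number_needed_to_treat(45, 71))   # long active second stage: 10
```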
The statistical analyses were performed on an intention-to-treat basis, but secondary analyses limited to those cases that followed the management directed by randomization gave similar results. Secondary analyses of cases induced for medical reasons (combined from both study groups) found a higher prevalence of meconium-stained amniotic fluid (39%), cesarean delivery (24%), and operative vaginal delivery (13%) compared with the whole population.
DISCUSSION
This study has shown that policies of immediate induction of labor or serial antenatal monitoring while awaiting spontaneous labor produce no significant difference in the neonatal outcome or mode of delivery of postterm pregnancies. Women who were induced were more likely to have a precipitate labor with a shorter active second stage, although these factors did not alter neonatal outcome.
The study of postterm pregnancies is complicated by the fact that both the normal duration of pregnancy and the best method to define the estimated date of delivery remain controversial.16 In Norway, the normal duration of pregnancy is defined as 282 days, in line with several studies examining the mean and mode of pregnancy duration.17,18 Similarly, ultrasonography has been shown to be the method of choice for defining the estimated date of delivery (where equipment and trained personnel are available) and is recognized to reduce the number of pregnancies defined as being postterm.19–21 In Norway, pregnancies are routinely dated by ultrasonography at the 18-week scan, and we defined the point for investigating the postmature pregnancy as being one complete week beyond the normal duration of pregnancy, ie, 289 days.
Several studies have demonstrated that the risks of stillbirth and perinatal mortality increase beyond 41 weeks of gestation.2,3,22,23 Despite this, the absolute mortality rate remains quite low, and it would be necessary to complete at least 500 inductions at 41 weeks to prevent one neonatal death.5 Consequently, the maternal and fetal morbidity related to induction of labor are potentially important issues and were the subject of investigation here. This study was not designed to examine differences in mortality. Indeed, there were no fetal deaths in the study, and the only neonatal death related to birth asphyxia was secondary to a true knot in the cord. This death would probably not have been avoided by induction a few days earlier.
In the process of designing this study we reviewed the methodology used by other investigators to define neonatal morbidity. We found most scoring systems unsuitable for a quantitative comparison of the study groups,24 among other things because they focus on premature infants, NICU admission, and mortality. The Neonatal Morbidity score was based on suboptimal outcomes which have been described as being associated with postterm delivery. This scoring system was tested in two pilot studies before the randomized trial. To quantify neonatal outcome we constructed the “perfect infant” and defined multiple criteria to assess the deviation from this (Table 1). The weight ascribed to various criteria could be debated, but the fundamental design of the trial—including randomization and a priori definitions—ensures a valid group comparison. The morbidity index used by Hannah et al9 was also calculated, but the index is difficult to interpret clinically and provided no further information on neonatal morbidity.
There was no difference between the study groups in the rate of operative vaginal delivery, which is in accordance with the Canadian Multicenter Post-term Pregnancy trial.9 The Canadian Multicenter Post-term Pregnancy trial concluded that routine induction of labor at 41 weeks of gestation resulted in a lower rate of cesarean delivery, but we were unable to confirm this.9 In the Canadian Multicenter Post-term Pregnancy trial, prostaglandin E2 gel was used for cervical ripening in the induction group, whereas the preferred induction method was amniotomy or oxytocin in the monitored group. It has been argued that the difference in the treatment protocol was reflected in different cesarean delivery rates because an unripe cervix is a risk factor for cesarean delivery.9,25,26 In the present study, the methods of induction were similar in both groups. The lack of a difference in cesarean rates may be due to a lower prevalence of cesarean delivery in Norway (11–13%) than that described in the hospitals in the Canadian Multicenter Post-term Pregnancy trial (21–25%). However, the results of the present study argue against a belief that the observed increase in cesarean delivery rates in many developed countries is related to a change in obstetric practice toward more aggressive induction policies of postterm pregnancies.
A limitation of the study is that group assignment could not be concealed. Thus the obstetricians’ personal opinions on postterm pregnancies may have influenced treatment. Further, one third of the monitored women were eventually induced for a medical indication. A lack of serial antenatal surveillance may have led to poorer outcomes. Induction for medical reasons was associated with increased intervention, but the label of a “high-risk” pregnancy may have influenced obstetric decision making. Similarly, the recognition of risk factors such as meconium-stained liquor resulted in the use of continuous electronic fetal monitoring rather than an intermittent cardiotocogram, and this may also have affected intervention rates.27
In conclusion, there was no difference in the neonatal outcome or mode of delivery for postterm pregnancies managed either by immediate induction of labor or expectantly with serial antenatal surveillance. The outcomes were generally good, and neonatal morbidity, cesarean, and operative vaginal delivery rates were low. If pregnancy is uncomplicated and continued surveillance is possible, women’s own wishes may guide the decision to induce or monitor a pregnancy beyond 41 weeks.
REFERENCES
1. Campbell MK, Ostbye T, Irgens LM. Post-term birth: risk factors and outcomes in a 10-year cohort of Norwegian births. Obstet Gynecol 1997;89:543–8.
2. Ingemarsson I, Källén K. Stillbirths and rate of neonatal deaths in 76,761 postterm pregnancies in Sweden, 1982–1991: a register study. Acta Obstet Gynecol Scand 1997;76:658–62.
3. Hilder L, Costeloe K, Thilaganathan B. Prolonged pregnancy: evaluating gestation-specific risks of fetal and infant mortality. Br J Obstet Gynaecol 1998;105:169–73.
4. Olesen AW, Westergaard JG, Olsen J. Perinatal and maternal complications related to postterm delivery: a national register-based study, 1978-1993. Am J Obstet Gynecol 2003;189:222–7.
5. Crowley P. Interventions for preventing or improving the outcome of delivery at or beyond term. Cochrane Database Syst Rev 2000;2:CD000170.
6. Saunders N, Paterson C. Effect of gestational age on obstetric performance: when is “term” over? Lancet 1991;338:1190–2.
7. Heffner LJ, Elkin E, Fretts RC. Impact of labor induction, gestational age, and maternal age on cesarean delivery rates. Obstet Gynecol 2003;102:287–93.
8. Heimstad R, Romundstad PR, Eik-Nes SH, Salvesen KA. Outcomes of pregnancy beyond 37 weeks of gestation. Obstet Gynecol 2006;108:500–8.
9. Hannah ME, Hannah WJ, Hellmann J, Hewson S, Milner R, Willan A. Induction of labor as compared with serial antenatal monitoring in post-term pregnancy. A randomized controlled trial. The Canadian Multicenter Post-term Pregnancy Trial Group. N Engl J Med 1992;326:1587–92.
10. Yogev Y, Ben-Haroush A, Gilboa Y, Chen R, Kaplan B, Hod M. Induction of labor with vaginal prostaglandin E2. J Matern Fetal Neonatal Med 2003;14:30–4.
11. Leung WC, Lao TT. Routine induction of labour at 41 weeks of gestation: nonsensus consensus. BJOG 2002;109:1416–7.
12. Odegård RA, Vatten LJ, Nilsen ST, Salvesen KA, Austgulen R. Umbilical cord plasma leptin is increased in preeclampsia. Am J Obstet Gynecol 2002;186:427–32.
13. Hollander M, Wolfe DA. Two-sample location problem. In: Hollander M, Wolfe DA, editors. Nonparametric statistical methods. New York (NY): Wiley; 1999.
14. R Development Core Team. A language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing; 2006.
15. Machin D, Campbell MJ, Fayers PM, Pinol AP, editors. Sample size tables for clinical studies. Oxford (UK): Blackwell Science; 1997.
16. Nguyen TH, Larsen T, Engholm G, Møller H. Evaluation of ultrasound-estimated date of delivery in 17,450 spontaneous singleton births: do we need to modify Naegele’s rule? Ultrasound Obstet Gynecol 1999;14:23–8.
17. Tunón K, Eik-Nes SH, Grøttum P. A comparison between ultrasound and a reliable last menstrual period as predictors of the day of delivery in 15,000 examinations. Ultrasound Obstet Gynecol 1996;8:178–85.
18. Smith GC. Use of time to event analysis to estimate the normal duration of human pregnancy. Hum Reprod 2001;16:1497–500.
19. Gardosi J, Geirsson RT. Routine ultrasound is the method of choice for dating pregnancy. Br J Obstet Gynaecol 1998;105:933–6.
20. Eik-Nes SH, Salvesen KA, Okland O, Vatten LJ. Routine ultrasound fetal examination in pregnancy: the “Alesund” randomized controlled trial. Ultrasound Obstet Gynecol 2000;15:473–8.
21. Gardosi J, Vanner T, Francis A. Gestational age and induction of labour for prolonged pregnancy. Br J Obstet Gynaecol 1997;104:792–7.
22. Divon MY, Haglund B, Nisell H, Otterblad PO, Westgren M. Fetal and neonatal mortality in the postterm pregnancy: the impact of gestational age and fetal growth restriction. Am J Obstet Gynecol 1998;178:726–31.
23. Smith GC. Life-table analysis of the risk of perinatal death at term and post term in singleton pregnancies. Am J Obstet Gynecol 2001;184:489–96.
24. Fleisher BE, Murthy L, Lee S, Constantinou JC, Benitz WE, Stevenson DK. Neonatal severity of illness scoring systems: a comparison. Clin Pediatr (Phila) 1997;36:223–7.
25. Alexander JM, McIntire DD, Leveno KJ. Prolonged pregnancy: induction of labor and cesarean births. Obstet Gynecol 2001;97:911–5.
26. Shin KS, Brubaker KL, Ackerson LM. Risk of cesarean delivery in nulliparous women at greater than 41 weeks’ gestational age with an unengaged vertex. Am J Obstet Gynecol 2004;190:129–34.
27. Vintzileos AM, Nochimson DJ, Guzman ER, Knuppel RA, Lake M, Schifrin BS. Intrapartum electronic fetal heart rate monitoring versus intermittent auscultation: a meta-analysis. Obstet Gynecol 1995;85:149–55.
© 2007 by The American College of Obstetricians and Gynecologists. Published by Wolters Kluwer Health, Inc. All rights reserved.