Prematurity is the major contributor to perinatal morbidity and mortality in the United States. Therefore, identifying women at risk for preterm delivery remains an issue of paramount importance. Recent investigators have described an inverse relationship between cervical length by endovaginal ultrasonography and the likelihood of subsequent preterm birth.1–4 Digital cervical examination may also provide significant clinical information, while avoiding the cost and logistical difficulties of serial transvaginal ultrasonography. We sought to estimate the best use of the information obtained by digital examination in terms of predicting preterm delivery risk.
Antepartum digital cervical examination traditionally has been performed using the Bishop score calculated by assessment of dilatation, effacement, consistency of the cervix, its position, and the station of the presenting part.5 The cervical score described by Houlton in 1982 attempted to refine the information available from the digital cervical examination by replacing effacement with length as a descriptor of the unlabored cervix and ignoring the more subjective parameters of consistency, position, and station.6 The cervical score places a greater emphasis on cervical length, whereas cervical effacement is only one of five components of the Bishop score, which was originally developed as an evaluation of the inducibility of the cervix at term rather than as a predictor of preterm birth.
The objective of this investigation was to compare the ability of these two digital cervical assessment scores to predict spontaneous preterm delivery in a large low-risk singleton population. We hypothesized that the cervical score would have superior predictive capability for spontaneous preterm delivery.
MATERIALS AND METHODS
A prospective cohort study was conducted among 2,916 women with singleton pregnancies enrolled in a multicenter (10 sites) preterm prediction study between October 1992 and July 1994 by the National Institute of Child Health and Human Development Maternal-Fetal Medicine Units Network. A secondary analysis was performed of data available from that study. The investigation was approved by the human subjects review board at each institution. All participating women provided written informed consent. Inclusion criteria were singleton gestations with intact membranes enrolled between 22 and 24 weeks of gestation. All women underwent an ultrasound examination before enrollment to confirm the last menstrual period dating criteria or to establish the duration of gestation if those criteria could not be confirmed. Race and parity distributions reflected each participating center, with no single center contributing more than 20% of the total study population. Exclusion criteria were fetal demise, congenital malformation, placenta previa, multifetal gestation, cervical cerclage, human immunodeficiency virus positivity, preterm labor, preterm premature rupture of the membranes, prolapse of the fetal membranes, or plans to deliver away from the clinical center. The sample size was calculated assuming a risk of premature delivery before 35 weeks of gestation of 3.5%, that at least 5% of the women would have positive results on any given screening test, and that the odds ratio for premature delivery was 2.0 or more for women with positive results on that screening test as compared with women with negative results. A sample of 3,000 women was chosen to give a lower 95% confidence interval limit greater than 1 for this odds ratio.
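The sample-size reasoning above can be sketched numerically. This is a minimal sketch that assumes a Woolf (log odds ratio) approximation applied to the expected 2×2 table; the text does not state which method the investigators used, and the function name and the bisection step are illustrative only.

```python
import math

def expected_or_lower_limit(n, overall_risk, frac_positive, odds_ratio, z=1.959964):
    """Lower 95% confidence limit of the odds ratio for the expected 2x2 table.

    Assumption: Woolf's large-sample standard error for log(OR).
    """
    # Find the risk among screen-negative women (p0) such that the
    # overall risk matches, given the odds ratio for screen-positive women.
    lo, hi = 0.0, overall_risk
    for _ in range(100):  # simple bisection; the blended risk rises with p0
        p0 = (lo + hi) / 2
        p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
        blended = frac_positive * p1 + (1 - frac_positive) * p0
        if blended > overall_risk:
            hi = p0
        else:
            lo = p0

    n_pos = n * frac_positive
    n_neg = n - n_pos
    a, b = n_pos * p1, n_pos * (1 - p1)   # positive screen: delivered early / did not
    c, d = n_neg * p0, n_neg * (1 - p0)   # negative screen: delivered early / did not
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(math.log(odds_ratio) - z * se_log_or)

# Inputs taken from the stated design assumptions: 3,000 women, 3.5% risk,
# 5% screen positive, odds ratio 2.0.
lower = expected_or_lower_limit(3000, 0.035, 0.05, 2.0)
print(f"Expected lower 95% limit: {lower:.3f}")
```

Under these assumptions the lower limit falls only slightly above 1, which is consistent with 3,000 being chosen as the planned sample size.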
All 2,916 enrolled patients underwent a second-trimester digital cervical examination at 22–24 weeks of gestation. Of these, 2,538 underwent repeat digital cervical examination in the early third trimester between 26 weeks and 29 weeks. The digital examinations were performed by designated study personnel at each clinical site. The study personnel were trained and certified according to study protocol. Before initiation of the study, each center designated one examiner to be the “standard” to which all cervical examiners were compared. This person was usually the institutional principal investigator, but it had to be an individual with at least 5 years of experience in cervical examinations. For a period of time the “standard” examiner and study nurse assessed cervical dilation, length, station, consistency, and position together to establish consistent agreement between both examiners. At that point, 10 patients were examined independently, and the detailed results were recorded on the Cervical Examination Standardization Form by each examiner. These forms were collected and mailed to the George Washington Biostatistics Coordinating Center for certification. The examinations were compared for consistency between the examiners using prespecified criteria for agreement. If the results of the paired examinations were not consistent, the study nurse was required to complete more cervical examinations in conjunction with the “standard” examiner. During the study, approximately every 12 weeks, a sample of patients was randomly chosen by the George Washington Biostatistics Coordinating Center for verification of continuing consistency as for the initial certification. All study personnel conducting cervical examinations had to be both initially and continuously certified. A manual of operations was developed describing the technique for evaluating each element of the Bishop score.
The recorded findings from each cervical examination were used for calculation of both the Bishop score and cervical score centrally by the George Washington Biostatistics Coordinating Center (Table 1).5,6
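As an illustration of the simpler of the two calculations, the sketch below computes a cervical score from the recorded findings. It assumes the commonly cited definition of the cervical score as cervical length in centimeters minus dilatation of the internal os in centimeters; Table 1 is not reproduced here, so this definition, the function names, and the example measurements are assumptions rather than the study's own code. The 1.5 cut point is the threshold reported later in this article.

```python
ABNORMAL_CUT = 1.5  # threshold reported in this article for an abnormal cervical score

def cervical_score(length_cm: float, internal_os_dilation_cm: float) -> float:
    """Cervical score, assumed here to be length minus internal os dilatation (both in cm)."""
    return length_cm - internal_os_dilation_cm

def is_abnormal(score: float, cut: float = ABNORMAL_CUT) -> bool:
    """A score strictly below the cut point is treated as an abnormal examination."""
    return score < cut

# Hypothetical examination: 2.0 cm long cervix, internal os dilated 1.0 cm.
example_score = cervical_score(2.0, 1.0)
print(example_score, is_abnormal(example_score))
```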
The relationship between the Bishop score, cervical score, and spontaneous preterm delivery less than 37, less than 35, and less than 32 weeks of gestation was assessed using multivariable logistic regression analysis, adjusting for body mass index, African-American race, and previous preterm delivery to estimate odds ratios and 95% confidence intervals at both assessments (22–24 weeks and 26–29 weeks). Further adjustment for clinical center did not change the results. Spontaneous preterm delivery less than 35 weeks of gestation was selected as the primary outcome due to its relative frequency and the increased risk of neonatal morbidity with delivery earlier than this gestational age. Spontaneous preterm delivery was defined as those deliveries after spontaneous onset of preterm labor or preterm premature rupture of membranes.
Receiver operating characteristic (ROC) curves were created for both Bishop score and cervical score at both assessments. Appropriate cut points or diagnostic thresholds for both scores at each assessment were identified by visual inspection. Sensitivity, specificity, false-positive rate (ie, 1–specificity), and positive and negative predictive values for tests using these thresholds were estimated. A comparison of the ability of each test to classify patients correctly according to whether they experienced spontaneous preterm delivery less than 35 weeks of gestation was assessed using McNemar test for the chosen cut points and in comparison with transvaginal ultrasonographic cervical assessments previously performed.1 To examine the overall performance of Bishop score and cervical score for the entire range of cut points, the areas under the ROC curves for Bishop score and cervical score and the 95% confidence intervals were calculated at each assessment and compared.7 For all tests, a nominal two-tailed P<.05 was considered significant. No adjustments were made for multiple comparisons.
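The test characteristics listed above all derive from one 2×2 table per cut point. The following minimal sketch shows the arithmetic; the counts passed in are made-up placeholders for illustration and are not taken from the study tables.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Test characteristics for one diagnostic threshold.

    tp/fn: women who delivered preterm with a positive/negative test.
    fp/tn: women who did not deliver preterm with a positive/negative test.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=30, fp=120, fn=70, tn=2780)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Repeating this calculation across every candidate cut point gives the sensitivity and false-positive rate pairs that trace out an ROC curve.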
RESULTS
One hundred twenty-seven of the 2,916 enrolled patients (4.4%) undergoing digital cervical examination at 22–24 weeks had a spontaneous preterm delivery at less than 35 weeks of gestation. The demographic characteristics of this entire cohort have been previously described.1,3 Baseline characteristics for those women who did (n=127) and did not (n=2,789) experience spontaneous preterm delivery at less than 35 weeks of gestation are compared in Table 2.
Of the original 2,916 patients, 2,538 were reexamined at 26–29 weeks. Of these, 84 had a spontaneous preterm delivery at less than 35 weeks of gestation. The Bishop score and the cervical score were both significantly associated with spontaneous preterm delivery using an adjusted multivariable logistic regression analysis (Table 3). An increase in odds of spontaneous preterm delivery at less than 37, less than 35, and less than 32 weeks of gestation was observed per unit increase in the Bishop score at both the 22–24 week examination and the 26–29 week reexamination (Table 3). A decreased odds of spontaneous preterm delivery at less than 37, less than 35, and less than 32 weeks of gestation was observed per unit increase in the cervical score at both examinations (Table 3). These odds ratios are not directly comparable because the units for the two scores are not the same.
As the cut point for the Bishop score increases and as the cut point for the cervical score decreases, the specificity, positive predictive values, and odds ratios for spontaneous preterm delivery at less than 35 weeks of gestation increase; however, the sensitivity quickly declines for both tests. It is noted that a very small proportion (less than 10%) of the 127 patients with a spontaneous preterm delivery at less than 35 weeks had a Bishop score of 6 or more or a cervical score less than 1.0 between 22 and 24 weeks of gestation (Table 4). At 26–29 weeks of gestation, similar results were observed (Table 5). The areas under the ROC curves ranged between 0.61 and 0.68 (random assignment would have an area of 0.5). However, all adjusted odds ratios for spontaneous preterm delivery at less than 35 weeks of gestation exceeded 2.9 for abnormal examinations defined by Bishop score between 4 and 7 for 22–24 weeks and 5–7 for 26–29 weeks or cervical score between 0 and 1.5 for both 22–24 and 26–29 weeks of gestation.
Inspection of the ROC curve inflection points suggested that the optimal diagnostic thresholds for spontaneous preterm delivery less than 35 weeks at the 22–24 week examination were 4 or more for the Bishop score and less than 1.5 for the cervical score. The ROC curves for the 22–24 week examination are shown in Figure 1. At 22–24 weeks of gestation, the area under the ROC curve was significantly larger for the Bishop score than for the cervical score (0.66 compared with 0.61, P=.03), indicating that overall, Bishop score is a better diagnostic test at this gestation. However, McNemar test revealed that a cervical score less than 1.5 was superior to a Bishop score of 4 or more (P<.001) for the prediction of spontaneous preterm delivery at less than 35 weeks of gestation. For 86.6% of the patients, the timing of delivery was correctly predicted by both tests and for 4.4% of the patients both tests were wrong. For 7.4% of the patients the cervical score was correct whereas the Bishop score was wrong, and for 1.6% of the patients the Bishop score was correct whereas the cervical score was wrong.
At the 26–29 week reexamination, the chosen diagnostic thresholds for spontaneous preterm delivery less than 35 weeks were 5 or more for the Bishop score and less than 1.5 for the cervical score. The ROC curves for the 26–29 week examination are shown in Figure 2. The areas under the ROC curves were not statistically significantly different for the Bishop score and the cervical score (both 0.68, P=.90); however, McNemar test again revealed that a cervical score less than 1.5 was superior to a Bishop score of 5 or more at 26–29 weeks of gestation for prediction of spontaneous preterm delivery less than 35 weeks (P<.001). For 89.0% of the patients, the timing of delivery was correctly predicted by both tests and for 4.8% of the patients both tests were wrong. For 4.3% of the patients the cervical score was correct whereas the Bishop score was wrong, and for 2.0% of the patients the Bishop score was correct whereas the cervical score was wrong.
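The McNemar comparisons at the two examinations depend only on the discordant pairs, that is, patients for whom one score was correct and the other wrong. The sketch below rebuilds approximate discordant counts from the rounded percentages reported above (7.4% and 1.6% of 2,916 at 22–24 weeks; 4.3% and 2.0% of 2,538 at 26–29 weeks) and applies the uncorrected chi-square form of the test. The counts are reconstructions, not the study's raw data, and the text does not say whether the investigators used a continuity correction or an exact version.

```python
import math

def discordant_counts(n: int, pct_a_only: float, pct_b_only: float) -> tuple:
    """Approximate discordant-pair counts from rounded percentages of n patients."""
    return round(n * pct_a_only / 100), round(n * pct_b_only / 100)

def mcnemar(a_only: int, b_only: int) -> tuple:
    """Uncorrected McNemar chi-square (1 df) and its two-sided P value."""
    chi2 = (a_only - b_only) ** 2 / (a_only + b_only)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# "a" = cervical score correct, Bishop score wrong; "b" = the reverse.
counts_22_24 = discordant_counts(2916, 7.4, 1.6)
counts_26_29 = discordant_counts(2538, 4.3, 2.0)

chi2_22_24, p_22_24 = mcnemar(*counts_22_24)
chi2_26_29, p_26_29 = mcnemar(*counts_26_29)

print(counts_22_24, round(chi2_22_24, 1), p_22_24 < 0.001)
print(counts_26_29, round(chi2_26_29, 1), p_26_29 < 0.001)
```

With these reconstructed counts both comparisons give P values below .001, in line with the reported results.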
Ultrasonographic cervical length of 20 mm or less, funneling of the endocervical canal, and cervical score less than 1.5 at 26–29 weeks of gestation were compared as predictive tests for spontaneous preterm delivery less than 35 weeks (Table 6). McNemar test revealed superiority at 26–29 weeks for predicting spontaneous preterm delivery less than 35 weeks for a cervical score less than 1.5 compared with whether or not funneling was present on endovaginal ultrasound (P<.001). McNemar test revealed no test superiority when comparing a cervical score less than 1.5 and a cervical length 20 mm or less measured by ultrasonography. However, when the cervical score less than 1.5 and the cervical length 20 mm or less are compared in Table 6, it is noted that given the same specificity, the cervical score had a slightly higher sensitivity.
DISCUSSION
Although digital cervical examination has been proposed by some as a routine method of assessing preterm delivery risk,8–10 it has not been generally accepted for this role in the United States. Many consider the components of digital cervical examination to be unacceptably subjective with high intraobserver and interobserver variability.11,12 Additionally, digital evaluation is limited to the vaginal portion of the uterine cervix only. With the development of endovaginal cervical ultrasonography, it has been documented that digital examination underestimates the true cervical length by 10–20 mm.12–14 The supravaginal portion of the cervix immediately adjacent to the internal cervical os seems to be the part of the cervix that undergoes the earliest prelabor cervical change.1–3,15 Regardless of the clinician’s experience, it is possible that effacement of the cervix beginning in the region of the internal cervical os may not be detectable by digital palpation until significant change has already occurred.
Although it is true that the supravaginal portion of the cervix and the internal cervical os can be assessed more directly by the ultrasound probe than by the examining finger, it does not necessarily follow that digital examination is therefore an inferior predictor of preterm delivery risk. Measures of test function for a digital examination obtained at 26–29 weeks of gestation are very similar to those obtained from the same patients with endovaginal ultrasonographic cervical length measurements of 20 mm or less or the presence of endocervical funneling (Table 6). It has been previously published that endovaginal ultrasonographic cervical length measurements correlate significantly (P<.001 by the Jonckheere–Terpstra test) with the Bishop score.1 Subjects with a Bishop score of 6 or more at 24 or 28 weeks of gestation had mean cervical lengths of 23.1±13.3 mm and 25.4±11.2 mm, respectively.1 Differences in predictive capability that exist between endovaginal ultrasonography and digital cervical examination are small and may not justify the expense of endovaginal ultrasonography in low-risk obstetric populations, especially after 24 weeks of gestation. As a result, it becomes important to identify the most effective use of the information obtained by digital examination. In this investigation, although Bishop score was an overall better test than cervical score at 22–24 weeks, a cervical score of less than 1.5 in the early third trimester (26–29 weeks) was a better test for spontaneous preterm delivery less than 35 weeks of gestation compared with a Bishop score of 5 or more.
The superiority of the cervical score cutoff at 26–29 weeks is most likely a function of its focus on cervical length and dilatation as opposed to the Bishop score, which includes somewhat more subjective parameters, such as cervical consistency, position, and station of the presenting part. It is important to remember that the Bishop score was developed to estimate the inducibility of the cervix at term and not as a predictor of preterm delivery risk.5 When the components of the Bishop score are evaluated individually, dilatation and effacement have been found to be the two best predictors of both inducibility and gestational age at delivery.16,17 The cervical score, on the other hand, was developed specifically to assess preterm delivery risk6 and has been evaluated primarily in high-risk gestations. In multiple gestations, it was demonstrated that as the cervical score decreases, the mean time until delivery shortens.18,19 A cervical score of 0 or less between 20 and 28 weeks of gestation predicted a 92% risk of preterm delivery.19 It must be cautioned, however, that test functions for either the cervical score or the Bishop score will be significantly different in multifetal compared with low-risk singleton populations because of the much higher prevalence of premature delivery in multifetal gestations.
A possible limitation of this investigation relates to the risk status of this population. Given the demographics of the participating centers, the cohort is of relatively poor socioeconomic status, with a high percentage of African-American women. Almost 16% also had a history of prior spontaneous preterm delivery. This cohort may be at higher baseline risk than a private practice, community-based population. These population differences would be expected to increase the test’s positive predictive value. Another limitation may be that the predictive values of both the cervical score and the Bishop score were estimated based on routine scheduled examinations.3 The clinical use of the digital cervical examination will usually be in symptomatic women. The anticipated greater risk of women undergoing indicated clinical examination may alter the predictive qualities of this test.
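The dependence of positive predictive value on baseline risk follows directly from Bayes' rule, as the short sketch below shows. The sensitivity and specificity are hypothetical placeholders chosen only for illustration; 4.4% is the cohort's observed rate of spontaneous preterm delivery before 35 weeks, and 2% is an assumed rate for a lower-risk population.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV from Bayes' rule for a test with fixed sensitivity and specificity."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)

# Hypothetical test characteristics, held constant across populations.
SENS, SPEC = 0.35, 0.95

ppv_study_cohort = positive_predictive_value(SENS, SPEC, 0.044)  # observed cohort rate
ppv_lower_risk = positive_predictive_value(SENS, SPEC, 0.020)    # assumed lower-risk rate

print(f"PPV at 4.4% prevalence: {ppv_study_cohort:.3f}")
print(f"PPV at 2.0% prevalence: {ppv_lower_risk:.3f}")
```

With sensitivity and specificity unchanged, the same test yields a lower positive predictive value when applied to a population with a lower rate of preterm delivery.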
In this singleton population, neither cervical score nor Bishop score can be considered a strong diagnostic test. It is conventional wisdom that to be a useful test, the area under the ROC curve should equal or exceed 0.75; this level was not achieved by either score at either assessment. If the observed results held true, then among 1,000 patients, a cervical score of less than 1.5 compared with a Bishop score of 5 or more at 26–29 weeks would have correctly predicted a spontaneous preterm delivery in only one additional patient, although the cervical score would have correctly provided a reassuring result for about 20 additional women. The primary advantage of the cervical score is its relatively superior specificity, which allows for greater reassurance when the examination is normal. This difference, as well as the overall validity testing, is not compelling enough to justify routine digital examination of low-risk patients to determine a cervical score. However, cervical examinations are frequently performed in response to subjective maternal signs and symptoms. When a digital examination is performed for whatever reason, its predictive capability should be maximized. The cervical score provides the most information from that examination and, as an added benefit, is also an easier way to quantify cervical status.
A cervical score of less than 1.5 between 26 and 29 weeks of gestation would identify a very small portion of this population (less than 6%) as being at risk for early preterm delivery. One of five of these women would deliver before 35 weeks of gestation. Cervical scores that are even lower are a rare finding in the early third trimester but are associated with significantly higher positive predictive values. Although we are limited in our ability to prevent preterm labor, identification of women “at risk” is a prerequisite to any intervention. Identification of at-risk women early in the third trimester may allow the opportunity to employ interventions, such as increased surveillance for preterm labor and consideration for administration of antenatal corticosteroids. A low cervical score may also identify an “at risk” group of women who might benefit from weekly injections of 17 alpha-hydroxy progesterone caproate.20 Consideration should be given to using a cervical score of less than 1.5 in the mid or early third trimester as an inclusion criterion to study 17 alpha-hydroxy progesterone caproate and/or other interventions designed to reduce the risk of preterm birth.
REFERENCES
1. Iams JD, Goldenberg RL, Meis PJ, Mercer BM, Moawad A, Das A, et al. The length of the cervix and the risk of spontaneous premature delivery. National Institute of Child Health and Human Development Maternal Fetal Medicine Unit Network. N Engl J Med 1996;334:567–72.
2. Andersen HF, Nugent CE, Wanty SD, Hayashi RH. Prediction of risk for preterm delivery by ultrasonographic measurement of the cervical length. Am J Obstet Gynecol 1990;163:859–67.
3. Goldenberg RL, Iams JD, Mercer BM, Meis PJ, Moawad AH, Copper RL, et al. The preterm prediction study: the value of new vs standard risk factors in predicting early and all spontaneous preterm births. NICHD MFMU Network. Am J Public Health 1998;88:233–8.
4. Imseis HM, Albert TA, Iams JD. Identifying twin gestations at low risk for preterm birth with a transvaginal ultrasonographic cervical measurement at 24 to 26 weeks’ gestation. Am J Obstet Gynecol 1997;177:1149–55.
5. Bishop EH. Pelvic scoring for elective induction. Obstet Gynecol 1964;24:266–8.
6. Houlton MC, Marivate M, Philpott RH. Factors associated with preterm labour and changes in the cervix before labour in twin pregnancy. Br J Obstet Gynaecol 1982;89:190–4.
7. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.
8. Wood C, Bannerman RH, Booth RT, Pinkerton JH. The prediction of premature labor by observation of the cervix and external tocography. Am J Obstet Gynecol 1965;91:396–402.
9. Papiernik E, Bouyer J, Collin D, Winisdoerffer G, Dreyfus J. Precocious cervical ripening and preterm labor. Obstet Gynecol 1986;67:238–42.
10. Stubbs TM, Van Dorsten JP, Miller MC 3rd. The preterm cervix and preterm labor: relative risks, predictive values, and change over time. Am J Obstet Gynecol 1986;155:829–34.
11. Holcomb WL Jr, Smeltzer JS. Cervical effacement: variation and belief among clinicians. Obstet Gynecol 1991;78:43–5.
12. Jackson GM, Ludmir J, Bader TJ. The accuracy of digital examination and ultrasound in the evaluation of cervical length. Obstet Gynecol 1992;79:214–8.
13. Sonek JD, Iams JD, Blumenfeld M, Johnson F, Landon M, Gabbe S. Measurement of cervical length in pregnancy: comparison between vaginal ultrasonography and digital examination. Obstet Gynecol 1990;76:172–5.
14. Goldberg J, Newman RB, Rust PF. Interobserver reliability of digital and endovaginal ultrasonographic cervical length measurements. Am J Obstet Gynecol 1997;177:853–8.
15. Iams JD, Johnson FF, Sonek J, Sachs L, Gebauer C, Samuels P. Cervical competence as a continuum: a study of ultrasonographic cervical length and obstetric performance. Am J Obstet Gynecol 1995;172:1097–103.
16. Bouyer J, Papiernik E, Dreyfus J, Collin D, Winisdoerffer B, Gueguen S. Maturation signs of the cervix and the prediction of preterm birth. Obstet Gynecol 1986;68:209–14.
17. Lange AP, Secher NJ, Westergaard JG, Skovgard I. Prelabor evaluation of inducibility. Obstet Gynecol 1982;60:137–47.
18. Neilson JP, Verkuyl DA, Crowther CA, Bannerman C. Preterm labor in twin pregnancies: prediction by cervical assessment. Obstet Gynecol 1988;72:719–23.
19. Newman RB, Godsey RK, Ellings JM, Campbell BA, Eller DP, Miller MC 3rd. Quantification of cervical change: relationship to preterm delivery in the multifetal gestation. Am J Obstet Gynecol 1991;165:264–9.
20. Meis PJ, Klebanoff M, Thom E, Dombrowski MP, Sibai B, Moawad AH, et al. Prevention of recurrent preterm delivery by 17 alpha-hydroxy progesterone caproate. N Engl J Med 2003;348:2379–85.
© 2008 by The American College of Obstetricians and Gynecologists. Published by Wolters Kluwer Health, Inc. All rights reserved.