Detection of fetuses who will have birth weights of at least 4000 g is important because excessive growth is associated with prolonged labor,1 operative or traumatic delivery, and fetal neurologic injury.2 The two most common methods of predicting birth weight are clinical estimation and sonography.3,4 Clinicians can estimate fetal weight by palpating the abdomen, measuring the fundal height, and integrating their personal experience with women's obstetric histories. Ultrasound biometric measurements and regression equations can determine fetal weight objectively.5,6 In part because of the limited ability of those traditional techniques to identify fetuses above the threshold of 4000 g,7 five new sonographic methods have been proposed to detect excessive fetal growth: cheek-to-cheek diameter,8 fetal thigh subcutaneous tissue at the level of the femoral diaphysis, thigh soft tissue/femur length (FL) ratio,9 upper arm soft tissue thickness, and estimated fetal weight (EFW) derived from a formula that incorporates abdominal circumference (AC), FL, and upper arm soft tissue thickness.10 The limitations of those studies8–10 include not comparing the two traditional techniques with the newer models and not using a receiver operating characteristic (ROC) curve to assess the diagnostic tests.
The purpose of this prospective study was to compare the diagnostic accuracy of five new sonographic techniques with that of antenatal clinical and sonographic weight estimates in differentiating newborns who weighed less than 4000 g from those who weighed at least 4000 g.
Materials and Methods
After institutional review board approval, gravidas after 36 weeks' gestation who were suspected of having macrosomic fetuses were invited to participate. Suspicions of excessive growth were based on discrepancies between uterine fundal height and gestational age, gestational or pregestational diabetes mellitus, and prior delivery of a newborn who weighed 4000 g or more. Subjects were recruited from the resident clinic and not referred for any particular indications. Each woman had an accurate determination of gestational age based on recollection of regular menstrual periods, and a sonographic examination by 20 weeks. The exclusion criteria were multiple gestations and known anomalies.
Once a clinical estimate was recorded, biparietal diameter (BPD), head circumference (HC), AC, and FL were measured. Cheek-to-cheek diameter was measured in the coronal view of the fetal face at the level of the nostril and lips.8 Upper arm soft tissue thickness was taken by identifying the upper arm in a longitudinal view, rotating the transducer 90°, and moving it cephalad until the head of the humerus was seen. Measurement was from below the humeral head from the outer edge of the bone to the skin surface.10 Thigh subcutaneous tissue was ascertained at the level of fetal femoral diaphysis, as described by Santolaya-Forgas et al.9 Obstetrics-gynecology residents or maternal-fetal medicine subspecialists made clinical predictions of fetal weight, and two sonographers, each with more than 6 years of training, did the ultrasonographic examinations. Physicians who predicted birth weights reviewed the obstetric histories and used Leopold maneuvers to determine weight on day of examination.
At the completion of the study, clinical predictions were converted to grams. Fetal weight was predicted from biometric measurements with the formula proposed by Combs et al,6 and it was not calculated until all sonographic measurements were made. Fetal weight also was estimated by the regression equation using AC, FL, and upper arm soft tissue thickness.10 Seven ROC curves were constructed, one each for the clinically and sonographically estimated birth weights and the five new techniques, to differentiate between abnormal (at least 4000 g) and normal (less than 4000 g) birth weights. Two-by-two contingency tables were created to calculate true- and false-positive rates for each diagnostic test to differentiate between macrosomic and nonmacrosomic newborns. To develop the ROC curves, we plotted the true-positive rate against the false-positive rate of each estimate for predicting the adverse outcome. The areas under the curves and the standard error (SE) were estimated by a point-to-point trapezoidal method of integration. The critical ratio z test for a paired statistical design was used to determine the statistical significance of differences between the areas under the curves.11 If the area under the ROC curve was not significantly different from the area under the nondiagnostic line (y = x, or when the false-positive and true-positive rates were identical), the diagnostic test was considered poor. P < .05 was considered statistically significant. We used logistic regression to model each birth weight estimate's ability to predict birth weight of 4000 g or more. Statistical analysis was done with STATA (Stata Corp., College Station, TX). The demographic and sonographic data are presented as mean (± standard deviation [SD]).
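The ROC procedure described above can be illustrated with a minimal Python sketch. It is not the authors' STATA analysis. The scores and outcomes are hypothetical, the function names are invented for illustration, and the SE uses the Hanley and McNeil formula as an assumption, because the text does not state how the SE was obtained. The z statistic shown compares one area with the nondiagnostic line (area 0.5); the paired comparison between two curves also requires the correlation between the tests and is not shown.

```python
import math

def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs from (0, 0) to (1, 1).

    labels: 1 = birth weight of at least 4000 g, 0 = less than 4000 g.
    A fetus is test-positive when its score is >= the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        # Cells of the two-by-two table at this cutoff
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points):
    """Point-to-point trapezoidal integration of the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def hanley_mcneil_se(auc, n_pos, n_neg):
    """SE of the area (Hanley and McNeil formula; assumed, not stated in the text)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)

def z_vs_chance(auc, se):
    """z statistic and two-sided P value against the nondiagnostic line (area 0.5)."""
    z = (auc - 0.5) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical estimated weights (g) and outcomes, for illustration only
scores = [4300, 4100, 3900, 3700, 3500, 3300]
labels = [1, 1, 0, 1, 0, 0]

pts = roc_points(scores, labels)
auc = trapezoid_auc(pts)
se = hanley_mcneil_se(auc, sum(labels), len(labels) - sum(labels))
z, p = z_vs_chance(auc, se)
print(f"area = {auc:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

With only six hypothetical observations the SE is large. The example shows the mechanics and is not a realistic result.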
Results
The mean maternal age (± SD) of our 100 subjects was 28.7 ± 6.0 years. Sixty-seven were black, 28 were white, and five were Hispanic. Forty-eight percent were nulliparas. At recruitment, the mean gestational age was 37.4 ± 1.6 weeks, and at delivery, 39.4 ± 1.3 weeks. The median (range) time from assessment to delivery was 10 days (0–31 days). Twenty-three had gestational or pregestational diabetes. The mean birth weight of newborns was 3791 ± 510 (range 2775–5420) g, with 28% weighing 4000 g or more.
Comparison of newborns with birth weights of at least 4000 g with those less than 4000 g indicated similar gestational ages at enrollment (37.3 ± 1.6 versus 37.4 ± 1.7 weeks; P = .789), and delivery (39.8 ± 1.2 versus 39.3 ± 1.3 weeks; P = .081). The mean BPD (9.3 ± 0.5 versus 9.0 ± 0.5 cm; P = .008), HC (33.7 ± 1.8 versus 32.7 ± 1.3 cm; P = .003), AC (36.0 ± 2.9 versus 33.7 ± 1.8 cm; P < .001), and FL (7.2 ± 0.3 versus 7.0 ± 0.4 cm; P = .019) were significantly different, as were the clinical (3849 ± 412 versus 3570 ± 369 g; P = .001) and sonographic mean EFWs (3506 ± 585 versus 3077 ± 348 g; P < .001).
Of the five new indices studied, three were significantly higher in infants who weighed at least 4000 g: cheek-to-cheek diameter (7.1 ± 0.8 versus 6.6 ± 0.7 mm; P = .011), estimate of fetal weight derived from AC, FL, and measurement of upper arm soft tissue (3962 ± 526 versus 3612 ± 359 g; P < .001), and thigh subcutaneous tissue/FL ratio (0.09 ± 0.02 versus 0.08 ± 0.02; P = .027). The upper arm soft tissue thickness (15.9 ± 1.5 versus 15.8 ± 1.9 mm; P = .803) and thigh subcutaneous tissue (6.2 ± 1.6 versus 5.8 ± 1.4 mm; P = .221) did not differ significantly between newborns who weighed at least 4000 g and those who weighed less.
Seven ROC curves were created to compare newborns who weighed less than 4000 g with those who weighed at least 4000 g (Table 1). The highest area under the ROC curve was attained by the sonographic estimate of birth weight using HC, AC, and FL, followed by the clinical EFW; the lowest area was attained by measurement of subcutaneous tissue surrounding the humerus. The areas under four of the ROC curves (cheek-to-cheek diameter, estimate of birth weight based on the Combs et al6 formula, clinical examination, and estimate of birth weight incorporating soft tissue surrounding the humerus) were significantly higher than the area under the nondiagnostic line (Table 1). The areas under the three remaining ROC curves (upper arm subcutaneous tissue, thigh subcutaneous tissue, and thigh subcutaneous tissue/FL ratio) were similar to the area under the nondiagnostic line. Hence, these three techniques were poor diagnostic tests for identifying macrosomic fetuses (Table 1).
Clinical estimate is the simplest technique for predicting birth weight, so the area under its curve was compared with the other ROC curves. The ROC curve derived from clinical examination had a significantly higher area than one obtained from measurement of upper arm subcutaneous tissue (P = .031); otherwise, the areas under the ROC curves were similar (Table 1). Our sample of 100 women provided insufficient power to detect significant differences between areas under some of the ROC curves. On the basis of Hanley and McNeil's12 calculations, we needed 110 abnormal (at least 4000 g) and normal (less than 4000 g) birth weight infants combined, to have 80% power to detect a difference between ROC curves with areas of 0.58 (thigh subcutaneous tissue) and 0.72 (clinical estimate). To detect a difference between cheek-to-cheek diameter (0.67) and clinical estimate, we needed 350 abnormal (4000 g or more) and normal (less than 4000 g) birth weight infants combined for 80% power.
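The kind of sample size reasoning described above can be sketched in Python with the large-sample approximation usually attributed to Hanley and McNeil.12 The sketch assumes two independent groups of equal size, a one-sided alpha of .05, and no correlation between tests. The authors' actual inputs (one- or two-sided test, pairing, ratio of abnormal to normal newborns) are not stated, so it is not expected to reproduce the figures of 110 and 350 given above. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def variance_term(theta):
    """Large-sample variance term V = Q1 + Q2 - 2*theta^2 for an ROC area theta."""
    q1 = theta / (2.0 - theta)
    q2 = 2.0 * theta ** 2 / (1.0 + theta)
    return q1 + q2 - 2.0 * theta ** 2

def n_per_group(theta1, theta2, alpha=0.05, power=0.80):
    """Approximate number of abnormal and of normal newborns needed (each).

    Assumes unpaired curves, equal group sizes, and a one-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    v1 = variance_term(theta1)
    v2 = variance_term(theta2)
    delta = abs(theta2 - theta1)
    n = (z_alpha * math.sqrt(2.0 * v1) + z_beta * math.sqrt(v1 + v2)) ** 2 / delta ** 2
    return math.ceil(n)

# Areas taken from the text: thigh subcutaneous tissue (0.58),
# cheek-to-cheek diameter (0.67), clinical estimate (0.72)
n_wide = n_per_group(0.58, 0.72)
n_narrow = n_per_group(0.67, 0.72)
print(n_wide, n_narrow)
```

The qualitative point holds regardless of the assumptions: the smaller the difference between two areas, the larger the sample needed to detect it.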
Discussion
Identifying newborns who weigh 4000 g or more is important because birth of macrosomic fetuses is associated with adverse peripartum outcomes.2 Intuitively, it seems that because newborns with excessive growth have increased adipose tissue, sonographic measurement of subcutaneous tissue would be able to detect macrosomic fetuses.
Abramowicz et al8 reported that the mean cheek-to-cheek diameter differed significantly between fetuses with excessive growth and those without growth abnormality. Santolaya-Forgas et al9 noted that the mean thigh subcutaneous tissue and the thigh subcutaneous tissue/FL ratio were significantly higher among those with birth weights above the 90th percentile than among those with weights between the 10th and 90th percentiles. Sood et al10 found that the mean upper arm soft tissue thickness was higher among those who weighed at least 4000 g than among those who weighed 3999 g or less. They also proposed a new regression equation, using soft tissue surrounding the humerus, to predict birth weight and identify macrosomic fetuses.10
Those three reports did not use ROC curves to establish whether their diagnostic tests could differentiate between normal and abnormal conditions. If the diagnostic test is a continuous variable and the outcome categoric, then an ROC curve permits selection of a diagnostic threshold that can be used to identify adverse outcomes. By calculating the area under the ROC curve and comparing it with the area of the nondiagnostic line, one can determine if the test is better at predicting the outcome than chance alone.11
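Threshold selection from an ROC curve can be sketched in Python. The selection rule used here, the cutoff that maximizes the Youden index (sensitivity + specificity - 1), is an assumption chosen for illustration and is not a method named in this report. The estimated weights and outcomes are hypothetical, as is the function name.

```python
def best_cutoff_youden(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    labels: 1 = birth weight of at least 4000 g, 0 = less than 4000 g.
    A fetus is test-positive when its score is >= the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    best_j = -1.0
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        sensitivity = tp / n_pos
        specificity = 1.0 - fp / n_neg
        j = sensitivity + specificity - 1.0
        if j > best_j:  # on ties, the higher cutoff is kept
            best_j = j
            best = (cutoff, sensitivity, specificity)
    return best

# Hypothetical estimated weights (g) and outcomes, for illustration only
scores = [4400, 4200, 4050, 3900, 3800, 3650, 3500, 3300]
labels = [1, 1, 1, 0, 0, 1, 0, 0]

cutoff, sens, spec = best_cutoff_youden(scores, labels)
print(f"cutoff = {cutoff} g, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Other rules, such as fixing a minimum acceptable sensitivity, select a different point on the same curve. The choice depends on the relative costs of false-positive and false-negative results.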
The designs of the current study and previous reports8–10 of measurements of soft tissue differ; thus, it would seem our results are not comparable. Investigators who proposed cheek-to-cheek diameter8 or thigh subcutaneous tissue/FL ratio9 intended those measurements to identify fetuses at or above the 90th percentile for gestational age; therefore, the failure of those techniques to identify newborns with birth weights of 4000 g or more might not be a surprise. Morbidity with excessive fetal growth occurs when the birth weight is at least 4000 g.13,14
Another explanation of why we could not confirm that measurements of soft tissue detect birth weights of at least 4000 g is the time between assessment of fetuses and delivery. In the current report, the mean time was 14 days, and it was 24 hours or 1 week among studies that evaluated femoral9 or humeral10 soft tissue, respectively. Investigators who proposed cheek-to-cheek diameter8 did not mention intervals. It is possible that during the 2 weeks, growth of soft tissue is different from that of the viscera and skeleton, but the mean lag time was similar among those who weighed at least 4000 g and those who did not. If measurements of the soft tissue were so sensitive that they needed to be repeated daily or weekly, their use would be limited in routine practice. Combining diabetics and nondiabetics in our analysis might be the reason we could not confirm diagnostic utility of soft tissue measurements. The prediction of birth weight derived from clinical examination or sonographic measurements of biometric indices has been useful for detecting macrosomic newborns regardless of the diabetic status,15,16 so it is reasonable to expect the same from measurements of soft tissue.
One of the limitations of this study was that whereas residents and maternal-fetal medicine specialists did clinical predictions of fetal weight, certified sonographers did sonographic examinations. The reason for that design was that house officers and staff have similar accuracy predicting birth weights,17 and registered sonographers have optimal training to consistently measure the soft tissue. The performance of the five new techniques might have been worse if all the examiners in the study did soft tissue measurements. The other limitation was that the study did not provide an explanation of why clinical and sonographic predictions have similar diagnostic accuracy, but measurements of most of the soft tissues do not. It should be noted that investigators7,18 using ROC curves have reported that clinical examination or biometric measurements can identify macrosomic newborns. The potential explanations for the poor performance of some of the soft tissues include a different population, possible ethnic variations in distribution of soft tissues, irreproducible measurements of soft tissue, or inability of those proposed models to identify newborns with weights greater than 4000 g.
References
1. Turner MJ, Rasmussen MJ, Turner JE, Boylan PC, MacDonald D, Stronge JM. The influence of birth weight on labor in nulliparas. Obstet Gynecol 1990;76:159–63.
2. American College of Obstetricians and Gynecologists. Fetal macrosomia. ACOG technical bulletin no. 159. Washington DC: American College of Obstetricians and Gynecologists, 1991.
3. Chauhan SP, Hendrix NW, Magann EF, Morrison JC, Kenney SP, Devoe LD. Limitations of clinical and sonographic estimates of birth weight: Experience with 1034 parturients. Obstet Gynecol 1998;91:72–7.
4. Sherman DJ, Arieli S, Tovbin J, Siegel G, Caspi E, Bukovsky I. A comparison of clinical and ultrasonic estimation of fetal weight. Obstet Gynecol 1998;91:212–7.
5. Hadlock FP, Harrist RB, Sharman RS, Deter RL, Park SK. Estimation of fetal weight with the use of head, body and femur measurements—A prospective study. Am J Obstet Gynecol 1985;151:333–7.
6. Combs CA, Jaekle RK, Rosenn B, Pope M, Miodovnik M, Siddiqi TA. Sonographic estimation of fetal weight based on a model of fetal volume. Obstet Gynecol 1993;82:365–70.
7. Chauhan SP, Cowan BD, Magann EF, Bradford TH, Roberts WE, Morrison JC. Intrapartum detection of a macrosomic fetus: Clinical versus eight sonographic models. Aust N Z J Obstet Gynaecol 1995;35:266–70.
8. Abramowicz JS, Sherer DM, Bar-Tov E, Woods JR Jr. The cheek-to-cheek diameter in the ultrasonographic assessment of fetal growth. Am J Obstet Gynecol 1991;165:846–52.
9. Santolaya-Forgas J, Meyer WJ, Gauthier DW, Kahn D. Intrapartum fetal subcutaneous tissue/femur length ratio: An ultrasonographic clue to fetal macrosomia. Am J Obstet Gynecol 1994;171:1072–5.
10. Sood AK, Yancey M, Richards D. Prediction of fetal macrosomia using humeral soft tissue thickness. Obstet Gynecol 1995;85:937–40.
11. Beck JR, Shultz EK. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med 1986;110:13–20.
12. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.
13. Ecker JL, Greenberg JA, Norwitz ER, Nadel AS, Repke JT. Birth weight as a predictor of brachial plexus injury. Obstet Gynecol 1997;89:643–7.
14. Rouse DJ, Owen J, Goldenberg RL, Cliver SP. The effectiveness and costs of elective cesarean delivery for fetal macrosomia diagnosed by ultrasound. JAMA 1996;276:1480–6.
15. McLaren RA, Puckett JL, Chauhan SP. Estimators of birth weight in pregnant women requiring insulin: A comparison of seven sonographic models. Obstet Gynecol 1995;85:565–9.
16. Hendrix NW, Morrison JC, McLaren RA, Magann EF, Chauhan SP. Clinical and sonographic estimates of birth weight among diabetic parturients. J Matern Fetal Invest 1998;8:17–20.
17. Field NT, Piper JM, Langer O. The effect of maternal obesity on the accuracy of fetal weight estimation. Obstet Gynecol 1995;86:102–7.
18. O'Reilly-Green CP, Divon MY. Receiver operating characteristic curves of sonographic estimated fetal weight for prediction of macrosomia in prolonged pregnancies. Ultrasound Obstet Gynecol 1997;9:403–8.