Birthweight is often used as a proxy for fetal weight. Problems with this practice have recently been brought to light. We explore whether data available at birth can be used to predict estimated fetal weight using linear and quantile regression, random forests, Bayesian additive regression trees, and generalized boosted models. We train and validate each approach using 18,517 pregnancies (31,948 ultrasound visits) from the Magee-Womens Obstetric Maternal and Infant data and 240 pregnancies in a separate dataset of high-risk pregnancies. We also quantify the relation between smoking and small-for-gestational-age birth, defined as a birthweight below the 10th percentile of a population birthweight standard, an estimated fetal weight standard, or a predicted fetal weight standard. Using mean squared error and median absolute deviation criteria, quantile regression performed best among the regression-based approaches, but generalized boosted models performed best overall. Using the birthweight standard, smoking during pregnancy increased the risk of small-for-gestational-age birth 3.84-fold (95% CI: 2.70, 5.47). This risk ratio dropped to 1.65 (95% CI: 1.50, 1.81) when using the correct fetal weight standard, which was no different from the ratios obtained with the machine learning–based predicted standards, but higher than those obtained with the regression-based predicted standards. Machine learning algorithms show promise in recovering missing fetal weight information. See video abstract at http://links.lww.com/EDE/B314.
From the (a) Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA; (b) Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, QC, Canada; and (c) Magee Women’s Research Institute, University of Pittsburgh, Pittsburgh, PA.
Submitted August 2, 2016; accepted November 14, 2017.
Supported, in part, by the University of Pittsburgh Center for Research Computing through the computing resources provided and the assistance of Dr. Kim Wong.
Code/Data Availability: All software code needed to reproduce these analyses is available at https://github.com/ainaimi.
The authors report no conflicts of interest.
Supplemental digital content is available through direct URL citations in the HTML and PDF version of this article (www.epidem.com).
Correspondence: Ashley I. Naimi, Department of Epidemiology, University of Pittsburgh, 130 DeSoto Street, 503 Parran Hall, Pittsburgh, PA 15261. E-mail: email@example.com.