Interest in the associations between early-life factors and adult health outcomes has rapidly expanded over the past decade. Additionally, exposures during pregnancy can affect the health of the mother later in life.1 Studies of prenatal exposures and events at childbirth often rely on maternal recall decades later.2–6 The literature is mixed regarding how well women are able to recall behaviors during pregnancy.7–9 Studies have found good agreement between short-term maternal recall of routinely collected birth information (child’s birthweight and duration of pregnancy) and medical records,10–12 but studies of the accuracy of long-term recall are limited.13–15
The present study evaluated the long-term recall of maternal prepregnancy weight, several early pregnancy exposures, duration of pregnancy, and child’s birthweight in an actively engaged, healthy cohort of women originally enrolled during the early 1980s. We describe the accuracy of reporting at a follow-up interview approximately 30 years after the study pregnancy. We also assessed whether subjective certainty about a given answer at follow-up would predict accuracy of recall.
METHODS
The North Carolina Early Pregnancy Study (“the baseline study” from here forward) was a prospective cohort study conducted in 1982–1986 to estimate the incidence of early pregnancy loss (eAppendix 1; https://links.lww.com/EDE/B192).16 Participants were recontacted in 2010–2011 as part of the North Carolina Early Pregnancy Study Follow-up (“follow-up” from here forward); 173 of 210 survivors completed a follow-up questionnaire. Women reported previously collected information about their baseline study pregnancy. For recalled early pregnancy behaviors, participants were asked to indicate how sure they were of their responses using a 4-level scale (1-quite unsure, 2-unsure, 3-sure, and 4-quite sure).
Prepregnancy weight was reported in pounds in the baseline study and at follow-up. Body mass index (BMI) was calculated as weight in kilograms divided by height in meters squared. Using World Health Organization (WHO) BMI categories, the majority of participants had a normal BMI (18.5–24.9).17 To have more balanced groups, women’s BMI was categorized as lower normal (<21.0), higher normal (21.0–24.9), and overweight/obese (≥25.0).
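The BMI calculation and the study-specific grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, height is assumed to be available in meters, and values between 24.9 and 25.0 are assumed to fall in the higher normal group.

```python
LB_PER_KG = 2.20462  # pounds per kilogram


def bmi_from_pounds(weight_lb: float, height_m: float) -> float:
    """BMI = weight in kilograms divided by height in meters squared."""
    weight_kg = weight_lb / LB_PER_KG
    return weight_kg / height_m ** 2


def study_bmi_category(bmi: float) -> str:
    """Three groups used in this study; boundary handling is an assumption."""
    if bmi < 21.0:
        return "lower normal"
    if bmi < 25.0:
        return "higher normal"
    return "overweight/obese"


# Illustrative values only (not study data).
example_bmi = bmi_from_pounds(130, 1.65)
example_group = study_bmi_category(example_bmi)
```

Applying the same two functions to the baseline and the recalled weight shows whether a recall error is large enough to move a woman across a category boundary.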
In the baseline study, women prospectively recorded in their daily diary the presence or absence of symptoms they thought might be pregnancy related (eAppendix 2; https://links.lww.com/EDE/B192). At follow-up, women were asked about the timing of the onset of pregnancy-related symptoms in the following categories: ≤6, 7–8, or >8 weeks from their last menstrual period.
Both in the baseline study and at follow-up, women reported their caffeinated and alcoholic beverage intake during the first 8 weeks of pregnancy as number of servings per day, week, or month, which we converted to servings per month for analysis. Because high caffeine intake is a potential concern during pregnancy,18 we evaluated accuracy of recalled total caffeine as calculated from coffee, tea, and soft drinks, the primary sources assessed in the baseline study. Overall caffeine intake was calculated by multiplying the number of monthly servings of each beverage type by their typical caffeine content in the early 1980s.19 A serving of alcohol was defined as 12 ounces of beer, 4 ounces of wine, or 1.5 ounces of hard liquor, and considered equivalent. Monthly servings of all alcoholic beverages were summed for overall alcohol intake.
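The conversion to monthly servings and the two summary measures can be sketched as below. The conversion factors (a 30-day month) are an assumption, since the text does not state them, and the caffeine contents in the example are placeholders rather than the values from reference 19. All names are hypothetical.

```python
# Assumed conversion factors: a 30-day month (not stated in the text).
SERVINGS_PER_MONTH_FACTOR = {"day": 30.0, "week": 30.0 / 7.0, "month": 1.0}


def servings_per_month(count: float, unit: str) -> float:
    """Convert servings per day, week, or month to servings per month."""
    return count * SERVINGS_PER_MONTH_FACTOR[unit]


def total_caffeine_mg_per_month(monthly_servings: dict, mg_per_serving: dict) -> float:
    """Sum over beverage types of monthly servings times caffeine per serving."""
    return sum(n * mg_per_serving[bev] for bev, n in monthly_servings.items())


def total_alcohol_servings_per_month(monthly_servings: dict) -> float:
    """Beer, wine, and liquor servings are treated as equivalent and summed."""
    return sum(monthly_servings.values())


# Placeholder caffeine contents in mg per serving (NOT the values in reference 19).
placeholder_mg = {"coffee": 100.0, "tea": 50.0, "soft drink": 40.0}
monthly = {
    "coffee": servings_per_month(2, "day"),
    "tea": servings_per_month(1, "day"),
    "soft drink": servings_per_month(0, "month"),
}
caffeine_mg = total_caffeine_mg_per_month(monthly, placeholder_mg)
```

Passing the caffeine table in as an argument keeps the beverage-specific contents, which came from an external source in the study, separate from the arithmetic.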
During the baseline study, duration of pregnancy was calculated as the number of days from last menstrual period (based on bleeding recorded in the daily diary) to the birth date of the child. At follow-up, participants reported duration of pregnancy (in weeks) for their baseline study pregnancy; women with live births also reported how many days before or after their due date they delivered. Birthweight was reported as pounds and ounces.
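The two ways of deriving duration of pregnancy can be illustrated with a short sketch. The baseline calculation follows the description above. The conversion of the follow-up answer assumes a due date 280 days after the last menstrual period, which is a common convention but is not stated in the text. Function names and dates are hypothetical.

```python
from datetime import date

ASSUMED_DAYS_TO_DUE_DATE = 280  # assumption: due date is 40 weeks after LMP


def baseline_duration_days(last_menstrual_period: date, birth_date: date) -> int:
    """Days from last menstrual period to the child's birth date."""
    return (birth_date - last_menstrual_period).days


def recalled_duration_days(days_relative_to_due_date: int) -> int:
    """Negative values mean delivery before the due date, positive after."""
    return ASSUMED_DAYS_TO_DUE_DATE + days_relative_to_due_date


# Illustrative dates only (not study data).
baseline = baseline_duration_days(date(1983, 1, 1), date(1983, 10, 8))
recalled = recalled_duration_days(-5)
difference = baseline - recalled  # baseline study value minus follow-up value
```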
Measures collected during the baseline study were considered the gold standard. Sensitivity was calculated as the proportion of women who reported being exposed at follow-up among those who were “truly exposed” based on their baseline study data. Specificity was calculated as the proportion of women who reported no exposure at follow-up among those who were “truly not exposed.” Accuracy was calculated as the proportion of women who correctly reported being exposed or not exposed at follow-up. The proportion with accurate recall who reported they were quite sure or sure of their responses at follow-up was compared with the proportion quite sure or sure with inaccurate recall. For women with inaccurate responses at follow-up, we calculated the proportion over- and under-reporting to characterize the direction of misclassification at recall. For reported continuous measures including prepregnancy weight, duration of pregnancy, and child’s birthweight, we calculated the difference in reported weight or gestational age (baseline study value minus follow-up value).
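The validity measures defined above can be computed from paired yes/no reports as in the following sketch. It is an illustration under stated assumptions: the function name is hypothetical, the baseline report is treated as the truth, and the example data are invented, not taken from the study.

```python
def recall_validity(baseline: list, follow_up: list) -> dict:
    """Compare follow-up reports (True = exposed) against baseline reports."""
    pairs = list(zip(baseline, follow_up))
    tp = sum(1 for b, f in pairs if b and f)          # exposed, recalled exposed
    fn = sum(1 for b, f in pairs if b and not f)      # exposed, recalled unexposed
    tn = sum(1 for b, f in pairs if not b and not f)  # unexposed, recalled unexposed
    fp = sum(1 for b, f in pairs if not b and f)      # unexposed, recalled exposed
    wrong = fp + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "accuracy": (tp + tn) / len(pairs) if pairs else None,
        # Direction of misclassification among inaccurate responses only.
        "over_reporting": fp / wrong if wrong else None,
        "under_reporting": fn / wrong if wrong else None,
    }


# Invented example: 4 women exposed at baseline, 6 not exposed.
baseline = [True, True, True, True, False, False, False, False, False, False]
follow_up = [True, False, False, False, False, False, False, False, False, True]
result = recall_validity(baseline, follow_up)
# sensitivity 0.25, specificity 5/6, accuracy 0.6
```

In this invented example accuracy is 0.6 even though only one of four exposed women recalled her exposure, which shows why accuracy alone can mask poor sensitivity when the exposure is uncommon.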
RESULTS
There were 120 women who completed both the 8-week pregnancy and follow-up questionnaires (Figure 1). The analysis was limited to the 109 women whose pregnancies resulted in a live birth (104 singletons, five sets of twins). The majority of women were white (95%) and well-educated (78% had a college degree or greater). The median age at follow-up was 55 years (eTable 1; https://links.lww.com/EDE/B192). Half of the women were nulliparous at the time of baseline enrollment.
FIGURE 1: Flowchart of participants from the baseline Early Pregnancy Study and Early Pregnancy Study Follow-up who were included in the analysis.
Prepregnancy Weight
Of the 106 women who were able to provide their prepregnancy weight at follow-up, 89% reported a weight within 10 pounds of what they had reported at enrollment. Women reporting a different prepregnancy weight (n = 90) were more likely to overestimate their weight (68%) than underestimate (32%). When we categorized women by the three BMI categories created for this study, 81% of participants recalled a prepregnancy weight that placed them in the same BMI category as their originally reported weight. Stratifying by parity at the time of enrollment, more nulliparous women recalled a weight in the correct BMI category (nulliparous 88%, parous 74%).
Onset of Symptoms of Pregnancy
Most women (78%) had prospectively reported the onset of symptoms of pregnancy within the first 6 weeks from their last menstrual period, whereas on the follow-up questionnaire women tended to report a later onset. Using the response choices provided for women at follow-up (≤6, 7–8, or >8 weeks from last menstrual period), 52% recalled the timing correctly, 39% chose a longer time, and 9% chose a shorter time to onset of symptoms, compared with their prospective reports (eTable 2; https://links.lww.com/EDE/B192).
Early Pregnancy Exposures
The Table and eTable 3 (https://links.lww.com/EDE/B192) provide a summary of how well women were able to recall behaviors (yes/no) during early pregnancy. Brewed coffee and tea accounted for 75% of the caffeine consumption. Both were recalled with at least 70% accuracy, and with sensitivity and specificity close to 0.70 (Table; eTable 4; https://links.lww.com/EDE/B192). Women tended to over-report overall caffeine consumption at follow-up (eFigure 1; https://links.lww.com/EDE/B192).
TABLE: Report of Exposures During Early Pregnancy at 8 Weeks of Pregnancy and at Follow-up from the North Carolina Early Pregnancy Study (Total n = 109)
During the baseline study, 39% of women had reported drinking alcoholic beverages in early pregnancy (28% reported drinking wine). The accuracy of recall for drinking wine was 67% (sensitivity: 0.30, specificity: 0.81). All of the 19 women who reported infrequent wine drinking during early pregnancy (1–2 servings/month) reported inaccurately at follow-up (four reported drinking a greater amount and 15 reported abstaining; eTable 5; https://links.lww.com/EDE/B192). Recall of beer drinking at follow-up was 80% accurate (sensitivity: 0.25, specificity: 0.92). Consumption of hard liquor was uncommon. Women tended to under-report overall alcohol consumption at follow-up (eFigure 2; https://links.lww.com/EDE/B192). A woman’s confidence in her report of caffeinated and alcoholic beverages at follow-up was not associated with greater accuracy (eTable 6; https://links.lww.com/EDE/B192).
Women over-reported their use of vitamins at follow-up and under-reported their use of antibiotics. The accuracy of recall for vitamin use was 65% (sensitivity: 0.82, specificity: 0.26). The sensitivity for antibiotic use recall was 0.14; only one of the eight women who had reported this exposure during pregnancy reported antibiotic use at the follow-up interview. All women who reported no use of antibiotics at 8 weeks correctly reported their nonuse in the follow-up questionnaire (specificity 1.00).
Birthweight
Among women at follow-up who had given birth to singletons, 94% reported their child’s birthweight within ½ pound (Figure 2A). All five women who gave birth to twins reported the same two birthweights in the baseline study and at follow-up.
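The half-pound comparison can be expressed in ounces, as in this minimal sketch. The function names and example weights are hypothetical, and the sign convention (baseline minus follow-up) follows the Methods.

```python
OUNCES_PER_POUND = 16


def to_ounces(pounds: int, ounces: int) -> int:
    """Convert a birthweight reported as pounds and ounces to total ounces."""
    return pounds * OUNCES_PER_POUND + ounces


def birthweight_difference_oz(baseline: tuple, follow_up: tuple) -> int:
    """Baseline study report minus follow-up report, in ounces."""
    return to_ounces(*baseline) - to_ounces(*follow_up)


def within_half_pound(difference_oz: int) -> bool:
    """True when the two reports differ by no more than 8 ounces."""
    return abs(difference_oz) <= OUNCES_PER_POUND // 2


# Illustrative values only: 7 lb 8 oz at baseline, 7 lb 2 oz at follow-up.
diff = birthweight_difference_oz((7, 8), (7, 2))
agrees = within_half_pound(diff)
```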
FIGURE 2: A, Differences in report of child’s birthweight (in ounces; baseline study report minus follow-up report). B, Differences in report of duration of pregnancy (in weeks, based on how many days before or after their due date women reported their pregnancy lasted at follow-up; baseline study report minus follow-up report).
Duration of Pregnancy
Based on women’s follow-up reports of how many days before or after their due date they gave birth, only 16% of women correctly reported the exact duration of pregnancy in days. However, 86% of women reported a duration within ±7 days of the baseline value (14-day window) (Figure 2B). Based on women’s follow-up reports of how many weeks their pregnancy lasted, 43% of women accurately reported the number of weeks (7-day window) and 86% reported accurately within ±1 week (21-day window).
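The agreement windows reported above amount to counting how many baseline-minus-follow-up differences fall within a tolerance. A minimal sketch, with a hypothetical function name and invented differences:

```python
def proportion_within(differences_days: list, tolerance_days: int) -> float:
    """Share of differences whose absolute value is at most the tolerance."""
    if not differences_days:
        raise ValueError("at least one difference is required")
    hits = sum(1 for d in differences_days if abs(d) <= tolerance_days)
    return hits / len(differences_days)


# Invented differences in days (baseline minus follow-up), not study data.
example_differences = [0, 3, -7, 10, -2]
exact = proportion_within(example_differences, 0)       # 0.2
within_week = proportion_within(example_differences, 7)  # 0.8
```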
DISCUSSION
Our data suggest that long-term recall of early pregnancy exposures is difficult, and self-report of this type of information close to 30 years after pregnancy results in high levels of misclassification (median sensitivity was 0.30, range: 0.07–0.82 and median specificity was 0.81, range: 0.26–1.00). Some of the inaccuracy in reporting pregnancy behaviors at follow-up appeared to be influenced by social desirability (under-reporting original alcohol intake and antibiotic use, but over-reporting vitamin use). Recall was better for those never or frequently consuming caffeinated or alcoholic beverages, suggesting these behavior patterns are more memorable, or perhaps more consistent over time. Furthermore, while women in our study were not good at recalling how much caffeine they had consumed, the majority reported consuming less than the recommended limit of 200 mg of caffeine per day at both time points.18 Women’s reporting at follow-up may have been influenced by behaviors during subsequent pregnancies. However, when we calculated sensitivities and specificities for each of the early pregnancy behaviors stratified by whether women had a child after the baseline study, we did not observe a pattern of poorer recall among women who had additional children (n = 66) compared with women who did not (n = 42). In contrast to the relatively poor ability to recall specific common exposures during pregnancy, most women were able to recall their prepregnancy weight, duration of pregnancy, and child’s birthweight with reasonable accuracy.
Two previous studies of long-term maternal recall (4–12 years) of early pregnancy behaviors reported moderate accuracy. These studies differed from ours in that medical records and registry data were used as the gold standard for comparison, which were incomplete for early pregnancy exposures.8,9 In some studies, greater quantities of alcohol have been reported retrospectively compared with prospectively.20,21 This difference may be a result of study populations that differed from ours on demographic and health-related factors.
As in another study,22 we found women accurately recalled their prepregnancy weight. Women were also able to recall the duration of pregnancy and birthweight of their baseline study baby with a high level of accuracy, consistent with other studies of both long- and short-term recall.10,14 Recalled duration of pregnancy was more accurate when asked as days before or after due date compared with total weeks of pregnancy. At follow-up, women recalled the onset of pregnancy symptoms later than their prospective report.
The women in this study all survived to the follow-up period, were able to be recontacted, and agreed to participate in the follow-up study. Additionally, these committed volunteers were well educated and white, and all had planned their pregnancies, which may limit the generalizability of our results to more diverse populations; for example, there were too few smokers to allow meaningful analysis of smoking behavior. Our results likely represent a best-case scenario of long-term maternal recall. Although our sample was small, item nonresponse was rare on the follow-up questionnaire, unlike in previous studies of long-term maternal recall.8,9,13 As our gold standard, we had prospective measures for duration of pregnancy, onset of pregnancy symptoms, and prepregnancy weight. The baseline study questionnaire data on behaviors during early pregnancy were based on a very short recall time. This allowed us to study recall of early pregnancy behaviors, whereas most studies on maternal recall of pregnancy-related events have focused on labor and birth outcomes.
The sensitivities and specificities for early pregnancy behaviors presented in this article can be used to inform investigators of the potential level of misclassification of various early pregnancy behaviors in studies where this information is collected retrospectively years later and may aid in decision making when considering a bias analysis. The lack of concordance between perceived confidence in recall and its accuracy suggests no benefit from including such questions.
ACKNOWLEDGMENTS
The authors thank Drs. Matthew Longnecker and Srishti Shrestha for their feedback on an earlier draft of this manuscript.
REFERENCES
1. Xu X, Dailey AB, Peoples-Sheps M, Talbott EO, Li N, Roth J. Birth weight as a risk factor for breast cancer: a meta-analysis of 18 epidemiological studies. J Womens Health (Larchmt). 2009;18:1169–1178.
2. Bailey HD, Armstrong BK, de Klerk NH, et al; Aus-ALL Consortium. Exposure to professional pest control treatments and the risk of childhood acute lymphoblastic leukemia. Int J Cancer. 2011;129:1678–1688.
3. Dougan MM, Willett WC, Michels KB. Prenatal vitamin intake during pregnancy and offspring obesity. Int J Obes (Lond). 2015;39:69–74.
4. Linnet KM, Dalsgaard S, Obel C, et al. Maternal lifestyle factors in pregnancy risk of attention deficit hyperactivity disorder and associated behaviors: review of the current evidence. Am J Psychiatry. 2003;160:1028–1040.
5. Chyi LJ, Lee HC, Hintz SR, Gould JB, Sutcliffe TL. School outcomes of late preterm infants: special needs and challenges for infants born at 32 to 36 weeks gestation. J Pediatr. 2008;153:25–31.
6. Uiterwaal CS, Anthony S, Launer LJ, et al. Birth weight, growth, and blood pressure: an annual follow-up study of children aged 5 through 21 years. Hypertension. 1997;30(2 Pt 1):267–271.
7. Feldman Y, Koren G, Mattice K, Shear H, Pellegrini E, MacLeod SM. Determinants of recall and recall bias in studying drug and chemical exposure in pregnancy. Teratology. 1989;40:37–45.
8. Rice F, Lewis A, Harold G, et al. Agreement between maternal report and antenatal records for a range of pre and peri-natal factors: the influence of maternal and child characteristics. Early Hum Dev. 2007;83:497–504.
9. Jaspers M, de Meer G, Verhulst FC, Ormel J, Reijneveld SA. Limited validity of parental recall on pregnancy, birth, and early childhood at child age 10 years. J Clin Epidemiol. 2010;63:185–191.
10. Troude P, L’Hélias LF, Raison-Boulley AM, et al. Perinatal factors reported by mothers: do they agree with medical records? Eur J Epidemiol. 2008;23:557–564.
11. Hakim RB, Tielsch JM, See LC. Agreement between maternal interview- and medical record-based gestational age. Am J Epidemiol. 1992;136:566–573.
12. Poulsen G, Kurinczuk JJ, Wolke D, et al. Accurate reporting of expected delivery date by mothers 9 months after birth. J Clin Epidemiol. 2011;64:1444–1450.
13. Yawn BP, Suman VJ, Jacobsen SJ. Maternal recall of distant pregnancy events. J Clin Epidemiol. 1998;51:399–405.
14. Buka SL, Goldstein JM, Spartos E, Tsuang MT. The retrospective measurement of prenatal and perinatal events: accuracy of maternal recall. Schizophr Res. 2004;71:417–426.
15. Tomeo CA, Rich-Edwards JW, Michels KB, et al. Reproducibility and validity of maternal recall of pregnancy-related events. Epidemiology. 1999;10:774–777.
16. Wilcox AJ, Weinberg CR, O’Connor JF, et al. Incidence of early loss of pregnancy. N Engl J Med. 1988;319:189–194.
17. Rasmussen KM, Yaktine AL, eds; Institute of Medicine and National Research Council Committee to Reexamine IOM Pregnancy Weight Guidelines. Weight Gain During Pregnancy: Reexamining the Guidelines. Washington, DC: National Academies Press; 2009.
18. ACOG Committee Opinion No. 462: Moderate caffeine consumption during pregnancy. Obstet Gynecol. 2010;116(2 Pt 1):467–468.
19. Wilcox A, Weinberg C, Baird D. Caffeinated beverages and decreased fertility. Lancet. 1988;2:1453–1456.
20. Hannigan JH, Chiodo LM, Sokol RJ, et al. A 14-year retrospective maternal report of alcohol consumption in pregnancy predicts pregnancy and teen outcomes. Alcohol. 2010;44:583–594.
21. Newport DJ, Brennan PA, Green P, et al. Maternal depression and medication exposure during pregnancy: comparison of maternal retrospective recall to prospective documentation. BJOG. 2008;115:681–688.
22. Krakowiak P, Walker CK, Tancredi DJ, Hertz-Picciotto I. Maternal recall versus medical records of metabolic conditions from the prenatal period: a validation study. Matern Child Health J. 2015;19:1925–1935.