Background: Perinatal epidemiology studies often collect only the calendar month in which an event occurs in early pregnancy because it is difficult for women to recall a specific day when queried later in pregnancy or postpartum. Lack of day information may result in incorrect assignment of completed gestational month because calendar months and pregnancy months are not aligned.
Methods: To examine the direction and magnitude of misclassification, we compared 3 methods for assignment of completed gestational month: 1) calendar month difference, 2) conditional month difference, and 3) imputed month midpoint. We used data from the Pregnancy, Infection, and Nutrition Study for simulations.
Results: Calendar month difference misclassified 54% of events as 1 month later in pregnancy compared with the actual completed month of gestation. Each of the other 2 methods misclassified approximately 12% of events to 1 month earlier and 12% to 1 month later.
Conclusions: Calendar month difference, a common method, has the greatest misclassification. Conditional month difference and imputed month midpoint, which require little effort to implement, are superior to calendar month difference for reducing misclassification.
From the *Departments of Epidemiology and §Biostatistics, School of Public Health, the †Carolina Population Center, and the ‡Department of Obstetrics and Gynecology, School of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina.
Submitted 23 May 2004; final version accepted 21 September 2004.
The Pregnancy, Infection, and Nutrition Study was financially supported by the National Institute of Child Health and Human Development, National Institutes of Health (grant #HD28684); the Association of Schools of Public Health/Centers for Disease Control and Prevention (cooperative agreement S455/16-17); the Centers for Disease Control and Prevention (grant #U64/CCU412273); the NC Healthy Start Foundation; the March of Dimes Birth Defects Foundation (grant #6-FY99-401, grant #6-FY01-42); the Wake Area Health Education Center in Raleigh, North Carolina; and the University of North Carolina at Chapel Hill Institute of Nutrition.
Correspondence: Juan Yang, California Birth Defect Monitoring Program, 1917 Fifth Street, Berkeley, CA 94710. E-mail: email@example.com.
Many events in early pregnancy such as vaginal bleeding, high fever, or over-the-counter medication use are of interest to perinatal epidemiologists as potential determinants of subsequent outcomes. However, the precise timing of events is difficult for women to recall when queried later in pregnancy or postpartum. Therefore, investigators often collect only the calendar month in which the event occurs to avoid the burden of seeking the specific day, which is likely to be inaccurate. Lack of day information can lead to incorrect assignment of completed gestational month as a result of the misalignment of calendar months and pregnancy months. A common method of assigning a completed gestational month to an event during pregnancy is the calendar month difference between the month of the last menstrual period (LMP) and the month of the event. We developed 2 other methods, conditional month difference and imputed month midpoint, for assignment of completed gestational month to reduce misclassification. This report compares the magnitudes and direction of misclassification of the 3 methods using data from the Pregnancy, Infection, and Nutrition (PIN) Study to conduct simulations. To our knowledge, no published study has addressed this issue.
The PIN Study is a prospective cohort study that recruited pregnant women at 24 to 29 weeks' gestation from prenatal clinics affiliated with a university medical center and a county health department in central North Carolina. Women were interviewed by telephone within 2 weeks of enrollment to collect information about risk factors for preterm birth.1 This report incorporated data from 2829 participants whose pregnancies started between January 1995 and August 2000.
Gestational age was computed based on the first day of the LMP as follows. The adjusted LMP date was the self-reported LMP date if the gestational age was within 14 days of that estimated by an ultrasound examination. Otherwise, we used the date estimated by ultrasound. The study was approved by Institutional Review Boards at University of North Carolina and WakeMed Hospitals. All participants provided informed consent.
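The adjusted LMP rule can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are hypothetical, the ultrasound estimate is represented as an ultrasound-based LMP date, and treating a difference of exactly 14 days as "within 14 days" is an assumption.

```python
from datetime import date


def adjusted_lmp(self_reported_lmp, ultrasound_lmp, tolerance_days=14):
    """Pick the LMP date used for gestational age (illustrative sketch).

    Keeps the self-reported LMP when it agrees with the ultrasound-based
    estimate to within `tolerance_days`; otherwise uses the ultrasound date.
    The inclusive comparison is an assumption, not stated in the text.
    """
    gap = abs((self_reported_lmp - ultrasound_lmp).days)
    if gap <= tolerance_days:
        return self_reported_lmp
    return ultrasound_lmp


# Self-report and ultrasound differ by 10 days: keep the self-report.
print(adjusted_lmp(date(1998, 6, 10), date(1998, 6, 20)))
# They differ by 21 days: fall back to the ultrasound estimate.
print(adjusted_lmp(date(1998, 6, 10), date(1998, 7, 1)))
```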
To evaluate the pattern and magnitude of misclassification of completed gestational month, we used the interview date as an example of an event date. Based on the adjusted LMP date (described previously) and the known interview date (the complete calendar year, month, and day), we calculated the actual completed month of gestation at the time of the interview. We then treated the calendar day as unknown and assigned the gestational month by 3 approaches (as described subsequently). We compared the assigned completed gestational month with the actual completed gestational month at the time of the interview to evaluate the direction and magnitude of the misclassification as a result of lack of information on event day.
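As a rough sketch of the reference calculation, the actual completed gestational month can be computed from two full dates. The counting rule here is an assumption inferred from the June 10 and August examples in this report: a gestational month is treated as complete when the event falls on or after the LMP's day of the month. The function name is hypothetical.

```python
from datetime import date


def actual_completed_month(lmp, event):
    """Completed gestational months between a full LMP date and event date.

    Assumes a month is completed on the same day-of-month as the LMP
    (e.g., LMP June 10 completes month 2 on August 10).
    """
    months = (event.year - lmp.year) * 12 + (event.month - lmp.month)
    if event.day < lmp.day:
        months -= 1  # the current gestational month is not yet complete
    return months


lmp = date(1998, 6, 10)
print(actual_completed_month(lmp, date(1998, 8, 9)))   # event before the 10th
print(actual_completed_month(lmp, date(1998, 8, 10)))  # event on the 10th
```

Comparing this value with the month assigned when the event day is withheld gives the direction (earlier or later) and size of the misclassification.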
We applied the methods of calendar month difference, conditional month difference, and imputed month midpoint to assign a completed gestational month to the interview event. Calendar month difference assigned the completed gestational month of the interview as the number of calendar months between the LMP month and the event month, ignoring the day of each. For example, a woman with an adjusted LMP date of June 10 who was interviewed in August (on a day considered to be unknown) was recorded as interviewed in completed gestational “month 2.”
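A minimal sketch of calendar month difference follows. The function name and the choice to pass the event as a separate year and month (because the day is unknown) are illustrative assumptions.

```python
from datetime import date


def calendar_month_difference(lmp, event_year, event_month):
    """Assign completed gestational month from calendar months only.

    The LMP day is ignored, so LMP dates of June 10 and June 20 give the
    same result for an August event.
    """
    return (event_year - lmp.year) * 12 + (event_month - lmp.month)


# LMP June 10, event sometime in August of the same year.
print(calendar_month_difference(date(1998, 6, 10), 1998, 8))
```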
Conditional month difference was conditional on the known LMP day. If the estimated LMP day was on or before the 15th day of the month, the calendar month difference was assigned; if the LMP day was the 16th day of the month or later, the calendar month difference minus 1 was assigned. For example, among women who were interviewed in August (on a day considered to be unknown), a woman with an LMP date of June 10 was coded as interviewed in completed gestational “month 2,” but a woman with an LMP date of June 20 was coded as interviewed in completed gestational “month 1.”
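The conditional rule can be written as a short function. This is a sketch with hypothetical names; the cutoff of day 15 is taken from the description above.

```python
from datetime import date


def conditional_month_difference(lmp, event_year, event_month, cutoff_day=15):
    """Assign completed gestational month conditional on the LMP day.

    LMP on or before `cutoff_day`: calendar month difference.
    LMP after `cutoff_day`: calendar month difference minus 1.
    """
    diff = (event_year - lmp.year) * 12 + (event_month - lmp.month)
    if lmp.day > cutoff_day:
        diff -= 1
    return diff


# Both women were interviewed in August; only the LMP day differs.
print(conditional_month_difference(date(1998, 6, 10), 1998, 8))
print(conditional_month_difference(date(1998, 6, 20), 1998, 8))
```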
Imputed month midpoint imputed the actual midpoint of the specific event month (eg, day 16 for months of 31 days) as the calendar day of the event. We then calculated the completed gestational month between the imputed event date and the adjusted LMP date.
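One way to sketch imputed month midpoint in Python is shown below. The text gives the midpoint only for 31-day months (day 16); rounding the midpoint down for months with an even number of days (day 15 for 30-day months, day 14 for a 28-day February) is an assumption of this sketch, as are the function names and the day-of-month rule for completing a gestational month.

```python
import calendar
from datetime import date


def midpoint_day(year, month):
    """Midpoint day of a calendar month (16 for a 31-day month).

    For even-length months the midpoint is rounded down; this rounding
    choice is an assumption, not specified in the text.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return (days_in_month + 1) // 2


def imputed_month_midpoint(lmp, event_year, event_month):
    """Completed gestational month using the month midpoint as event day."""
    event = date(event_year, event_month, midpoint_day(event_year, event_month))
    months = (event.year - lmp.year) * 12 + (event.month - lmp.month)
    if event.day < lmp.day:
        months -= 1  # imputed event day falls before the monthly LMP day
    return months


print(midpoint_day(1998, 8))
print(imputed_month_midpoint(date(1998, 6, 10), 1998, 8))
print(imputed_month_midpoint(date(1998, 6, 20), 1998, 8))
```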
Based on the actual day of interview, most (84% of 2829) women were interviewed at completed gestational month 6, as stipulated in the PIN Study protocol. Ten percent of women were interviewed at gestational month 5, 6% at gestational month 7, and 2 women at gestational month 4.
Table 1 summarizes the misclassification pattern of gestational month assignments using each of the 3 methods. Calendar month difference misclassified 54% of interviews to 1 month later than the actual month. Conditional month difference and imputed month midpoint misclassified approximately 24% of completed gestational months, half 1 month early and half 1 month late.
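A misclassification pattern of this kind can be tabulated as the distribution of assigned minus actual month. The sketch below uses hypothetical names and a toy input of 4 made-up values purely to show the mechanics; it does not reproduce the PIN Study data or Table 1.

```python
from collections import Counter


def misclassification_distribution(actual_months, assigned_months):
    """Proportion of events by (assigned - actual) completed month.

    A key of +1 means assigned 1 month later than actual, -1 means
    1 month earlier, and 0 means correctly classified.
    """
    if len(actual_months) != len(assigned_months):
        raise ValueError("inputs must have the same length")
    diffs = Counter(
        assigned - actual
        for actual, assigned in zip(actual_months, assigned_months)
    )
    n = len(actual_months)
    return {diff: count / n for diff, count in sorted(diffs.items())}


# Toy values for illustration only (not study data).
toy_actual = [6, 6, 5, 6]
toy_assigned = [6, 7, 5, 5]
print(misclassification_distribution(toy_actual, toy_assigned))
```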
This analysis demonstrates that conditional month difference and imputed month midpoint meaningfully reduce misclassification in assignment of completed gestational month, compared with the simpler calendar month difference. Calendar month difference defines the completed gestational month without taking the LMP day into account, whereas conditional month difference and imputed month midpoint assign completed gestational month conditioned on the estimated LMP day.
Figure 1 illustrates how misclassification arises with calendar month difference, conditional month difference, and imputed month midpoint. If a woman has an LMP date of June 10 and an event in August for which the day is unknown, the 3 methods assign her the same completed gestational “month 2” at the event. However, the actual completed gestational month is “month 1” if the event occurs between August 1 and August 9, leading in this case to a misclassification probability of 9 of 30 for events occurring in August (assuming 30 days in every month).
Misclassification with the calendar month difference is worse if the LMP is later in the month. For example, a woman with an LMP on June 20 and an event in August for which the day is unknown will be assigned “month 2” at the event, which is incorrect for events on 19 of the 30 days in the month (August 1 through August 19). Conditional month difference and imputed month midpoint assign the woman to completed gestational “month 1.” With those same dates, the misclassification probability is only 11 of 30 for conditional month difference and imputed month midpoint (events on August 20 through August 30).
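The counts of 9, 19, and 11 days out of 30 can be checked by enumerating every possible event day. The sketch below keeps the same simplification of 30 days in every month; the function name and the offset encoding are illustrative assumptions.

```python
DAYS_PER_MONTH = 30  # simplifying assumption used in the worked examples


def misclassified_days(lmp_day, assigned_offset):
    """Count event days in the month on which the assigned month is wrong.

    Offsets are relative to the calendar month difference:
      0  -> the method assigns the calendar month difference
      -1 -> the method assigns the calendar month difference minus 1
    The actual offset is -1 for event days before the LMP day, else 0.
    """
    wrong = 0
    for event_day in range(1, DAYS_PER_MONTH + 1):
        actual_offset = -1 if event_day < lmp_day else 0
        if actual_offset != assigned_offset:
            wrong += 1
    return wrong


print(misclassified_days(10, 0))   # LMP day 10: all 3 methods assign offset 0
print(misclassified_days(20, 0))   # LMP day 20: calendar month difference
print(misclassified_days(20, -1))  # LMP day 20: conditional / imputed midpoint
```

Dividing each count by 30 gives the misclassification probabilities quoted in the text.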
Conditional month difference and imputed month midpoint result in less misclassification in assignment of completed gestational month when the LMP occurs in the second half of the month. Our simulation analysis confirms this pattern. We have shown that conditional month difference and imputed month midpoint are preferable to calendar month difference (and similar to each other) for reducing misclassification and balancing the distribution of misclassification in both directions. Of these 2 methods, we recommend imputed month midpoint because it uses the exact midpoint of the calendar month.
We acknowledge the contribution of Tom Swasey in the design of Figure 1.
1. Savitz DA, Dole N, Williams J, et al. Study design and determinants of participation in an epidemiologic study of preterm delivery. Paediatr Perinat Epidemiol