During the last 2 decades, participation rates in large cohort studies have decreased from 80% to 30–40%.1–7 This is potentially a problem because these studies provide much of the evidence on preventable causes of diseases. In a cohort study based upon prospective data, the decision to participate cannot be based upon future outcomes, and the risk of bias due to nonparticipation has therefore been deemed negligible.8,9 Hence, investigators have considered dropout during follow-up a much more serious threat to the internal validity of the study and have tried to minimize this problem at the expense of participation rates. However, the decision to participate may correlate with social, educational, and health conditions. These conditions may in turn correlate with risk factors for the outcome under study, and some selection bias cannot be ruled out.10 Here, we study the bias that arises when the selection into a study leads to effect estimates among participants that differ from those found in the source population. Our main concern is not to explain the mechanisms behind this bias, but to quantify its size in a specific context.
A low participation rate of approximately 30% also was encountered in the newly established Danish National Birth Cohort,11 which is a nationwide study of 100,000 pregnant women and their offspring. Recruitment occurred at the first antenatal care visit. About half of nonparticipation was caused specifically by a lack of participation by family doctors, whereas the other half was attributable to pregnant women who declined the invitation. Data collection on all births in 2 well-defined geographical areas in Denmark, independent of the establishment of the cohort study, gave us a unique opportunity to investigate the impact of the low participation rate on the relative risk estimates derived from this cohort. However, we found no published method to obtain confidence intervals that could be used when comparing 2 nonindependent estimates, ie, the relative risks among the participants and the source population.
The purpose of the present study was therefore two-fold: to quantify the impact of the initial selection into the Danish National Birth Cohort on 3 different associations between well-established risk factors and pregnancy outcomes and to evaluate 2 methods for constructing confidence intervals for estimates of bias.
The birth register of North Jutland County had information about all singleton births of women in this area who were eligible for inclusion in the Danish National Birth Cohort during the recruitment period from 1997 to 2002 (n = 30,628). Similarly, the Aarhus Birth Cohort, which is an ongoing data collection on all infants delivered at the University Hospital of Aarhus, provided information about all singleton births of women who were residents in the Aarhus Municipality during pregnancy (n = 19,123). Thus, the source population for the present investigation included 49,751 singleton births (185 stillbirths and 49,566 live births), which corresponded to 16% of the entire eligible population of the national cohort study (approximately 310,000 pregnancies). The unique personal identifier given to all Danish citizens at birth was used to identify all women (births) in the source population who also participated in the national study. This was the only information from the Danish National Birth Cohort that was used in this study. We found 15,373 births corresponding to a participation rate of 31%.
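As a consistency check on the counts reported above, the arithmetic can be written out as follows (only figures stated in the text are used; the rounding is ours):

```latex
\begin{align*}
30{,}628 + 19{,}123 &= 49{,}751 = 185 + 49{,}566,\\
\frac{49{,}751}{310{,}000} &\approx 0.16,\\
\frac{15{,}373}{49{,}751} &\approx 0.309.
\end{align*}
```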
We investigated the possible effect of this low participation rate on 3 well-established associations, each of them representing different potential selection mechanisms: (a) in vitro fertilization (IVF) and preterm birth,12 (b) smoking during pregnancy and birth of a small-for-gestational-age infant (SGA),13 and (c) prepregnancy body mass index (BMI: weight in kilograms/height in meters squared) and antepartum stillbirth.14–16
In North Jutland County, information about smoking during pregnancy was obtained by the midwives and reported to the database at birth, together with comprehensive information related to pregnancy and delivery. Prepregnancy weight and height as obtained at the first antenatal visit were included in the database from September 1999 onwards, and subsequently recorded for 82% of all women. In the Aarhus Municipality, 94% of all women who were scheduled for delivery at Aarhus University Hospital returned a questionnaire at the end of the first trimester. The questionnaire included questions on smoking, prepregnancy weight, and height. Birth data were extracted from medical files by trained midwives.
Preterm birth was defined as the birth of a liveborn infant before 37 completed weeks of gestation. The estimate of gestational age was initially based on last menstrual period but corrected by early ultrasound if necessary. SGA was defined as a birth weight more than 2 standard deviations below the mean for gestational age on the reference curve suggested by Marsal et al.17 Antepartum stillbirth was defined as the expulsion of a dead fetus after 28 completed weeks of gestation and before the onset of labor.
We categorized BMI (in kg/m2) as underweight (<18.5), normal weight (18.5–24.9), overweight (25–29.9), and obese (30+).18 Cigarette smoking in pregnancy was divided into 3 categories. IVF included all types of assisted reproduction that involved the use of in vitro fertilization. Other variables in the analyses were age and parity, and the categorization of all covariates is displayed in Table 1. To further facilitate comparisons of participants and the source population, this table also includes information on smoking cessation in pregnancy, postterm birth, high birth weight, and Apgar score after 5 minutes. The study was based on already existing data collections and did not involve any contact with the pregnant women. The study was approved by the Danish Data Protection Board.
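The BMI categorization can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `bmi_category` is hypothetical, and we assume half-open intervals so that values falling between the printed cut-points (for example 24.95) are assigned to the lower category.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify prepregnancy BMI (kg/m^2) using the cut-points given in the text.

    Assumption: categories are treated as half-open intervals,
    [18.5, 25) for normal weight and [25, 30) for overweight.
    """
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


# Example: 65 kg and 1.70 m give a BMI of about 22.5
print(bmi_category(65, 1.70))
```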
Maternal characteristics and outcome in the source population and in the participant population were described by relative frequencies (RFs). To describe the pattern of participation we subsequently compared the 2 populations by computing ratios of relative frequencies (RRFs), ie, $\mathrm{RRF} = \mathrm{RF}_{\text{Participants}} / \mathrm{RF}_{\text{Source population}}$. Next, we investigated the effect of the participation on the odds ratio (OR) estimates. For each population and for each outcome the effect of the exposure was described by an odds ratio estimated by a logistic regression adjusting for age, parity, and subsample (the North Jutland County or the Aarhus Municipality). For IVF and preterm birth and for BMI and stillbirth, we also adjusted for smoking. Following Austin et al19 and Kleinbaum et al,9 we defined the relative odds ratio (ROR) as the ratio of the OR among participants to the corresponding OR in the source population, ie, $\mathrm{ROR} = \mathrm{OR}_{\text{Participants}} / \mathrm{OR}_{\text{Source population}}$. If ROR is equal to one, then no bias is present. The ROR based on crude ORs also can be computed as the cross product ratio of the participation rates in the exposure by outcome categories.9,19,20
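The identity between the crude ROR and the cross product ratio of the participation rates can be checked numerically. The sketch below uses made-up 2 × 2 counts (they are not data from this study), and the names `odds_ratio`, `source`, and `participants` are illustrative.

```python
import math


def odds_ratio(a, b, c, d):
    """Crude OR for a 2x2 table.

    a = exposed with outcome, b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    return (a * d) / (b * c)


# Hypothetical counts, chosen only for illustration
source = {"a": 200, "b": 1800, "c": 1000, "d": 17000}
participants = {"a": 50, "b": 600, "c": 300, "d": 5400}

or_source = odds_ratio(**source)
or_participants = odds_ratio(**participants)
ror = or_participants / or_source

# Participation rate within each exposure-by-outcome cell
rate = {cell: participants[cell] / source[cell] for cell in source}
cross_product_ratio = (rate["a"] * rate["d"]) / (rate["b"] * rate["c"])

# The two quantities agree up to floating-point rounding
assert math.isclose(ror, cross_product_ratio)
print(round(ror, 3), round(cross_product_ratio, 3))
```

In this invented example the exposed women with the outcome participate less often than the other groups, so the ROR falls below 1.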
Confidence limits were found by applying 2 methods described below to the logarithm of the parameter of interest, eg, $\ln(\mathrm{ROR}) = \ln(\mathrm{OR}_{\text{Participants}}) - \ln(\mathrm{OR}_{\text{Source population}})$. Because an estimate based on a total sample, $\hat{\theta}_{\text{Tot}}$, and the corresponding estimate obtained in a subsample, $\hat{\theta}_{\text{Sub}}$, are inherently positively correlated, standard methods do not apply. We considered 2 methods for obtaining standard errors of the difference.
The first method was based on the observation that, if the subsample is a random sample from the total sample, then for large samples we have $\operatorname{Var}(\hat{\theta}_{\text{Sub}} - \hat{\theta}_{\text{Tot}}) = \operatorname{Var}(\hat{\theta}_{\text{Sub}}) - \operatorname{Var}(\hat{\theta}_{\text{Tot}})$. (A heuristic proof of this formula and further comments on the interpretation of the confidence intervals are available with the electronic version of this article at www.epidem.com; click on ArticlePlus.) Although the formula does not hold in general, ie, if the subsample is not a random sample from the total sample, we hoped that the approach was approximately valid. This would ensure that confidence intervals derived from

$$\operatorname{se}(\hat{\theta}_{\text{Sub}} - \hat{\theta}_{\text{Tot}}) = \sqrt{\operatorname{se}(\hat{\theta}_{\text{Sub}})^2 - \operatorname{se}(\hat{\theta}_{\text{Tot}})^2} \qquad (1)$$

would have coverage probabilities close to the nominal level.
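A minimal sketch of the first method follows, assuming the standard error of the difference is taken as the square root of the difference between the two squared standard errors, as the variance formula above implies. The function name `ror_ci_simple` and the input values are hypothetical; the standard errors are those of the log odds ratios.

```python
import math


def ror_ci_simple(or_sub, se_ln_or_sub, or_tot, se_ln_or_tot, z=1.96):
    """ROR with a confidence interval from the variance-difference formula.

    or_sub, se_ln_or_sub: OR among participants and the SE of its logarithm.
    or_tot, se_ln_or_tot: the same quantities in the source population.
    Returns (ROR, lower limit, upper limit).
    """
    ln_ror = math.log(or_sub) - math.log(or_tot)
    var_diff = se_ln_or_sub ** 2 - se_ln_or_tot ** 2
    if var_diff <= 0:
        # The subsample estimate should be the less precise of the two
        raise ValueError("SE in the subsample must exceed SE in the total sample")
    se = math.sqrt(var_diff)
    return math.exp(ln_ror), math.exp(ln_ror - z * se), math.exp(ln_ror + z * se)


# Hypothetical inputs: identical ORs, SEs of 0.5 (participants) and 0.3 (source)
ror, lower, upper = ror_ci_simple(1.5, 0.5, 1.5, 0.3)
print(round(ror, 2), round(lower, 2), round(upper, 2))
```

Note that the interval is symmetric on the log scale, so the product of the two limits equals the squared ROR.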
The second method was to use a nonparametric bootstrap method, ie, to estimate the standard error by the standard deviation of estimates found by resampling from the observed data.21 In this study a bootstrap sample of size 200 was chosen.
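The bootstrap approach can be sketched as below. This is not the authors' Stata code: it uses crude odds ratios instead of logistic regression, simulated data with invented exposure, outcome, and participation probabilities, and it reads "size 200" as 200 bootstrap replications, which is our assumption. The key point it illustrates is that whole records of the source population are resampled, so the participant subsample and the total sample are re-estimated together in each replication.

```python
import numpy as np


def ln_odds_ratio(exposed, outcome):
    """Crude ln(OR) from two boolean arrays of equal length."""
    a = np.sum(exposed & outcome)
    b = np.sum(exposed & ~outcome)
    c = np.sum(~exposed & outcome)
    d = np.sum(~exposed & ~outcome)
    return np.log(a * d) - np.log(b * c)


def ln_ror(exposed, outcome, participant):
    """ln(ROR): participants' ln(OR) minus the source population's ln(OR)."""
    return (ln_odds_ratio(exposed[participant], outcome[participant])
            - ln_odds_ratio(exposed, outcome))


def bootstrap_se_ln_ror(exposed, outcome, participant, n_boot=200, seed=0):
    """SD of ln(ROR) over bootstrap resamples of the source population."""
    rng = np.random.default_rng(seed)
    n = len(exposed)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample records with replacement
        estimates[i] = ln_ror(exposed[idx], outcome[idx], participant[idx])
    return estimates.std(ddof=1)


# Simulated (hypothetical) source population of 20,000 births
rng = np.random.default_rng(1)
n = 20_000
exposed = rng.random(n) < 0.2
outcome = rng.random(n) < np.where(exposed, 0.10, 0.05)
participant = rng.random(n) < 0.3  # participation unrelated to exposure/outcome

se_boot = bootstrap_se_ln_ror(exposed, outcome, participant)
print(se_boot)
```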
The performance of the 2 methods was assessed in a small simulation study covering scenarios identical or close to the actual data included in the adjusted analyses of IVF and preterm birth and of smoking and SGA. The simulation set-up required specification of the distribution of the covariate pattern in the source population, the participation rate within each covariate pattern, and the dependence of the outcome on the covariates both among the participants and among the nonparticipants. (For further description of the simulations, see the electronic version of the article at www.epidem.com; click on ArticlePlus.)
For each scenario, we considered source population sizes of 25,000 and 50,000. Each combination of scenario and sample size was simulated 5000 times. The results of the simulations were summarized by the observed coverage probability of a nominal 95% interval of ROR, ie, by computing how often the true ROR was contained in the interval estimated ROR · exp(±1.96 · se). All analyses and simulations were made using STATA version 8.2 (StataCorp, College Station, TX).
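The coverage calculation can be illustrated with a much simpler set-up than the one used in the study. In the sketch below, the cell probabilities and participation rates are invented, crude odds ratios with Woolf-type variances (sum of reciprocal cell counts) stand in for the adjusted logistic regressions, and only 2000 replications are run. The function name `simulate_coverage` is hypothetical.

```python
import numpy as np


def simulate_coverage(n_source, cell_probs, part_rates, n_sim=2000, seed=0):
    """Observed coverage of the nominal 95% interval for the ROR.

    cell_probs: probabilities of the 4 exposure-by-outcome cells, ordered as
        (exposed/outcome, exposed/no outcome, unexposed/outcome, unexposed/no outcome).
    part_rates: participation rate within each of the 4 cells.
    """
    rng = np.random.default_rng(seed)
    part_rates = np.asarray(part_rates, dtype=float)
    # True ln(ROR) is the log cross product ratio of the participation rates
    true_ln_ror = np.log(part_rates[0] * part_rates[3]
                         / (part_rates[1] * part_rates[2]))

    tot = rng.multinomial(n_source, cell_probs, size=n_sim)  # source tables
    sub = rng.binomial(tot, part_rates)                      # participant tables

    def ln_or_and_var(tables):
        t = tables.astype(float)
        ln_or = np.log(t[:, 0] * t[:, 3]) - np.log(t[:, 1] * t[:, 2])
        return ln_or, (1.0 / t).sum(axis=1)  # Woolf variance of ln(OR)

    ln_tot, var_tot = ln_or_and_var(tot)
    ln_sub, var_sub = ln_or_and_var(sub)

    se = np.sqrt(var_sub - var_tot)  # variance-difference formula
    covered = np.abs((ln_sub - ln_tot) - true_ln_ror) <= 1.96 * se
    return covered.mean()


# Hypothetical scenario: 25,000 births, 30% participation in every cell
coverage = simulate_coverage(25_000, [0.02, 0.18, 0.04, 0.76],
                             [0.30, 0.30, 0.30, 0.30])
print(coverage)
```

Unequal participation rates across the 4 cells can be passed in `part_rates` to mimic a scenario with bias; the true ROR then differs from 1.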
Among participants in the Danish National Birth Cohort we found a modest overrepresentation of 25- to 35-year-old women, and of women giving birth to their first or second child (Table 1). Participants were more often of normal weight, nonsmokers or previous smokers, and more likely to have IVF pregnancies. They had a lower rate of preterm deliveries, of infants with very low Apgar scores, and of infants with SGA, as well as a higher rate of infants with high birth weight.
Table 2 shows the adjusted ORs for each of the chosen associations in the source population and among participants in the Danish National Birth Cohort. For IVF and preterm birth, the 2 ORs were identical, and for BMI and antepartum stillbirth, the ORs were very similar. The association between smoking and SGA was slightly stronger among the participants for women smoking more than 10 cigarettes per day. Overall, the 2 sets of estimates were remarkably similar, and none of the relative odds ratios deviated more than 16% from 1. In no case did the data contradict the hypothesis of identical odds ratios among participants and in the source population, but existence of some bias related to selection could not be ruled out. Especially when both the outcome and the exposure were rare, the confidence intervals for the relative odds ratio were wide.
Participation was in general nondifferential, ie, the dependence on the outcome category was consistent across exposure categories—the only exception being heavy smokers, among whom also the highest bias was observed. (These results are available with the electronic version of the article at www.epidem.com; click on ArticlePlus.)
The 2 methods of calculating confidence intervals gave almost identical results (Tables 1 and 2). The simulation study showed that both methods gave approximately valid confidence intervals for the considered scenarios (Table 3). The simple approach based on equation 1 gave 95% confidence intervals with coverage probabilities in the range from 94.4% to 96.0%. The nonparametric bootstrap approach with a bootstrap sample of size 200 gave coverage probabilities in the range from 95.0% to 96.6%.
Our results indicate that participants in the Danish National Birth Cohort were somewhat healthier than mothers in the source population, but differential participation was modest and the estimated effect on the risk estimates was small, even after minimal confounder adjustments.
We believe that these findings can be generalized to the entire Danish National Birth Cohort because the present study is based on data from both a large rural county and from the municipality that holds the second largest city in Denmark. Moreover, more than 99.5% of all births in Denmark take place at public hospitals. There was some selection into part of the study because 6% did not respond to the pregnancy questionnaires in the Aarhus source population. Restricting the analysis to North Jutland County, where we had essentially complete information on all covariates except BMI, did not, however, change the findings. In North Jutland incomplete registration at birth by the midwives caused an 18% nonresponse on information about BMI. We find it unlikely that missing information distorted the results.
We chose exposures for which we expected different participation rates and we did indeed find differences. These exposures represent, however, only a small subset of all the associations that will be investigated in the national cohort study. We cannot exclude that other associations may have a larger bias due to nonparticipation, but the findings are generally reassuring.
Other cohort studies have compared descriptive statistics in participants and nonparticipants6,22–30 and also found differences of varying sizes.6,22,24,26–30 However, Greenland31 demonstrated more than 25 years ago that, even with similar marginal distributions in participants and in the source population, bias may still be present if participation depends on both exposure and outcome. We only found one study25 that had actually calculated relative risk estimates for both participants and the source population, but this study was too small to provide useful information.
The relative OR and the cross product ratio of the participation rates were mainly developed to clarify the effect of selection, to identify different response patterns, and as a tool to carry out sensitivity analysis in a given setting.9,19 No method for obtaining confidence intervals for the relative OR has previously been presented. We reviewed studies reporting on dropouts, missing data, and nonparticipation in other types of study designs because we found it likely that the same statistical problem was encountered there. Bias estimates have been presented without confidence limits,32–34 which complicates the interpretation. Other researchers tested the differences between relative risk estimates of nonparticipants and participants,35,36 or presented P values from tests of no interaction between the effect of participation status and exposure in the source population.37 The test results were subsequently used to infer the likelihood of bias. However, these approaches do not permit such interpretations, because the risk estimates among participants and nonparticipants may be different without bias, and they may be similar in the presence of bias.
The lack of available methodology led us to evaluate 2 approaches for computing confidence intervals of the relative odds ratio, and to validate their performance in a small simulation study. The first method is computationally very simple but lacks theoretical justification if bias is present. We therefore included scenarios with bias in the simulation study. The second method was to use nonparametric bootstrap. This method is computationally more demanding, especially in large studies. The simulation study showed that both methods performed equally well, with confidence intervals close to the nominal 95%. Although we recognize the limitation of this simulation study, it does suggest that the simple method is valid at least when the expected bias related to the selection is modest.
The present study is, to our knowledge, the first study to give a detailed analysis of the effects of nonparticipation in cohort studies on estimates of relative risk. We had both considerable nonparticipation and differences in the marginal frequencies between the participants and source population, but we found only small effects on the relative risk estimates. We acknowledge that our findings are related to specific associations in a population of pregnant women and are therefore not generalizable to other populations or different recruitment strategies. Nevertheless, we find the results reassuring, because one of the main arguments for conducting a cohort study with prospective data is to avoid selection bias of the relative risk estimates. Avoiding selection bias related to lack of compliance to the follow-up is probably still a much more important issue.
We are grateful to Professor Michael Væth for helpful discussions on statistical issues and for advice on the presentation of the results.
1. Andersen LB, Vestbo J, Juel K, et al. A comparison of mortality rates in three prospective studies from Copenhagen with mortality rates in the central part of the city, and the entire country. Copenhagen Center for Prospective Population Studies. Eur J Epidemiol
2. Barton J, Bain C, Hennekens CH, et al. Characteristics of respondents and non-respondents to a mailed questionnaire. Am J Public Health
3. Boeing H, Korfmann A, Bergmann MM. Recruitment procedures of EPIC-Germany. European Investigation into Cancer and Nutrition. Ann Nutr Metab
4. Brown WJ, Bryson L, Byles JE, et al. Women’s Health Australia: recruitment for a national longitudinal cohort study. Women Health
5. Hartge P. Raising response rates: getting to yes [editorial]. Epidemiology
6. Manjer J, Carlsson S, Elmstahl S, et al. The Malmo Diet and Cancer Study: representativity, cancer incidence and mortality in participants and non-participants. Eur J Cancer Prev
7. Olsen A, Tjonneland A, Engholm G, et al. Socio-economic determinants for participation in the Danish EPIC Diet, Cancer and Health cohort. IARC Sci Publ
8. Hennekens CH, Buring JE. Epidemiology in Medicine. Boston: Little, Brown and Co.; 1987.
9. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research. Principles and Quantitative Methods. New York: Van Nostrand Reinhold; 1982.
10. Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology
11. Olsen J, Melbye M, Olsen SF, et al. The Danish National Birth Cohort: its background, structure and aim. Scand J Public Health
12. Basso O, Baird DD. Infertility and preterm delivery, birthweight, and Caesarean section: a study within the Danish National Birth Cohort. Hum Reprod
13. Cnattingius S, Forman MR, Berendes HW, et al. Effect of age, parity, and smoking on pregnancy outcome: a population-based study. Am J Obstet Gynecol
14. Cnattingius S, Bergstrom R, Lipworth L, et al. Prepregnancy weight and the risk of adverse pregnancy outcomes. N Engl J Med
15. Little RE, Weinberg CR. Risk factors for antepartum and intrapartum stillbirth. Am J Epidemiol
16. Stephansson O, Dickman PW, Johansson A, et al. Maternal weight, pregnancy weight gain, and the risk of antepartum stillbirth. Am J Obstet Gynecol
17. Marsal K, Persson PH, Larsen T, et al. Intrauterine growth curves based on ultrasonically estimated foetal weights. Acta Paediatr
18. World Health Organisation Technical Report Services. Obesity: Preventing and Managing the Global Epidemic. Report of a WHO Consultation. Geneva: World Health Organization; 2000.
19. Austin MA, Criqui MH, Barrett-Connor E, et al. The effect of response bias on the odds ratio. Am J Epidemiol
20. Criqui MH. Response bias and risk ratios in epidemiologic studies. Am J Epidemiol
21. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. London: Chapman and Hall; 1993.
22. Bond GG, Lipps TE, Stafford BA, et al. A comparison of cause-specific mortality among participants and nonparticipants in a work-site medical surveillance program. J Occup Med
23. Criqui MH, Austin M, Barrett-Connor E. The effect of non-response on risk ratios in a cardiovascular disease study. J Chronic Dis
24. Goldberg M, Chastang JF, Leclerc A, et al. Socioeconomic, demographic, occupational, and health factors associated with participation in a long-term epidemiologic survey: a prospective study of the French GAZEL cohort and its target population. Am J Epidemiol
25. Heilbrun LK, Nomura A, Stemmermann GN. The effects of nonresponse in a prospective study of cancer. Am J Epidemiol
26. Launer LJ, Wind AW, Deeg DJ. Nonresponse pattern and bias in a community-based cross-sectional study of cognitive functioning among the elderly. Am J Epidemiol
27. Macera CA, Jackson KL, Davis DR, et al. Patterns of non-response to a mail survey. J Clin Epidemiol
28. Shahar E, Folsom AR, Jackson R. The effect of nonresponse on prevalence estimates for a referent population: insights from a population-based cohort study. Atherosclerosis Risk in Communities (ARIC) Study Investigators. Ann Epidemiol
29. Sheikh K. Predicting risk among non-respondents in prospective studies. Eur J Epidemiol
30. Walker M, Shaper AG, Cook DG. Non-participation and mortality in a prospective study of cardiovascular disease. J Epidemiol Community Health
31. Greenland S. Response and follow-up bias in cohort studies. Am J Epidemiol
32. Eagan TM, Eide GE, Gulsvik A, et al. Nonresponse in a community cohort study: predictors and consequences for exposure-disease associations. J Clin Epidemiol
33. Hoeymans N, Feskens EJ, Van Den Bos GA, et al. Non-response bias in a study of cardiovascular diseases, functional status and self-rated health among elderly men. Age Ageing
34. Klepp KI. Nonresponse bias due to consent procedures in school-based, health-related research. Scand J Soc Med
35. Benfante R, Reed D, MacLean C, et al. Response bias in the Honolulu Heart Program. Am J Epidemiol
36. Larsen SB, Abell A, Bonde JP. Selection bias in occupational sperm studies. Am J Epidemiol
37. Reijneveld SA, Stronks K. The impact of response bias on estimates of health care utilization in a metropolitan area: the use of administrative data. Int J Epidemiol