Overall, 75% of recruits (including seeds; n = 684) were offered coupons to recruit others, and of these, 90% (n = 612) accepted (these men are called "recruiters"). Sixty-six percent of recruiters (n = 401) returned to take part in a second interview and to collect their secondary incentives. Similar proportions of recruiters (including seeds) recruited 0, 1, 2, or 3 recruits (Fig. 2E, left bar). Recruits who returned to collect secondary incentives were more likely to have recruited (Fig. 2E, middle and right bars). The proportion of a recruit's network already recruited at the time of their interview increased rapidly during the survey (Fig. 2F; includes seeds). The average number of recruits per recruiter (including seeds) decreased from 2.6 in the first week of the study to 0.6 in the last week in which coupons were given out. Only 30% of recruits had been named as a contact (and identified) by their recruiter at the recruiter's first interview.
In the simple random sample survey, 55% (164/300) of men selected were interviewed (4–28 May 2010; eAppendix [Simple random sample survey], http://links.lww.com/EDE/A529). In the qualitative survey, 98% (53/54) of people selected were interviewed (16 June–19 October 2010; eAppendix [Qualitative survey], http://links.lww.com/EDE/A529, also N.M., unpublished data, 2011).
The target population was well connected. Data from the respondent-driven sampling and simple random sampling surveys showed that at least 73% were linked in a single network (eAppendix [Methods], http://links.lww.com/EDE/A529). The distribution of the reported network size of respondent-driven sampling recruits, based on the primary definition of network size (NS-1), was approximately normal but with a slight positive skew, and showed likely overreporting of multiples of 5 (eFigure 1, http://links.lww.com/EDE/A529; excluding seeds). The distributions of the other network size measures (as defined in the eAppendix [Methods], http://links.lww.com/EDE/A529) were very similar, with the exception of definition NS-5, which showed a smaller proportion of larger network sizes because it was a subset of NS-4 (eFigure 2, http://links.lww.com; including seeds). Pearson correlations between the different network size definitions reported by respondent-driven sampling recruits varied between 0.96 (NS-1 vs. NS-2) and 0.75 (NS-1 vs. NS-5) (eTable 2, http://links.lww.com/EDE/A529; including seeds). The mean network size (NS-1) of respondent-driven sampling recruits (including seeds) was higher than that of the whole target population (12.1 vs. 9.2, P < 0.001) (eFigure 3, http://links.lww.com/EDE/A529). The number of times members of the target population were reported to be in the network of recruits ranged between 0 and 42 (eFigure 4, http://links.lww.com/EDE/A529).
There was high within-group recruitment (homophily) by religion, tribe, and village and in the highest socioeconomic status groups, but not by age, sexual activity, or HIV status, or within the other socioeconomic status groups (eTables 3 and 4, http://links.lww.com/EDE/A529). There was no evidence of low within-group recruitment for any characteristic, ie, of men preferentially recruiting others who differed from themselves. Comparing actual recruitment proportions with expected recruitment proportions calculated from individual-level network data, there was evidence of nonrandom recruitment by age, tribe, socioeconomic status, village, and sexual activity (eAppendix [Supporting results "Recruitment pattern" section and eTables 5 and 6], http://links.lww.com/EDE/A529).
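The expected recruitment proportions referenced here can be constructed in more than one way; the study's exact calculation is described in the eAppendix. One plausible sketch, under the assumption that each recruiter recruits uniformly at random from his own network, averages each recruiter's network composition across recruiters:

```python
def expected_recruit_props(networks):
    """Expected share of recruits in each group under random within-network
    recruitment. `networks` is a list, one entry per recruiter, of dicts
    mapping group label -> number of that recruiter's contacts in the group.
    (Illustrative only; not the study's exact calculation.)"""
    totals = {}
    n = len(networks)
    for net in networks:
        size = sum(net.values())
        for group, count in net.items():
            # each recruiter contributes his network share of the group,
            # averaged over all recruiters
            totals[group] = totals.get(group, 0.0) + count / size / n
    return totals
```

Comparing these expected proportions with the actual recruitment proportions is what reveals nonrandom recruitment: a group recruited more (or less) often than its share of recruiters' networks predicts.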
The other RDS-2 estimator assumptions8 were not met. In common with current practice in all respondent-driven-sampling studies, respondents were not limited to recruiting only one other person, and recruited persons were ineligible for rerecruitment. It is also likely that only a low proportion of the relationships between members of the target population were reciprocated, or that the population did not accurately report their network sizes, or both, given that only 30% of recruits were mentioned by their recruiter during the recruiter's first interview.
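The importance of accurately reported network size follows from the form of the RDS-2 (Volz-Heckathorn) estimator, which weights each respondent by the inverse of his reported degree. A minimal sketch (illustrative; not the implementation used in the study):

```python
def rds2_estimate(in_group, degrees):
    """RDS-2 proportion estimate: respondent i is weighted by 1/d_i, the
    inverse of his reported network size (degree). `in_group` flags whether
    each respondent belongs to the group of interest."""
    weights = [1.0 / d for d in degrees]
    return sum(w for w, g in zip(weights, in_group) if g) / sum(weights)

# Misreported degrees shift the estimate directly: halving one respondent's
# reported degree doubles his weight, inflating his group's estimated share.
```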
Comparison With Target Population Data
The Table shows the comparison between the population proportions, sample proportions, and RDS-1 and RDS-2 estimates, for the full and small sample. The sample proportions were often similar to population proportions, with the following exceptions. In both samples, younger men (<30 years) were underrepresented and older men (≥40 years) were overrepresented. In the small sample, Catholics were overrepresented. In both samples, men in the highest socioeconomic group were underrepresented, and men in the lowest socioeconomic group were overrepresented. The proportions of men with unknown numbers of sexual partners or unknown HIV status were underrepresented in both samples. It is unlikely that the differences between the population and sample proportions occurred by chance (P ≤ 0.0001 for all except P = 0.04 for the highest socioeconomic status group using the small sample).
Respondent-driven sampling inference methods generally failed to reduce bias where it occurred. Adjustment resulted in an improved estimate of the population proportion in only 37% (19/52) of comparisons using RDS-1 and 33% (15/52) using RDS-2 for the full sample, and 31% (8/26) using RDS-1 and 37% (18/49) using RDS-2 for the small sample. Based on these estimates, the 95% bootstrap confidence intervals included the target population proportion in 69% (36/52) of comparisons using RDS-1 and 50% (13/26) using RDS-2 for the full sample, and 69% (18/26) using RDS-1 and 74% using RDS-2 for the small sample.
The root mean squared error for the difference between the population proportions and the sample proportions was 6% for the full sample. The root mean squared error for the difference between the population proportions and the respondent-driven sampling estimates for the full sample was 7% for both RDS-1 and RDS-2 (eTable 7, http://links.lww.com/EDE/A529). Root mean squared errors were slightly larger for the small sample.
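The root mean squared error summarized here is simply the square root of the mean squared difference between the population proportions and the corresponding estimates, taken across comparisons:

```python
import math

def rmse(pop_props, est_props):
    """Root mean squared error between population proportions and
    estimates, paired across the variables compared."""
    sq_diffs = [(p - e) ** 2 for p, e in zip(pop_props, est_props)]
    return math.sqrt(sum(sq_diffs) / len(sq_diffs))
```

The proportions passed in would be the columns of the Table; the values shown in the text (6% and 7%) are on this scale when proportions are expressed as percentages.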
In general, if the respondent-driven sampling adjustments did not improve the estimates, the adjustments were small and did not add substantial bias. The exception was the variable "village": because of the large number of subgroups, the sample size was not sufficiently large to reliably estimate the parameters used to make the RDS-1 adjustments.
In comparison, using the predictions from the logistic regression models as recruitment probability weights, adjustment improved the estimate of the target population proportion for 88% (46/52) of the full sample estimates, and for 57% (28/49) of the small sample estimates (eTable 6, http://links.lww.com/EDE/A529), showing that recruitment was associated with characteristics other than network size.
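The weighting approach described here amounts to inverse-probability weighting: fit a logistic model for the probability of being recruited (possible in this study because total-population data were available), then weight each sampled man by the inverse of his predicted recruitment probability. A toy sketch with a single covariate (hypothetical data; the study's models used several characteristics):

```python
import math

def fit_logistic(x, y, lr=0.5, steps=2000):
    """Minimal one-covariate logistic regression fit by gradient descent.
    Returns (b0, b1) for P(recruited) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += p - yi
            g1 += (p - yi) * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def ipw_proportion(in_group, p_recruit):
    """Estimate a population proportion by weighting each recruit by the
    inverse of his predicted recruitment probability."""
    weights = [1.0 / p for p in p_recruit]
    return sum(w for w, g in zip(weights, in_group) if g) / sum(weights)
```

That this weighting outperformed the network-size-based estimators is the evidence that recruitment depended on characteristics other than network size.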
For specific cases in which the sample estimates of population proportions were biased, current respondent-driven sampling inference methods generally failed to reduce bias. For age group, using either the RDS-1 or the RDS-2 estimator, only 2 of 5 estimates were closer to the population proportion when applied to the full sample, and only 1 of 4 estimates was closer when applied to the small sample. Neither RDS-1 nor RDS-2 improved the overrepresentation of Catholics in the small sample, the overrepresentation of the lowest socioeconomic group in the full sample, the underrepresentation of the highest socioeconomic group in either sample, or the underrepresentation of men with unknown number of sexual partners in either sample. Applying RDS-2 to the full sample very slightly reduced the underrepresentation of men with unknown HIV status. Applying RDS-2 to the small sample or RDS-1 to either sample slightly increased the underrepresentation of men with unknown HIV status.
Respondent-driven-sampling inference methods failed to reduce bias because groups tended to be under- or overrecruited by all groups, rather than being underrecruited by some groups and overrecruited by other groups (limiting the ability of RDS-1 to improve estimates), and because underrepresented groups tended not to have markedly smaller network sizes (limiting the ability of RDS-1 and RDS-2). For example, men aged ≥50 years were overrecruited by all age groups, and network sizes in all age groups were relatively similar (eTable 3, http://links.lww.com/EDE/A529). Therefore, neither RDS-1 nor RDS-2 improved the estimates.
Qualitative data suggested possible explanations for these findings. Recruiters did not consider younger unmarried men to be household heads, in contrast with the definition used in the ongoing general population cohort (“… they were being left out because some of the older men didn't take them as household heads because they didn't have any wives” [45-year-old respondent-driven sampling recruit]). The respondent-driven sampling incentives were likely to be a greater incentive to men in lower socioeconomic groups (“… the token might look small to some people and big to others.” [42-year-old female community member]). The underrecruitment of men with unknown number of sexual partners or unknown HIV status was likely, at least in part, to be because men who had refused to participate in the ongoing general population cohort in the past were also less likely to participate in the respondent-driven sampling study.
There was very little difference in the performance of the respondent-driven sampling estimators when different network size definitions were used (eAppendix [Results], http://links.lww.com/EDE/A529). There was no evidence that collecting detailed network size data reduced the performance of the respondent-driven sampling estimators (eAppendix [Results], http://links.lww.com/EDE/A529).
In our study, recruitment by respondent-driven sampling produced a largely representative sample of the target population for most variables. The exceptions were an underrepresentation of younger men, men of higher socioeconomic status, men of unknown HIV status, and men with an unknown number of sexual partners in both samples, and an overrepresentation of Catholics in the small sample. The most plausible reason for sample bias by age is that younger men were not considered to be heads of household. The most plausible reason for sample bias by socioeconomic status is that men of higher status were less attracted by the incentives. Men who refused to participate in the ongoing general population cohort were probably also more likely to have refused to participate in respondent-driven sampling, and this probably accounts, at least in part, for the underrecruitment of men of unknown HIV status or with an unknown number of sexual partners. These biases may increase the design effect of respondent-driven sampling. Neither of the respondent-driven sampling inference methods was designed to correct for these sources of bias.
The bias in recruitment by socioeconomic status is likely to be generalizable to most, if not all, respondent-driven sampling studies because different subgroups of the target population are likely to be differentially motivated by whatever incentives are offered. An “unknown” category for HIV status and other variables will not exist in most other respondent-driven sampling studies. The differential recruitment of persons in the population by willingness to participate in surveys is nevertheless likely to be a generalizable finding, but it is not limited to respondent-driven sampling. However, it is especially difficult to estimate the size of this bias using respondent-driven sampling data, as information on people who refuse to participate can be obtained only indirectly from the subset of recruiters who return to collect their secondary incentives. The bias in recruitment by age may not exist in other respondent-driven sampling studies, but this finding does highlight the challenge created when the community understands a definition of target-group membership differently from the researcher. As in this case, the bias may be quite subtle and difficult to detect. Quantification of the size of the bias would require triangulation with other sources of quantitative data, and the explanation for the bias may become clear only with qualitative data.
Overall, the sample proportions were closer to the population proportions than were the respondent-driven sampling estimates >60% of the time, for both sample sizes. Both RDS-1 and RDS-2 adjustments slightly increased the total root mean squared error compared with the sample proportions. The overall failure of the respondent-driven sampling inference methods to reduce bias is probably because the assumptions behind the respondent-driven sampling method were not met, and so the methods imperfectly accounted for the patterns of recruitment between subgroups (RDS-1) and differences in network size (RDS-1 and RDS-2). Recruitment was associated with characteristics other than network size. It is surprising that respondent-driven sampling inference methods increased bias more often than not. This occurred because when the respondent-driven sampling adjustments were in the right direction, they often greatly overcompensated. That is, the magnitude of the adjustment was often more than twice the size of the bias, so that after adjustment the respondent-driven sampling estimate was even further away from the population proportion in the other direction.
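The overcompensation described above can be illustrated with hypothetical numbers (these are not figures from the study):

```python
pop = 0.45       # true population proportion (hypothetical)
sample = 0.40    # unadjusted sample proportion: bias of 5 percentage points
adjusted = 0.52  # adjustment of +12 points, more than twice the bias

bias_before = abs(sample - pop)    # 5 points below the truth
bias_after = abs(adjusted - pop)   # now 7 points above the truth
# an adjustment in the right direction, yet the error has grown
```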
The reason that the 95% confidence intervals included the population proportions substantially <95% of the time may be that the confidence intervals were too narrow (as has been suggested in another study9), that the respondent-driven sampling estimates were biased, or both.
There are at least 4 potential limitations to our study. First, empirical evaluation of respondent-driven sampling is problematic. The representative or total population data that are required for robust evaluation are generally unavailable on the hidden and stigmatized groups that respondent-driven sampling is most commonly used to survey. We evaluated respondent-driven sampling in a nonhidden/nonstigmatized population of male household heads because of the availability of high-quality total population data. This may limit the generalizability of our results. However, it may also be a best-case scenario for an empirical evaluation of respondent-driven sampling. Respondent-driven sampling data on hidden and stigmatized populations may suffer from higher levels of bias than our sample. If respondent-driven sampling estimators are as unsuccessful at reducing this bias as our findings suggest, then estimates on hidden populations may be even less representative than ours.
Second, the findings of this study are based on only one respondent-driven sampling sample, and the biases that we observed in the sample proportions could have arisen by chance. However, the differences between the population and sample proportions were highly unlikely to have occurred by chance (P ≤ 0.0001 for all differences except the underrepresentation of men in the highest socioeconomic group, where P = 0.04). In addition, in each case where we identified a likely bias, the qualitative data suggested a plausible reason why the bias occurred.
Third, although we ordered the network-size questions so that the first to be asked was similar to the question asked in most respondent-driven sampling studies,25,26 statements made by respondent-driven sampling interviewers during the qualitative study suggested that the more detailed network questions may have caused later recruits to underreport network size so that the interview could be conducted in less time. However, sensitivity analysis found no evidence that collecting detailed network data reduced the performance of the respondent-driven sampling estimators. Therefore, we believe that our results and conclusions are robust to this potential limitation.
Finally, our decision not to offer all recruits the chance to recruit others, to slow the rate of recruitment, could have biased the results. However, in general, the respondent-driven sampling sample estimates were representative of the population proportions, and where they were not, plausible explanations were identified for these biases. Our results and conclusions are therefore likely to be robust to this limitation as well.
In line with other studies, our study showed that respondent-driven sampling was an effective data-collection method.10,32 However, our data suggest that current respondent-driven sampling statistical-inference methods can fail, and that the confidence intervals may be too narrow. Whether the data required to reliably remove bias and measure precision can be collected in a respondent-driven sampling survey remains unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling, and caution is required when interpreting respondent-driven sampling study findings.
Further empirical studies should investigate the size of biases in respondent-driven sampling studies in other populations, particularly in those rare examples of hidden/stigmatized populations on which representative data might be available. In addition, the effect of these biases on both simple and adjusted estimates should be investigated using simulations of respondent-driven sampling recruitment, and theoretical work should attempt to develop improved point and interval estimators.
We thank the study participants and staff at the MRC/UVRI Uganda Research Unit on AIDS and Alice Martineau, without whom this study would not have been possible.
1. Anderson R, May R. Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press, 1991.
2. Magnani R, Sabin K, Saidel T, Heckathorn D. Review of sampling hard-to-reach and hidden populations for HIV surveillance. AIDS. 2005;19(suppl 2):S67–S72.
3. Heckathorn DD. Respondent-driven sampling: a new approach to the study of hidden populations. Soc Probl. 1997;44:174–199.
4. Salganik MJ, Heckathorn DD. Sampling and estimation in hidden populations using respondent-driven sampling. Socio Meth. 2004;34:193–240.
5. Heckathorn DD. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. Soc Probl. 2002;49:11–34.
6. Volz E, Wejnert C, Degani I, Heckathorn D. Respondent-Driven Sampling Analysis Tool (RDSAT), Version 6.0.1. Ithaca, NY: Cornell University; 2007.
7. Heckathorn DD. Extensions of respondent-driven sampling: analyzing continuous variables and controlling for differential recruitment. Socio Meth. 2007;37:151–207.
8. Volz E, Heckathorn D. Probability based estimation theory for respondent driven sampling. J Off Stat. 2008;24:79–97.
9. Goel S, Salganik MJ. Assessing respondent-driven sampling. Proc Natl Acad Sci U S A. 2010;107:6743–6747.
10. Malekinejad M, Johnston L, Kendall C, Kerr L, Rifkin M, Rutherford G. Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: a systematic review. AIDS Behav. 2008;12(suppl 4):S105–S130.
11. Salganik MJ. Variance estimation, design effects, and sample size calculations for respondent-driven sampling. J Urban Health. 2006;83(6 suppl):i98–i112.
12. Gile KJ, Handcock MS. Respondent-driven sampling: an assessment of current methodology. Socio Meth. 2010;40:285–327.
13. Platt L, Wall M, Rhodes T. Methods to recruit hard-to-reach groups: comparing two chain referral sampling methods of recruiting injecting drug users across nine studies in Russia and Estonia. J Urban Health. 2006;83:39–53.
14. Robinson WT, Risser JM, McGoy S. Recruiting injection drug users: a three-site comparison of results and experiences with respondent-driven and targeted sampling procedures. J Urban Health. 2006;83(6 suppl):i29–i38.
15. Burt RD, Hagan H, Sabin K, Thiede H. Evaluating respondent-driven sampling in a major metropolitan area: comparing injection drug users in the 2005 Seattle area national HIV behavioral surveillance system survey with participants in the RAVEN and Kiwi studies. Ann Epidemiol. 2010;20:159–167.
16. Abdul-Quader AS, Heckathorn DD, McKnight C. Effectiveness of respondent-driven sampling for recruiting drug users in New York City: findings from a pilot study. J Urban Health. 2006;83:459–476.
17. Kendall C, Kerr L, Gondim RC. An empirical comparison of respondent-driven sampling, time location sampling, and snowball sampling for behavioral surveillance in men who have sex with men, Fortaleza, Brazil. AIDS Behav. 2008;12:97–104.
18. Johnston L, Trummal A, Lohmus L, Ravalepik A. Efficacy of convenience sampling through the internet versus respondent driven sampling among males who have sex with males in Tallinn and Harju County, Estonia: challenges reaching a hidden population. AIDS Care. 2009;21:1195.
19. Ramirez-Valles J, Heckathorn DD, Vazquez R, Diaz RM, Campbell RT. From networks to populations: the development and application of respondent-driven sampling among IDUs and Latino gay men. AIDS Behav. 2005;9:387–402.
20. Ma X, Zhang Q, He X. Trends in prevalence of HIV, syphilis, hepatitis C, hepatitis B, and sexual risk behavior among men who have sex with men. Results of 3 consecutive respondent-driven sampling surveys in Beijing, 2004 through 2006. J Acquir Immune Defic Syndr. 2007;45:581–587.
21. Wejnert C, Heckathorn DD. Web-Based network sampling: efficiency and efficacy of respondent-driven sampling for online research. Socio Meth Res. 2008;37:105–134.
22. Wejnert C. An empirical test of respondent-driven sampling: point estimates, variance, degree measures, and out-of-equilibrium data. Socio Meth. 2009;39:73–116.
23. Shafer LA, Biraro S, Nakiyingi-Miiro J. HIV prevalence and incidence are no longer falling in southwest Uganda: evidence from a rural population cohort 1989–2005. AIDS. 2008;22:1641–1649.
24. Kamali A, Carpenter LM, Whitworth JA, Pool R, Ruberantwari A, Ojwiya A. Seven-year trends in HIV-1 infection rates, and changes in sexual behaviour, among adults in rural Uganda. AIDS. 2000;14:427–434.
25. McCarty C, Killworth PD, Bernard HR, Johnsen EC, Shelley GA. Comparing two methods for estimating network size. Hum Org. 2001;60:28–39.
26. McCormick T, Salganik M, Zheng T. How many people do you know?: Efficiently estimating personal network size. J Am Stat Assoc. 2010;105:59–70.
27. StataCorp. Stata Statistical Software: Release 11.0. College Station, TX: Stata Press; 2010.
28. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: http://www.R-project.org.
29. Gansner ER, North SC. An open graph visualization system and its applications to software engineering. Softw Pract Exper. 1999;S1:1–5.
30. Kirkwood BR, Sterne JA. Essential Medical Statistics. Oxford, UK: Wiley-Blackwell; 2003.
31. McCreesh N, Johnston L, Copas A, et al. Evaluation of the role of location and distance in recruitment in respondent-driven sampling. Int J Health Geogr. 2011;10:56.
32. Frost SD, Brouwer KC, Firestone Cruz MA. Respondent-driven sampling of injection drug users in two U.S.-Mexico border cities: recruitment dynamics and impact on estimates of HIV and syphilis prevalence. J Urban Health. 2006;83(6 suppl):i83–i97.
Copyright © 2012 Wolters Kluwer Health, Inc. All rights reserved.