Evaluation of Respondent-driven Sampling : Epidemiology


Evaluation of Respondent-driven Sampling

McCreesh, Nicky; Frost, Simon D. W.; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N.; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L.; Maher, Dermot; Johnston, Lisa G.; Sonnenberg, Pam; Copas, Andrew J.; Hayes, Richard J.; White, Richard G.

Epidemiology 23(1):p 138-147, January 2012. | DOI: 10.1097/EDE.0b013e31823ac17c



Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data.


Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample).


We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%–37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%–74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion.


Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required when interpreting findings based on the sampling method.

Hidden or hard-to-reach population subgroups are often key to the maintenance of infectious diseases in human populations.1 However, it is often difficult to investigate the factors that drive transmission in these groups by using representative samples because there may not be an adequate sampling frame or the groups may be associated with illicit activity or subject to stigma. Researchers have therefore typically resorted to various types of convenience sampling to gather data on hidden populations.2 Although convenience sampling has its advantages, this approach is unable to generate unbiased population-based estimates of infection prevalence and risk factors.

In an attempt to address these limitations, respondent-driven sampling (a variant of a link-tracing design) was proposed in 1997.3 With this approach, a small number of “seed” respondents are selected by convenience sampling or other methods. Then, these initial recruits are given coupons (typically, 3 coupons) to recruit others from the target population, who in turn become recruiters. Recruits are given an incentive (usually money) for taking part in the survey and then for recruiting others. This process continues in recruitment “waves” until a predetermined sample size is reached, or until the distribution of participant characteristics (such as the proportion infected) becomes similar between waves (called “reaching equilibrium” in respondent-driven sampling terminology). Estimation methods are then applied to account for the nonrandom sample selection in an attempt to generate unbiased estimates for the target population. For this approach to be successful, the target population must be socially well connected.

Two main estimation methods are generally used. The RDS-1 estimator, currently in wide use, can be implemented with the standard respondent-driven sampling analysis software.3–6 RDS-1 accounts for patterns of recruitment between subgroups and the average number of other members of the target group whom the recruiters know (the “network size”) in each subgroup.5,7 RDS-2 is a more recently developed estimator that relates respondent-driven sampling estimation to widely used survey estimation through the use of a generalized Horvitz–Thompson estimator.8 RDS-2 accounts for network size only.8 Initial theoretical analysis has asserted that the RDS-2 estimator is asymptotically unbiased as long as 6 key assumptions are met, including that respondents accurately report the size of their “network” (the number of other members of the target group they know), that respondents randomly recruit from their network, and that respondents have reciprocal relationships with members of the target population.8
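As an illustration of how an estimator that accounts for network size only can work, the following is a minimal sketch of an inverse-degree weighted proportion in the spirit of RDS-2. It is not the code used in this study. The function name `rds2_proportion` and the toy data are hypothetical, and the sketch assumes every respondent reports a positive network size.

```python
from typing import Sequence


def rds2_proportion(in_group: Sequence[bool], degrees: Sequence[float]) -> float:
    """Inverse-degree weighted proportion (an RDS-2 style sketch).

    Each respondent is weighted by 1 / reported network size, so that
    respondents with many contacts (who are more likely to be recruited)
    count for less.
    """
    if len(in_group) != len(degrees):
        raise ValueError("in_group and degrees must have the same length")
    if not degrees:
        raise ValueError("at least one respondent is required")
    if any(d <= 0 for d in degrees):
        raise ValueError("reported network sizes must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_in_group = sum(w for w, g in zip(weights, in_group) if g)
    return weighted_in_group / sum(weights)


# Hypothetical toy data: three well-connected recruits belong to the group,
# two less-connected recruits do not.
toy_in_group = [True, True, True, False, False]
toy_degrees = [20, 20, 20, 5, 5]
toy_estimate = rds2_proportion(toy_in_group, toy_degrees)
```

In this toy example the unweighted sample proportion is 3/5, whereas the weighted estimate is lower because the group members report larger networks. When network sizes are similar across groups, the weights barely change the estimate.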

A recent study simulating respondent-driven sampling and using empirical network data found that the variance of respondent-driven sampling estimates can be much higher than commonly assumed.9 Nevertheless, respondent-driven sampling has rapidly become a popular and widely used survey method. Outside of the United States, >120 respondent-driven sampling studies, involving >30,000 participants, have been published,10 and respondent-driven sampling is currently being used to provide data for public-health decision-making by major funding bodies such as the US Centers for Disease Control and Prevention.

Despite its popularity, it is not known whether respondent-driven sampling can generate unbiased estimates. This is primarily because the robust evaluation of respondent-driven sampling is methodologically challenging. By definition, gold-standard representative or total population data are generally unavailable or are of poor quality for hidden/stigmatized groups. Other methods of evaluation that have been attempted include in silico studies,4,8,11,12 comparisons of respondent-driven-sampling data with data from other convenience samples,13–19 comparisons of serial cross-sectional respondent-driven sampling estimates on the same population over time,20 and comparisons of Internet-collected respondent-driven sampling data with a population that has known characteristics.21,22 Although all these studies have provided valuable information on respondent-driven sampling, none provides a robust assessment of whether respondent-driven sampling could produce unbiased estimates, either because the required gold-standard comparison population was unavailable or because an Internet-based respondent-driven sampling data collection method was used, whereas most respondent-driven sampling studies use face-to-face data collection.10

We evaluated respondent-driven sampling by comparing field-collected respondent-driven sampling data with total-population data on the same population. Although the representative or total-population data required for such a comparison are generally unavailable, we dealt with this problem by evaluating respondent-driven sampling in a nonhidden/nonstigmatized population for which high-quality total-population data were available. This also allowed us to perform a range of analyses that are not possible in typical respondent-driven sampling studies.


To evaluate whether respondent-driven sampling can generate representative data, we compared estimates from a respondent-driven sampling survey of a rural Ugandan population with total population data. The data used to define the target population were available from an ongoing general population cohort of 25 villages in rural Uganda covering an area of approximately 38 km² 23,24 (Fig. 1). Each year, households in the study villages are mapped and, after obtaining consent, a total population household census and an individual questionnaire and HIV-1 serosurvey are administered. The target population consisted of 2402 men who were recorded as a male head of a household within these villages between February 2009 and January 2010 (Fig. 1). The characteristics of the target population are shown in the Table (population proportion).

Figure 1:
Map of study area showing location of target population and seed households and respondent-driven sampling interview sites. Colors are used to represent households in different villages. Each village has been labeled with a letter for confidentiality.
Table:
Population Proportions, Sample Proportions, and RDS-1 and RDS-2 Estimates With 95% Confidence Intervals (CIs), for the Full and Small Samples

To maximize the generalizability of our results, we applied currently used respondent-driven sampling data collection methods wherever possible.10 Ten “seeds” (of varying village, age and tribe) were selected by convenience from the target population. Figure 1 shows their locations and eTable 1 (https://links.lww.com/EDE/A529) summarizes their characteristics. Seeds and subsequent recruits were given 3 coupons to recruit other men into the study. The rate of early recruitment was high, and the number of people arriving each day for interviews became too large to be manageable. Because of this, between days 9 and 32, the probability of each recruit being offered 3 coupons was halved from 100% to 50%; other recruits received none. As incentives for participation and recruitment, seeds and recruits were offered soap, salt, or school books to the value of approximately US $1. One incentive was offered for completing the first interview and another for each person successfully recruited.

Respondent-driven sampling estimation requires information on how many other household heads each participant could potentially recruit. The primary network-size definition was created to be comparable with other respondent-driven sampling studies25,26 and was used here unless otherwise stated. Recruits were first asked the core question “How many men do you know who (i) were head of a household in the last 12 months in any of the Medical Research Council villages, and (ii) you know them and they know you, and (iii) you have seen them in the past week.” More detailed network data were also collected (eAppendix [methods], https://links.lww.com/EDE/A529).

Preprocessing of data was performed using Stata v11 (StataCorp, TX).27 Networks and “trees” were generated using scripts written in Stata and R v2.12.0 (R Foundation, Vienna, Austria)28 and visualized using GraphViz (AT&T Research, NJ).29 To maximize the comparability of our methods with those used in a typical respondent-driven sampling study, we analyzed the dataset following current respondent-driven sampling definitions and the statistical inference methods used in RDSAT v6.0.1, the custom-written software package for the analysis of respondent-driven sampling studies6 (ie, the RDS-1 point estimator3–5 and the bootstrap 95% confidence interval [CI] estimator11). We also analyzed the dataset using the more recently developed point estimator RDS-2 and the same bootstrap 95% CI estimator,11 employing R. Simple respondent-driven sampling sample proportions and respondent-driven sampling estimates were calculated for 2 sample sizes. The first was the “Full” sample (n = 927 including the 10 seeds). The second was a “Small” sample consisting of the first 250 recruits (including the 10 seeds); this was chosen to be more typical of the sample sizes used in respondent-driven sampling studies.10

Root mean squared errors were calculated for the differences between the population proportions and the full and small sample proportions, and for the differences between the population proportions and the RDS-1 and RDS-2 estimates, for each variable and in total. For comparison with the RDS-1 and RDS-2 estimates, we used the true population proportions to calculate recruitment probabilities for the target population using predictions from a logistic regression model30 as weights. The variables shown in the Table were included in the model if they were significant at the P < 0.05 level.
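The root mean squared error comparison described above can be sketched as follows. This is an illustrative sketch only, not the authors' code. The function name `rmse` and the proportions are hypothetical, and the sketch assumes the two lists hold proportions for the same categories in the same order.

```python
import math
from typing import Sequence


def rmse(population: Sequence[float], estimates: Sequence[float]) -> float:
    """Root mean squared error between population proportions and estimates."""
    if len(population) != len(estimates) or not population:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - p) ** 2 for p, e in zip(population, estimates)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical proportions for three categories of one variable.
toy_population = [0.20, 0.50, 0.30]
toy_sample = [0.17, 0.50, 0.33]
toy_rmse = rmse(toy_population, toy_sample)
```

Computing the same quantity once for the sample proportions and once for each set of adjusted estimates gives a single summary of whether adjustment moved the estimates closer to, or further from, the population values.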

Sensitivity analyses were used to assess the robustness of our results to various network size definitions, potential network-size bias, and respondent-driven sampling sample size.

To compare network size of the whole target population with the respondent-driven sampling recruits, 300 men in the target population who had not been recruited in the respondent-driven sampling study were selected using simple random sampling to be interviewed using the first respondent-driven sampling questionnaire. Mean network size of the whole target population was estimated as the weighted average of the mean network size of respondent-driven sampling recruits and the mean network size of a simple random sample of eligible nonrecruits. T tests were used to test for differences between means. To help understand the quantitative study findings, 54 members of the population in the study villages and Medical Research Council staff were selected using random or purposive sampling for qualitative interview. Full details are shown in the eAppendix (Methods, https://links.lww.com/EDE/A529).
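The weighted average of the two group means can be sketched as below. The weighting by group size is an assumption about how such a pooled mean would typically be formed, and the function name `pooled_mean` and the two mean values are hypothetical. The group sizes use the 927 recruits and the 2402 members of the target population reported in this paper, assuming all recruits belong to the target population.

```python
def pooled_mean(mean_recruits: float, n_recruits: int,
                mean_nonrecruits: float, n_nonrecruits: int) -> float:
    """Size-weighted average of two group means."""
    total = n_recruits + n_nonrecruits
    if n_recruits < 0 or n_nonrecruits < 0 or total == 0:
        raise ValueError("group sizes must be non-negative and not both zero")
    return (n_recruits * mean_recruits
            + n_nonrecruits * mean_nonrecruits) / total


# Group sizes from the paper; the two means are hypothetical placeholders.
n_recruits = 927
n_nonrecruits = 2402 - 927
toy_population_mean = pooled_mean(12.0, n_recruits, 8.0, n_nonrecruits)
```

Because nonrecruits outnumber recruits, the pooled mean lies closer to the nonrecruit mean than to the recruit mean.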



The dynamics of the respondent-driven sampling survey recruitment are shown in Figure 2, and the recruitment networks from each seed are shown in Figure 3. A total of 1141 people (including the 10 seeds) were assessed for eligibility during a period of 54 days (8 March–30 April 2010). No new coupons were distributed after day 47. One hundred ninety-six men attended but were ineligible; 16 were eligible but had already been recruited; 2 were eligible but did not give consent; and 927 were eligible, consented, and were recruited. A video illustrating recruitment in space and time is provided online (https://links.lww.com/EDE/A542), and recruitment in space is explored more fully in another paper by N.M.31 An approximately linear recruitment rate was achieved in the respondent-driven sampling survey (Fig. 2A), due, in part, to changes in the probability of each recruit being offered coupons during the survey. All 10 seeds recruited people into the study, with one seed recruiting one person, 4 recruiting 2 people, and 5 recruiting 3 people. The total number of recruits originating from each seed ranged from 8 to 241 (1%–26% of the full sample) (Fig. 2B). In all, 77% of the total recruitment was from 4 seeds. Full details of the seeds and recruitment by seed are given in eTable 1 (https://links.lww.com/EDE/A529). The number of waves ranged from 3 to 16 for the full sample and from 2 to 6 for the small sample. The highest recruitment occurred in wave 5 (12% of all recruits, excluding seeds), and 57% of recruitments occurred in waves 4–8 (Fig. 2C). In all, 81% of recruits (including the recruits of seeds) were interviewed within 7 days of their recruiter's interview (Fig. 2D).

Figure 2:
Summary of the dynamics of respondent-driven sampling survey recruitment. A, The cumulative number of recruits over time (including seeds). B, The total number of recruits per seed (excluding seeds). C, The number of recruits by wave and seed (including seeds). D, The number of days between recruiters' interview and their recruits' first interview. E, The number of recruits per recruiter, overall, and by whether the recruiters returned for incentive collection (including seeds). F, The proportion of recruit's network who had already been recruited at the time of their interview (using network size definition NS-5, including seeds).
Figure 3:
Recruitment networks showing HIV infection status, by seed. Seeds are shown at the top of each recruitment network. Symbol area is proportional to network size. HIV serostatus is shown by shading: black indicates HIV-positive; white, HIV-negative; gray, HIV status unknown. HIV status omitted for seeds for confidentiality.

Overall, 75% of recruits (including seeds) (n = 684) were offered coupons to recruit others, and of these, 90% (n = 612) accepted (called “recruiters”). Sixty-six percent of recruiters (n = 401) returned to take part in a second interview and to collect their secondary incentives. A similar proportion of recruiters (including seeds) recruited 0, 1, 2, or 3 recruits (Fig. 2E, left bar). Recruits who returned to collect secondary incentives were more likely to have recruited (Fig. 2E, middle and right bar). The proportion of the recruit's network already recruited at the time of their interview increased rapidly during the survey (Fig. 2F; includes seeds). The average number of recruits per recruiter (including seeds) decreased from 2.6 in the first week of the study to 0.6 in the last week that coupons were given out. Only 30% of recruits had been named as a contact by their recruiter (and identified) at their recruiter's first interview.

In the simple random sample survey, 55% (164/300) of men selected were interviewed (4–28 May 2010; eAppendix [Simple random sample survey], https://links.lww.com/EDE/A529). In the qualitative survey, 98% (53/54) of people selected were interviewed (16 June–19 October 2010; eAppendix [Qualitative survey], https://links.lww.com/EDE/A529, also N.M., unpublished data, 2011).

The target population was well connected. Data from the respondent-driven sampling and simple random sampling surveys showed that at least 73% were linked in a single network (eAppendix [Methods], https://links.lww.com/EDE/A529). The distribution of the reported network size of respondent-driven sampling recruits, based on the primary definition of network size (NS-1), was approximately normal but with a slight positive skew, and showed likely overreporting of multiples of 5 (eFigure 1, https://links.lww.com/EDE/A529; excluding seeds). The distributions of the other network size measures (as defined in the eAppendix [Methods], https://links.lww.com/EDE/A529) were very similar, with the exception of definition NS-5, which showed a smaller proportion of larger network sizes because it was a subset of NS-4 (eFigure 2, https://links.lww.com; including seeds). Pearson correlations between different network size definitions reported by respondent-driven sampling recruits varied between 0.96 (NS-1 vs. NS-2) and 0.75 (NS-1 vs. NS-5) (eTable 2, https://links.lww.com/EDE/A529; including seeds). The mean network size (NS-1) of respondent-driven sampling recruits (including seeds) was higher than that of the whole target population (12.1 vs. 9.2, P < 0.001) (eFigure 3, https://links.lww.com/EDE/A529). The number of times members of the target population were reported to be in the network of recruits ranged between 0 and 42 (eFigure 4, https://links.lww.com/EDE/A529).

There was high within-group recruitment (homophily) by religion, tribe, and village and in the highest socioeconomic status groups, but not by age, sexual activity, or HIV status, or within the other socioeconomic status groups (eTables 3 and 4, https://links.lww.com/EDE/A529). There was no evidence of low within-group recruitment for any characteristic, ie, of men preferentially recruiting men who differed from themselves. Comparing actual recruitment proportions with expected recruitment proportions calculated from individual-level network data, there was evidence of nonrandom recruitment by age, tribe, socioeconomic status, village, and sexual activity (eAppendix [Supporting results “Recruitment pattern” section and eTables 5 and 6], https://links.lww.com/EDE/A529).

The other RDS-2 estimator assumptions8 were not met. In common with current practice for all respondent-driven-sampling studies, respondents were not limited to recruiting only one other person, and recruited persons were ineligible for rerecruitment. It is likely that only a low proportion of the relationships between members of the target population were reciprocated and/or the population may not have accurately reported their network size, as only 30% of recruits were mentioned by their recruiter during the recruiter's first interview.

Comparison With Target Population Data

The Table shows the comparison between the population proportions, sample proportions, and RDS-1 and RDS-2 estimates, for the full and small sample. The sample proportions were often similar to population proportions, with the following exceptions. In both samples, younger men (<30 years) were underrepresented and older men (≥40 years) were overrepresented. In the small sample, Catholics were overrepresented. In both samples, men in the highest socioeconomic group were underrepresented, and men in the lowest socioeconomic group were overrepresented. The proportions of men with unknown numbers of sexual partners or unknown HIV status were underrepresented in both samples. It is unlikely that the differences between the population and sample proportions occurred by chance (P ≤ 0.0001 for all except P = 0.04 for the highest socioeconomic status group using the small sample).

Respondent-driven sampling inference methods generally failed to reduce bias where it occurred. Adjustment resulted in an improved estimate of the population proportion in only 37% (19/52) of comparisons using RDS-1 and 33% (15/52) using RDS-2 for the full sample, and 31% (8/26) using RDS-1 and 37% (18/49) using RDS-2 for the small sample. Based on these estimates, the 95% bootstrap confidence intervals included the target population proportion in 69% (36/52) of comparisons using RDS-1 and 50% (13/26) using RDS-2 for the full sample, and 69% (18/26) using RDS-1 and 74% using RDS-2 for the small sample.
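The two summary measures reported above (the share of comparisons in which adjustment moved the estimate closer to the population proportion, and the share of confidence intervals that contained the population proportion) can be sketched as follows. This is a hedged illustration rather than the authors' analysis code. The function names and all numbers are hypothetical.

```python
from typing import Sequence, Tuple


def share_improved(population: Sequence[float], sample: Sequence[float],
                   adjusted: Sequence[float]) -> float:
    """Fraction of comparisons where the adjusted estimate is strictly
    closer to the population proportion than the raw sample proportion."""
    if not (len(population) == len(sample) == len(adjusted)) or not population:
        raise ValueError("inputs must be non-empty and of equal length")
    closer = sum(abs(a - p) < abs(s - p)
                 for p, s, a in zip(population, sample, adjusted))
    return closer / len(population)


def ci_coverage(population: Sequence[float],
                intervals: Sequence[Tuple[float, float]]) -> float:
    """Fraction of intervals (lower, upper) containing the population value."""
    if len(population) != len(intervals) or not population:
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(lo <= p <= hi for p, (lo, hi) in zip(population, intervals))
    return covered / len(population)


# Hypothetical values for four categories.
toy_population = [0.30, 0.50, 0.20, 0.10]
toy_sample = [0.25, 0.52, 0.20, 0.14]
toy_adjusted = [0.28, 0.45, 0.23, 0.15]
toy_intervals = [(0.24, 0.32), (0.40, 0.49), (0.18, 0.29), (0.09, 0.22)]

toy_improved = share_improved(toy_population, toy_sample, toy_adjusted)
toy_coverage = ci_coverage(toy_population, toy_intervals)
```

With well-calibrated 95% intervals and unbiased estimates, the coverage share would be expected to be close to 0.95 across many comparisons.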

The root mean squared error for the difference between the population proportions and the sample proportions was 6% for the full sample. The root mean squared errors for the difference between the population proportions and the respondent-driven sampling estimates for the full sample were 7% for both RDS-1 and RDS-2 (eTable 7, https://links.lww.com/EDE/A529). Root mean squared errors were slightly larger for the small sample.

In general, if the respondent-driven sampling adjustments did not improve the estimates, the adjustments were small and did not add substantial bias. The exception to this was the variable village. Due to the large number of subgroups for “village,” however, the sample size was not sufficiently large to reliably estimate the parameters used to make RDS-1 adjustments.

In comparison, using the predictions from the logistic regression models as recruitment probability weights, adjustment improved the estimate of the target population proportion for 88% (46/52) of the full sample estimates, and for 57% (28/49) of the small sample estimates (eTable 6, https://links.lww.com/EDE/A529), showing that recruitment was associated with characteristics other than network size.

For specific cases in which the sample estimates of population proportions were biased, current respondent-driven sampling inference methods generally failed to reduce bias. For age group, using either the RDS-1 or the RDS-2 estimator, only 2 of 5 estimates were closer to the population proportion when applied to the full sample, and only 1 of 4 estimates was closer when applied to the small sample. Neither RDS-1 nor RDS-2 improved the overrepresentation of Catholics in the small sample, the overrepresentation of the lowest socioeconomic group in the full sample, the underrepresentation of the highest socioeconomic group in either sample, or the underrepresentation of men with unknown number of sexual partners in either sample. Applying RDS-2 to the full sample very slightly reduced the underrepresentation of men with unknown HIV status. Applying RDS-2 to the small sample or RDS-1 to either sample slightly increased the underrepresentation of men with unknown HIV status.

Respondent-driven-sampling inference methods failed to reduce bias because groups tended to be under- or overrecruited by all groups, rather than being underrecruited by some groups and overrecruited by other groups (limiting the ability of RDS-1 to improve estimates), and because underrepresented groups tended not to have markedly smaller network sizes (limiting the ability of RDS-1 and RDS-2). For example, men aged ≥50 years were overrecruited by all age groups, and network sizes in all age groups were relatively similar (eTable 3, https://links.lww.com/EDE/A529). Therefore, neither RDS-1 nor RDS-2 improved the estimates.

Qualitative data suggested possible explanations for these findings. Recruiters did not consider younger unmarried men to be household heads, in contrast with the definition used in the ongoing general population cohort (“… they were being left out because some of the older men didn't take them as household heads because they didn't have any wives” [45-year-old respondent-driven sampling recruit]). The respondent-driven sampling incentives were likely to be a greater incentive to men in lower socioeconomic groups (“… the token might look small to some people and big to others.” [42-year-old female community member]). The underrecruitment of men with unknown number of sexual partners or unknown HIV status was likely, at least in part, to be because men who had refused to participate in the ongoing general population cohort in the past were also less likely to participate in the respondent-driven sampling study.

There was very little difference in the performance of the respondent-driven sampling estimators when different network size definitions were used (eAppendix [Results], https://links.lww.com/EDE/A529). There was no evidence that collecting detailed network size data reduced the performance of the respondent-driven sampling estimators (eAppendix [Results], https://links.lww.com/EDE/A529).


In our study, recruitment by respondent-driven sampling produced a largely representative sample of the target population for most variables. The exceptions were an underrepresentation of men who were younger, men of higher socioeconomic status, men of unknown HIV status, and men with unknown number of sexual partners in both samples, and an overrepresentation of Catholics in the small sample. The most plausible reason for sample bias by age is that younger men were not considered to be heads of household. The most plausible reason for sample bias by socioeconomic status is that men of higher status were less attracted by the incentives. Men who refused to participate in the ongoing general population cohort were probably more likely to also have refused to participate in respondent-driven sampling and that was probably at least partially responsible for the underrecruitment of men of unknown HIV status or with an unknown number of sexual partners. These biases may increase the design effect of respondent-driven sampling. Neither of the respondent-driven sampling inference methods was designed to correct for these sources of bias.

The bias in recruitment by socioeconomic status is likely to be generalizable to most, if not all, respondent-driven sampling studies because different subgroups of the target population are likely to be differentially motivated by whatever incentives are offered. An “unknown” category for HIV status and other variables will not exist in most other respondent-driven sampling studies. The differential recruitment of persons in the population by willingness to participate in surveys is nevertheless likely to be a generalizable finding, but it is not limited to respondent-driven sampling. However, it is especially difficult to estimate the size of this bias using respondent-driven sampling data, as information on people who refuse to participate can be obtained only indirectly from the subset of recruiters who return to collect their secondary incentives. The bias in recruitment by age may not exist in other respondent-driven sampling studies, but this finding does highlight the challenge created when the community understands a definition of target-group membership differently from the researcher. As in this case, the bias may be quite subtle and difficult to detect. Quantification of the size of the bias would require triangulation with other sources of quantitative data, and the explanation for the bias may become clear only with qualitative data.

Overall, the sample proportions were closer to the population proportions than were the respondent-driven sampling estimates >60% of the time, for both sample sizes. Both RDS-1 and RDS-2 adjustments slightly increased the total root mean squared error compared with the sample proportions. The overall failure of the respondent-driven sampling inference methods to reduce bias is probably because the assumptions behind the respondent-driven sampling method were not met, and so the methods imperfectly accounted for the patterns of recruitment between subgroups (RDS-1) and differences in network size (RDS-1 and RDS-2). Recruitment was associated with characteristics other than network size. It is surprising that respondent-driven sampling inference methods increased bias more often than not. This occurred because when the respondent-driven sampling adjustments were in the right direction, they often greatly overcompensated. That is, the magnitude of the adjustment was often more than twice the size of the bias, so that after adjustment the respondent-driven sampling estimate was even further away from the population proportion in the other direction.
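The overcompensation argument can be made explicit with a short worked inequality. The symbols are introduced here for illustration and are not the paper's notation: $p$ is the population proportion, $s$ the sample proportion, $b = s - p$ the sample bias, and $a \ge 0$ the size of an adjustment made in the direction of $p$.

```latex
\hat{p} = s - \operatorname{sign}(b)\, a
\quad\Longrightarrow\quad
\lvert \hat{p} - p \rvert = \bigl\lvert\, \lvert b \rvert - a \,\bigr\rvert ,
\qquad
\bigl\lvert\, \lvert b \rvert - a \,\bigr\rvert > \lvert b \rvert
\iff a > 2\lvert b \rvert .
```

So an adjustment in the right direction still leaves the estimate further from the population proportion whenever it is more than twice the size of the bias. For instance, with a hypothetical bias of 0.02 and an adjustment of 0.05, the error after adjustment is 0.03.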

The 95% confidence intervals included the population proportions substantially <95% of the time. This may be because the CIs are too narrow (as has been suggested in another study9), because the respondent-driven sampling estimates were biased, or both.

There are at least 4 potential limitations to our study. First, empirical evaluation of respondent-driven sampling is problematic. The representative or total population data that are required for robust evaluation are generally unavailable on the hidden and stigmatized groups that respondent-driven sampling is most commonly used to survey. We evaluated respondent-driven sampling in a nonhidden/nonstigmatized population of male household heads because of the availability of high-quality total population data. This may limit the generalizability of our results. However, it may also be a best-case scenario for an empirical evaluation of respondent-driven sampling. Respondent-driven sampling data on hidden and stigmatized populations may suffer from higher levels of bias than our sample. If respondent-driven sampling estimators are as unsuccessful at reducing this bias as our findings suggest, then estimates on hidden populations may be even less representative than ours.

Second, the findings of this study are based on only one respondent-driven sampling sample, and the biases that we observed in the sample proportions could have arisen by chance. However, the differences between the population and sample proportions were highly unlikely to have occurred by chance (P ≤ 0.0001 for all differences except the underrepresentation of men in the highest socioeconomic group, where P = 0.04). In addition, in each case where we identified a likely bias, the qualitative data suggested a plausible reason why the bias occurred.

Third, although we ordered the network-size questions so that the first to be asked was similar to the question asked in most respondent-driven sampling studies,25,26 statements made by respondent-driven sampling interviewers during the qualitative study suggested that the more detailed network questions may have caused later recruits to underreport network size so that the interview could be conducted in less time. However, sensitivity analysis found no evidence that collecting detailed network data reduced the performance of the respondent-driven sampling estimators. Therefore, we believe that our results and conclusions are robust to this potential limitation.

Finally, our decision to slow the rate of recruitment by not offering all recruits the chance to recruit others could have biased the results. However, in general, the respondent-driven sampling sample was representative of the population, and where it was not, plausible explanations for the biases were identified. Our results and conclusions are therefore likely to be robust to this limitation as well.

In line with other studies, our study showed that respondent-driven sampling was an effective data-collection method.10,32 However, our data suggest that the current respondent-driven sampling statistical-inference methods can fail to reduce bias, and that their confidence intervals may be too narrow. Whether the data required to reliably remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling, and caution is required when interpreting respondent-driven sampling study findings.

Further empirical studies should investigate the size of biases in respondent-driven sampling studies in other populations, particularly in those rare examples of hidden/stigmatized populations on which representative data might be available. In addition, the effect of these biases on both simple and adjusted estimates should be investigated using simulations of respondent-driven sampling recruitment, and theoretical work should attempt to develop improved point and interval estimators.
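
A simulation of the kind suggested above could be sketched as follows. This is a minimal sketch under stated assumptions, not a method used in this study: the network model (trait-positive members form more ties), every parameter value, and all function names are hypothetical. The adjusted estimate weights each recruit by the inverse of network size, in the manner of the estimator of Volz and Heckathorn,8 and uses the true simulated degree, whereas a real survey relies on self-reported network size.

```python
import random

def make_network(n, p_trait, rng, base_links=3, extra_links=3):
    """Random undirected network; trait-positive members start more ties (assumption)."""
    trait = [rng.random() < p_trait for _ in range(n)]
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        k = base_links + (extra_links if trait[i] else 0)
        for j in rng.sample(range(n), k):
            if j != i:                      # ignore self-ties
                neighbours[i].add(j)
                neighbours[j].add(i)
    return trait, neighbours

def rds_sample(neighbours, n_seeds, coupons, target, rng):
    """Chain-referral recruitment without replacement, breadth-first from random seeds."""
    seeds = rng.sample(range(len(neighbours)), n_seeds)
    sampled, seen, queue = list(seeds), set(seeds), list(seeds)
    while queue and len(sampled) < target:
        recruiter = queue.pop(0)
        candidates = sorted(j for j in neighbours[recruiter] if j not in seen)
        rng.shuffle(candidates)
        for j in candidates[:coupons]:      # each recruiter hands out at most `coupons`
            if len(sampled) >= target:
                break
            seen.add(j)
            sampled.append(j)
            queue.append(j)
    return sampled

def naive_proportion(sample, trait):
    """Unadjusted sample proportion with the trait."""
    return sum(trait[i] for i in sample) / len(sample)

def inverse_degree_proportion(sample, trait, neighbours):
    """Proportion with the trait, weighting each recruit by 1 / network size."""
    weights = [1.0 / len(neighbours[i]) for i in sample]
    with_trait = sum(w for w, i in zip(weights, sample) if trait[i])
    return with_trait / sum(weights)

rng = random.Random(2012)                   # fixed seed so the run is repeatable
trait, neighbours = make_network(2000, 0.30, rng)
sample = rds_sample(neighbours, n_seeds=5, coupons=2, target=300, rng=rng)
print("population proportion:", sum(trait) / len(trait))
print("sample proportion:    ", naive_proportion(sample, trait))
print("adjusted estimate:    ", inverse_degree_proportion(sample, trait, neighbours))
```

Repeating the recruitment step many times with different random seeds would give the distribution of the simple and adjusted estimates around the known population proportion, from which bias and interval coverage could be examined.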


We thank the study participants and staff at the MRC/UVRI Uganda Research Unit on AIDS and Alice Martineau, without whom this study would not have been possible.


1. Anderson R, May R. Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press, 1991.
2. Magnani R, Sabin K, Saidel T, Heckathorn D. Review of sampling hard-to-reach and hidden populations for HIV surveillance. AIDS. 2005;19(suppl 2):S67–S72.
3. Heckathorn DD. Respondent-driven sampling: a new approach to the study of hidden populations. Soc Probl. 1997;44:174–199.
4. Salganik MJ, Heckathorn DD. Sampling and estimation in hidden populations using respondent-driven sampling. Socio Meth. 2004;34:193–240.
5. Heckathorn DD. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. Soc Probl. 2002;49:11–34.
6. Volz E, Wejnert C, Degani I, Heckathorn D. Respondent-Driven Sampling Analysis Tool (RDSAT). 6.0.1 ed. Ithaca, NY: Cornell University, 2007.
7. Heckathorn DD. Extensions of respondent-driven sampling: analyzing continuous variables and controlling for differential recruitment. Socio Meth. 2007;37:151–207.
8. Volz E, Heckathorn D. Probability based estimation theory for respondent driven sampling. J Off Stat. 2008;24:79–97.
9. Goel S, Salganik MJ. Assessing respondent-driven sampling. Proc Natl Acad Sci. 2010;107:6743–6747.
10. Malekinejad M, Johnston L, Kendall C, Kerr L, Rifkin M, Rutherford G. Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: a systematic review. AIDS Behav. 2008;12(suppl 4):S105–S130.
11. Salganik MJ. Variance estimation, design effects, and sample size calculations for respondent-driven sampling. J Urban Health. 2006;83(6 suppl):i98–i112.
12. Gile KJ, Handcock MS. Respondent-driven sampling: an assessment of current methodology. Socio Meth. 2010;40:285–327.
13. Platt L, Wall M, Rhodes T. Methods to recruit hard-to-reach groups: comparing two chain referral sampling methods of recruiting injecting drug users across nine studies in Russia and Estonia. J Urban Health. 2006;83:39–53.
14. Robinson WT, Risser JM, McGoy S. Recruiting injection drug users: a three-site comparison of results and experiences with respondent-driven and targeted sampling procedures. J Urban Health. 2006;83(6 suppl):i29–i38.
15. Burt RD, Hagan H, Sabin K, Thiede H. Evaluating respondent-driven sampling in a major metropolitan area: comparing injection drug users in the 2005 Seattle area national HIV behavioral surveillance system survey with participants in the RAVEN and Kiwi studies. Ann Epidemiol. 2010;20:159–167.
16. Abdul-Quader AS, Heckathorn DD, McKnight C. Effectiveness of respondent-driven sampling for recruiting drug users in New York City: findings from a pilot study. J Urban Health. 2006;83:459–476.
17. Kendall C, Kerr L, Gondim RC. An empirical comparison of respondent-driven sampling, time location sampling, and snowball sampling for behavioral surveillance in men who have sex with men, Fortaleza, Brazil. AIDS Behav. 2008;12:97–104.
18. Johnston L, Trummal A, Lohmus L, Ravalepik A. Efficacy of convenience sampling through the internet versus respondent driven sampling among males who have sex with males in Tallinn and Harju County, Estonia: challenges reaching a hidden population. AIDS Care. 2009;21:1195.
19. Ramirez-Valles J, Heckathorn DD, Vazquez R, Diaz RM, Campbell RT. From networks to populations: the development and application of respondent-driven sampling among IDUs and Latino gay men. AIDS Behav. 2005;9:387–402.
20. Ma X, Zhang Q, He X. Trends in prevalence of HIV, syphilis, hepatitis C, hepatitis B, and sexual risk behavior among men who have sex with men. Results of 3 consecutive respondent-driven sampling surveys in Beijing, 2004 through 2006. J Acquir Immune Defic Syndr. 2007;45:581–587.
21. Wejnert C, Heckathorn DD. Web-based network sampling: efficiency and efficacy of respondent-driven sampling for online research. Socio Meth Res. 2008;37:105–134.
22. Wejnert C. An empirical test of respondent-driven sampling: point estimates, variance, degree measures, and out-of-equilibrium data. Socio Meth. 2009;39:73–116.
23. Shafer LA, Biraro S, Nakiyingi-Miiro J. HIV prevalence and incidence are no longer falling in southwest Uganda: evidence from a rural population cohort 1989–2005. AIDS. 2008;22:1641–1649.
24. Kamali A, Carpenter LM, Whitworth JA, Pool R, Ruberantwari A, Ojwiya A. Seven-year trends in HIV-1 infection rates, and changes in sexual behaviour, among adults in rural Uganda. AIDS. 2000;14:427–434.
25. McCarty C, Killworth PD, Bernard HR, Johnsen EC, Shelley GA. Comparing two methods for estimating network size. Hum Org. 2001;60:28–39.
26. McCormick T, Salganik M, Zheng T. How many people do you know?: Efficiently estimating personal network size. J Am Stat Assoc. 2010;105:59–70.
27. StataCorp. Stata Statistical Software: Release 11.0. 9 ed. College Station, Texas: Stata Press; 2010.
28. R Development Core Team. R language and environment for statistical computing and graphics. Vienna, Austria: R Foundation for Statistical Computing; 2010. Available at: http://www.R-project.org.
29. Gansner ER, North SC. An open graph visualization system and its applications to software engineering. Softw Pract Exper. 1999;S1:1–5.
30. Kirkwood BR, Sterne JA. Essential Medical Statistics. Oxford, UK: Wiley-Blackwell; 2003.
31. McCreesh N, Johnston L, Copas A, et al. Evaluation of the role of location and distance in recruitment in respondent-driven sampling. Int J Health Geogr. 2011;10:56.
32. Frost SD, Brouwer KC, Firestone Cruz MA. Respondent-driven sampling of injection drug users in two U.S.-Mexico border cities: recruitment dynamics and impact on estimates of HIV and syphilis prevalence. J Urban Health. 2006;83(6 suppl):i83–i97.

Copyright © 2012 Wolters Kluwer Health, Inc. All rights reserved.