Background: Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data.
Methods: Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available for a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample).
Results: We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%–37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%–74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion.
Conclusions: Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling, and caution is required when interpreting findings based on this sampling method.
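The two summary measures reported in the Results (the share of RDS estimates closer to the population proportion than the raw sample proportion, and the share of 95% confidence intervals that include the population proportion) can be illustrated with a minimal sketch. The `Row` structure, the function names, and all numbers below are hypothetical and chosen only for illustration; they are not the study's data, and the sketch does not reproduce the RDS estimators or the bootstrap procedure used by the authors.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One variable category; all values are proportions in [0, 1]."""
    truth: float     # total-population proportion
    sample: float    # unweighted RDS sample proportion
    estimate: float  # RDS-adjusted estimate
    ci_low: float    # lower bound of the 95% confidence interval
    ci_high: float   # upper bound of the 95% confidence interval


def share_closer(rows):
    """Fraction of rows where the estimate is strictly closer to the truth
    than the raw sample proportion is."""
    closer = sum(
        abs(r.estimate - r.truth) < abs(r.sample - r.truth) for r in rows
    )
    return closer / len(rows)


def ci_coverage(rows):
    """Fraction of rows whose confidence interval contains the truth."""
    covered = sum(r.ci_low <= r.truth <= r.ci_high for r in rows)
    return covered / len(rows)


# Hypothetical illustrative values, NOT taken from the study.
rows = [
    Row(truth=0.30, sample=0.25, estimate=0.27, ci_low=0.20, ci_high=0.34),
    Row(truth=0.10, sample=0.12, estimate=0.15, ci_low=0.11, ci_high=0.19),
    Row(truth=0.50, sample=0.48, estimate=0.45, ci_low=0.40, ci_high=0.52),
    Row(truth=0.20, sample=0.22, estimate=0.26, ci_low=0.21, ci_high=0.31),
]

print(f"Estimates closer than sample proportions: {share_closer(rows):.0%}")
print(f"Confidence intervals covering the truth: {ci_coverage(rows):.0%}")
```

A share of closer estimates well below 50% would suggest that the adjustment tends to move estimates away from the truth, and coverage well below 95% would suggest that the intervals are too narrow.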
From the (a) Department of Infectious Disease Epidemiology, Faculty of Epidemiology & Population Health, London School of Hygiene and Tropical Medicine, UK; (b) Department of Veterinary Medicine, University of Cambridge, UK; (c) MRC/UVRI Uganda Research Unit on AIDS, Entebbe, Uganda; (d) School of International Development, University of East Anglia, Norwich, UK; (e) Department of Medical Statistics, Faculty of Epidemiology & Population Health, London School of Hygiene and Tropical Medicine, UK; (f) King's College London, Institute of Psychiatry, Department of Biostatistics, UK; (g) Tulane University School of Public Health & Tropical Medicine, Department of International Health & Development, Center for Global Health Equity, New Orleans, LA; (h) Research Department of Infection and Population Health, University College London, UK.
Submitted 25 May, 2011; accepted 23 September, 2011.
R.G.W. is funded by a Medical Research Council (UK) Methodology Research Fellowship (G0802414), the Gates Foundation (19790.01), and the EU FP7 (242061). S.D.W.F. is funded by the National Institutes of Nursing Research (grant NR10961), the National Institute on Drug Abuse (grant DA24998), and by a Royal Society Wolfson Research Merit Award. J.S., J.K., M.N.T., and R.N. are funded by the Medical Research Council (UK). F.J. is funded by the National Institute for Health Research. The general population cohort in Uganda is funded by the Medical Research Council (UK). The authors reported no other financial interests related to this research.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).
Editors' note: A commentary on this article appears on page 148.
Correspondence: Richard G. White, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London. E-mail: email@example.com.