Brazil currently has a concentrated AIDS epidemic, with an estimated 630,000 people living with HIV/AIDS and a low and stable prevalence in the general population (∼0.6%).1 Nevertheless, a recent review found substantially higher HIV infection rates among vulnerable populations, including people who misuse illicit drugs (pooled prevalence of 5.9%), men who have sex with men (12.6%), and women engaged in commercial sex (4.9%).2
This scenario calls for comprehensive monitoring of such vulnerable populations, in addition to the regular assessment of groups that may be viewed as sentinel populations representing different strata of the general population, such as military conscripts (a proxy for male youth) and pregnant women (a proxy for women of reproductive age).3 At-risk populations have high background HIV infection rates compared with the general population4 and may function as bridging populations between geographical and/or social clusters of high background prevalence and the population at large.1
The comprehensive monitoring of hard-to-reach populations remains a challenge worldwide. The main difficulties include the proper definition of each one of these at-risk populations and how to define a sampling frame from which potential interviewees should be selected.5 Probabilistic sampling methods are based on the known probability of choosing any given individual from a population defined as finite and countable, and as such cannot be strictly applied to hidden populations.4
To address such difficulties, different sampling methods have been proposed, among them respondent-driven sampling (RDS), a network-based method combining chain referral with weighted estimates aiming to compensate for the nonrandomness of the sampling process.6 As of 2005, the Brazilian Ministry of Health (BMoH), in partnership with the Centers for Disease Control and Prevention, introduced this methodology in Brazil.3
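As a rough illustration of what "weighted estimates" can mean in practice, the sketch below implements one commonly used RDS estimator (often called RDS-II, or the Volz-Heckathorn estimator), in which each respondent is weighted by the inverse of his or her self-reported network size. The text above does not state which estimator was used in the Brazilian studies, so the choice of estimator, the field names, and the toy data are all assumptions made for illustration.

```python
def rds_ii_proportion(respondents, trait):
    """Inverse-degree weighted proportion of respondents having `trait`.

    respondents: list of dicts with a 'degree' key (self-reported network
    size, > 0) and a boolean `trait` key. Hypothetical data layout.
    """
    weights = [1.0 / r["degree"] for r in respondents]
    weighted_with_trait = sum(w for w, r in zip(weights, respondents) if r[trait])
    return weighted_with_trait / sum(weights)


# Hypothetical toy sample, not data from the study.
sample = [
    {"degree": 10, "injects": True},
    {"degree": 5, "injects": False},
    {"degree": 2, "injects": False},
    {"degree": 20, "injects": True},
]

naive = sum(r["injects"] for r in sample) / len(sample)
weighted = rds_ii_proportion(sample, "injects")
print(f"naive={naive:.3f} weighted={weighted:.3f}")
```

In this toy sample the respondents with the trait also report the largest networks, so the weighted proportion falls below the naive one: well-connected people are more likely to be recruited and are therefore down-weighted.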
RDS has been widely used as a key method in the monitoring of hard-to-reach populations in different contexts worldwide,6,7 including Brazil, in recent years.3,8 Recent papers have discussed limitations of the original method, including the article by Poon et al,9 coauthored by D. Heckathorn, the originator of the method, which highlights the “significant latent variability in the recruitment process that violates assumptions of Markov chain-based methods” (verbatim). Goel and Salganik, in a theoretical analysis of recruiting processes using RDS10 and a simulation study based on 85 empirical studies,11 highlighted the effects of network bottlenecks, which may bias RDS-based estimates, making them less accurate than originally hypothesized.
To the best of our knowledge, two peer-reviewed studies using RDS12,13 have explored the geographical dimensions of the method. Burt et al12 compared the characteristics of recruitees from three studies in the region of Seattle (WA, USA), one of them using RDS, but did not enter the data into a Geographic Information System (GIS). Kral et al13 used zipcodes to geocode the interviewees recruited by two different methods, one of them RDS, in a study conducted in San Francisco (CA).
So far, no study carried out in a low-/middle-income country has explored the geographical dimension of RDS in the context of the pronounced social and demographic heterogeneity that is characteristic of such areas.
With the exception of the studies by Burt et al12 and Kral et al,13 studies have explored geography as a frame in which the networks recruited by RDS are embedded, but not as a key dimension of the method itself. In this sense, the current study, based on data from illicit drug users in Rio de Janeiro, Brazil, addresses a key question: how successfully does an RDS-based study recruit and interview people from a hard-to-reach population in the context of Brazil's second largest city, Rio de Janeiro?
The present study analyzes data from a multicity study carried out in 10 Brazilian cities in 2009, funded by the BMoH, which aimed to assess the sociodemographic and behavioral characteristics of “heavy users” [defined according to the PAHO (Pan American Health Organization) criteria for drug users who are “at particular risk to be infected by HIV and other [sexually-transmitted and blood-borne] pathogens”14], and infection rates for HIV and syphilis.
The data analyzed in the present study were obtained by geocoding the variable “place of residence,” using information from the Rio de Janeiro data set. Information on the characteristics of the residence (stable households, streets, shelters, shacks in a slum, etc.) of the interviewees was obtained from the standard survey and from additional data provided by the interviewees during pre- and posttest counseling, using standard forms filled out to ensure the prompt referral of interviewees diagnosed with any medical condition to the municipal network of health facilities. Findings from the other nine cities and analyses of behavioral and laboratory findings from Rio de Janeiro are not discussed in the present study.
Rio de Janeiro is the capital city of the state of Rio de Janeiro, situated in the most industrialized and densely populated region of the country, the southeast region. Rio de Janeiro city occupies an area of 1224.56 km² and comprises 160 neighborhoods. Due to its huge size and large population, estimated at 6,186,710 inhabitants,15 neighborhoods were aggregated by the city administration into 33 administrative regions (ARs), which roughly correspond to boroughs in large cities such as London and are the basic units analyzed in the present article. The supplemental content (see Figure, Supplemental Digital Content 1, http://links.lww.com/QAI/A167) includes a descriptive map of Rio de Janeiro ARs and health planning areas (HPAs).
To better illustrate the recruiting process, it was necessary to choose a more synthetic set of geographic units. For the sake of such additional analyses, we used the HPAs instead of the ARs, that is, we assessed networks across the catchment areas defined by the Municipal Health Secretariat for planning purposes and for the referral of patients with different medical conditions.
The inclusion criteria of the present study comprised the original criteria defined by PAHO for its CODAR (Consumidores de Drogas con Alto Riesgo/High-Risk Drug Users) manuals and surveys,14 and criteria specific to the Brazilian multicity study, as follows: to be of legal age to provide informed consent (defined in Brazil as 18 years or older); to have injected drugs at least once in the last 6 months, and/or to have used cocaine, crack, opiates, or other illicit drugs (other than marijuana/hashish) by other routes (snorted, smoked, ingested) for at least 25 days in the last 6 months; to have produced a valid coupon (ie, validated by visual inspection of its general aspect and watermark and by scanning of its barcoded label) delivered to the interviewee by a former recruitee/seed; to have read the consent form (or to have listened to the interviewer reading it, if illiterate) and signed it (or fingerprinted it with his/her right thumb, if illiterate); not to be acutely intoxicated due to the recent use of alcohol and/or illicit drugs; and to live in the city of Rio de Janeiro.
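The inclusion criteria listed above can be read as a single conjunctive screening rule. The following is a minimal sketch of such a rule, assuming a hypothetical screening record; the class and field names are illustrative and do not come from the study instruments.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical; they mirror the criteria in the text.
    age: int
    injected_last_6_months: bool
    # Days of non-injected use of cocaine, crack, opiates, or other illicit
    # drugs (excluding marijuana/hashish) in the last 6 months.
    days_noninjected_use_last_6_months: int
    coupon_valid: bool
    consent_given: bool
    acutely_intoxicated: bool
    lives_in_rio_city: bool


def is_eligible(c: Candidate) -> bool:
    """Return True only if every inclusion criterion is met."""
    drug_use_ok = (
        c.injected_last_6_months
        or c.days_noninjected_use_last_6_months >= 25
    )
    return (
        c.age >= 18
        and drug_use_ok
        and c.coupon_valid
        and c.consent_given
        and not c.acutely_intoxicated
        and c.lives_in_rio_city
    )


# Hypothetical candidates: a frequent crack smoker, and an occasional user.
frequent = Candidate(24, False, 40, True, True, False, True)
occasional = Candidate(24, False, 10, True, True, False, True)
print(is_eligible(frequent), is_eligible(occasional))
```

The drug use criterion is the only disjunctive one ("and/or"); all other criteria must hold simultaneously.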
The study was approved by the National School of Public Health/FIOCRUZ Institutional Review Board (CAAE 0114.0.031.000-0).
Six seeds were chosen purposively, selected after a formative study as members of large networks and individuals belonging to different drug scenes. One seed did not generate any recruitee (ie, a “nongenerative” seed), but the other 5 recruited 605 interviewees after 11 waves, between April and October 2009. The final sample exceeded by 5 individuals the sample size defined a priori by the BMoH (600 recruitees other than the seeds), which took into consideration the available budget, the study timeframe, and the demography of the 10 cities included in the study. The study took place in a downtown assessment center, chosen for its centrality, its accessibility (located <200 m from one of Rio's largest metro, train, and bus terminals), and its more than 2 decades of user-friendly assistance and social support delivered to disenfranchised populations, such as drug users, female sex workers, male transvestites, and other homeless and marginalized populations.
Geocoding, linkage, management, and analysis of data were carried out in an open access GIS, Terraview 188.8.131.52. Data were geocoded using the Universal Transverse Mercator system and the South American Datum 69.
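Waves are counted from the seeds (wave 0) along the referral chains recorded by the coupons. The sketch below shows one way to derive wave numbers and to flag nongenerative seeds from a recruiter-recruit table; the identifiers and the data layout are hypothetical and are not taken from the study database.

```python
from collections import deque


def assign_waves(seeds, recruited_by):
    """Breadth-first walk of the referral chains.

    seeds: list of seed ids (wave 0).
    recruited_by: dict mapping recruit id -> recruiter id.
    Returns a dict id -> wave number for everyone reachable from a seed.
    """
    children = {}
    for recruit, recruiter in recruited_by.items():
        children.setdefault(recruiter, []).append(recruit)

    waves = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        for recruit in children.get(person, []):
            waves[recruit] = waves[person] + 1
            queue.append(recruit)
    return waves


# Hypothetical toy chains: S3 never recruits anyone.
seeds = ["S1", "S2", "S3"]
recruited_by = {"A": "S1", "B": "S1", "C": "A", "D": "C", "E": "S2"}

waves = assign_waves(seeds, recruited_by)
recruiters = set(recruited_by.values())
nongenerative = [s for s in seeds if s not in recruiters]
n_recruits = len(waves) - len(seeds)
print(max(waves.values()), nongenerative, n_recruits)
```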
The digital maps used in the present analysis were downloaded from the National Institute for Geography and Statistics, IBGE (Instituto Brasileiro de Geografia e Estatística/Brazilian Institute for Geography and Statistics) at http://www.ibge.gov.br. A supplemental grid displaying the HPAs of the city of Rio de Janeiro was made available by the GIS Laboratory, FIOCRUZ (see Figure, Supplemental Digital Content 2, http://links.lww.com/QAI/A168).
RDS data were entered into the RDS Analysis Tool 5.6.0 so that they could be visualized with NetDraw 2.091.
The overall spatial distribution of the RDS sample is depicted in Figure 1. The geographic distribution of interviewees was found to be quite heterogeneous across the different ARs. Interviewees clustered around the center-north axis of the city, close to the AR where the assessment center was located and where most seeds reported living.
Figure 2 shows the evolution of the recruiting process through space. During the first three waves, the interviewees remained clustered around the assessment center located in the neighborhood of São Cristóvão. In the subsequent waves, despite a persistent clustering in the same ARs, the referral chains slowly progressed toward the western region, more specifically toward the ARs of Jacarepaguá, Santa Cruz, and Bangu (Figure 3). Figure 4 depicts the final waves of the study, when recruitment reached, albeit to a very limited extent, some areas of the southern region, such as the ARs “Botafogo” and “Lagoa” (in waves 8 and 9), for the first time since the study's inception. Despite this modest but gradually increasing diffusion, key areas such as Rocinha (the largest favela of Rio de Janeiro, with over 55,000 formally registered inhabitants and an overall population estimated at >100,000 inhabitants) and Copacabana (one of the most famous tourist destinations of the city, with a lively night life and an open drug and prostitution scene,17 with over 160,000 inhabitants) remained untouched by the successive waves.
A visualization of the structure of the referral networks distributed across the HPAs (Fig. 5) reveals a clear predominance of a few HPAs (such as HPAs 1, 3.1, and 3.3). Network 2 (launched by a “seed” who reported living in the neighborhood of Madureira, and named after the number assigned to “seed 2”) was found to be the most heterogeneous of the networks recruited by the study. However, even this network was basically clustered around HPA 1, followed by HPA 3.3. Figure 5 also documents a strong geographic matching between seeds and first waves (as illustrated by Networks 3 and 4), with some individuals recruited in intermediate waves behaving as bridges toward different neighborhoods.
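The geographic matching between recruiters and recruits can be summarized numerically as the share of recruitments in which both parties live in the same HPA. The sketch below is one possible way to compute it, assuming a recruiter-recruit table and a lookup of each person's HPA; the identifiers and values are hypothetical, and people whose residence could not be geocoded are simply skipped.

```python
from collections import Counter


def area_mixing(recruited_by, area_of):
    """Count recruitments by (recruiter area, recruit area).

    Pairs in which either residence is missing (None) are skipped.
    """
    pairs = Counter()
    for recruit, recruiter in recruited_by.items():
        a, b = area_of.get(recruiter), area_of.get(recruit)
        if a is not None and b is not None:
            pairs[(a, b)] += 1
    return pairs


def same_area_share(pairs):
    """Fraction of counted recruitments that stay within one area."""
    total = sum(pairs.values())
    same = sum(n for (a, b), n in pairs.items() if a == b)
    return same / total if total else float("nan")


# Hypothetical toy data; D has no geocodable residence.
recruited_by = {"A": "S1", "B": "S1", "C": "A", "D": "C"}
area_of = {"S1": "HPA 1", "A": "HPA 1", "B": "HPA 1", "C": "HPA 3.3", "D": None}

pairs = area_mixing(recruited_by, area_of)
share = same_area_share(pairs)
print(dict(pairs), round(share, 3))
```

A share close to 1 would indicate that referral chains rarely leave the recruiter's own area, whereas the off-diagonal pairs identify the individuals acting as bridges.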
The place of residence of some individuals (∼23% of the original sample) could not be geocoded and the respective information remained “blank.” Such blank information corresponds to people who live in the streets, shelters, hotels paid on a daily basis, areas recently occupied in favelas (slums), and so on, and in this sense corresponds to addresses that do not formally exist and/or to mobile subgroups of drug users with no stable residence.
The geographic analysis of the Rio de Janeiro RDS database represents an opportunity to assess the translation into practice of some basic assumptions of the method, among them the desired goal of obtaining a “single component” network, that is, a strongly interconnected network, ideally interconnected in a comprehensive way, with no loose or dead ends.
Although never discussed in the original propositions advanced by Heckathorn on “equilibrium,”5,18 a goal to be putatively reached after a given number of recruitment waves (hypothetically after a minimum of 6 waves, as originally proposed),4,5,18 geographic comprehensiveness is an implicit assumption of RDS because waves do not take place in an abstract frictionless environment, but rather in context. In the case of our study and of the vast majority of studies worldwide, such concrete space has been the urban space of large and middle-size cities (see Malekinejad et al6 and Semaan19 for comprehensive reviews of RDS empirical studies).
The strong geographic heterogeneity observed and the absence of any interviewee recruited from key regions of the city, such as Rocinha and Copacabana, document bottlenecks. Such bottlenecks may be absent in a hypothetical study with a much larger sample size, as suggested by Goel and Salganik.11 However, first-hand experience of some of the authors of the present study in the different drug scenes of Rio de Janeiro (eg, Bastos et al20) suggests that some putative bridges may not be “bridgeable” at all. In this sense, whatever the magnitude of the sample and whatever the recruiting process actually used (be it RDS or not), some bottlenecks could be viewed as structural, and as such impassable by any means.
Drug wars between factions and conflicts between factions and the police, between factions and outlaw militias, and between the police and militias are part of the daily life of many favelas (slums) in Rio de Janeiro, where the main drug selling and buying places are located.20,21 To believe that such conflicts will simply be ignored for the sake of science or public health is to be naive. Actually, such conflicts have been described as a major barrier to different social and public health initiatives.20,21 The key issue here is to distinguish between bridgeable and nonbridgeable bottlenecks, that is, between bottlenecks imposed by limitations associated with the method itself (such as less than optimal sample sizes)10,11 and structural bottlenecks that are not bridgeable. The latter should rather be explored by a combination of quantitative and qualitative methods, with triangulation of the findings.22
In a very large and heterogeneous city such as Rio de Janeiro, much remains to be done to improve the assessment of hard-to-reach populations in context. The assessment of drug users constitutes a particular challenge. They share with other hard-to-reach populations (eg, sex workers) the consequences of marginalization and stigmatization. In addition, in Brazil, as in any other country signatory to the United Nations conventions on drug control, drug use is intrinsically linked to criminal activities because the use of illicit drugs is, by definition, a criminal activity.23
Barbosa Júnior et al3 have shown that Brazilian studies targeting other populations, such as commercial sex workers, have violated the assumption that a single component would necessarily emerge from well-conducted RDS studies. In the city of Porto Alegre, in the southernmost Brazilian state, Rio Grande do Sul, female sex workers could not recruit substantial numbers of male sex workers or of interviewees defined as transvestites/transgendered. Other limitations refer to the difficulty of recruiting sex workers who work in the streets through recruiting waves driven by workers based in brothels or nightclubs, and vice versa. Such findings speak in favor of the use of seeds belonging to different networks as a key component of successful studies.
As originally discussed by Heckathorn,5,18 considering that RDS follows a first-order Markovian process, equilibrium should be reached after the sixth wave. The deep heterogeneity observed from the geographic point of view and the absence of interviewees from key areas of the city after 11 waves in our study highlight the need to complement simulation studies addressing the statistical and mathematical aspects of RDS-based studies with insights from urban geography. Even considering that some relevant variables have reached equilibrium, the cost of excluding whole areas in a heterogeneous and overpopulated city seems to be too high.
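The dependence of "equilibrium" on how freely recruitment crosses between areas can be illustrated with a toy first-order Markov chain over two areas. In the sketch below, each row of the matrix gives the probability that a recruiter living in one area recruits someone living in each area. The transition probabilities are invented for illustration and are not estimated from the study data; the function name and the tolerance are likewise assumptions.

```python
def waves_to_equilibrium(P, start, tol=0.02, max_waves=1000):
    """Waves needed for the sample composition to come within `tol`
    of the stationary composition of a 2-state Markov chain.

    P: 2x2 row-stochastic matrix, P[i][j] = probability that a recruiter
       in area i recruits someone living in area j.
    start: composition of the seeds, e.g. [1.0, 0.0].
    """
    a, b = P[0][1], P[1][0]
    stationary = [b / (a + b), a / (a + b)]
    dist = list(start)
    for wave in range(max_waves + 1):
        if max(abs(d - s) for d, s in zip(dist, stationary)) < tol:
            return wave
        # One recruitment wave: multiply the composition by P.
        dist = [
            dist[0] * P[0][0] + dist[1] * P[1][0],
            dist[0] * P[0][1] + dist[1] * P[1][1],
        ]
    return None


# Hypothetical scenarios; all seeds start in area 0.
well_mixed = [[0.7, 0.3], [0.3, 0.7]]
bottleneck = [[0.98, 0.02], [0.02, 0.98]]

waves_mixed = waves_to_equilibrium(well_mixed, [1.0, 0.0])
waves_bottleneck = waves_to_equilibrium(bottleneck, [1.0, 0.0])
print(waves_mixed, waves_bottleneck)
```

Under these assumed probabilities, the well-mixed chain settles within a handful of waves, whereas the chain with rare cross-area recruitment needs far more waves than the 11 observed in the study. A bottleneck that is never crossed at all (a cross-area probability of zero) would correspond to a chain that never reaches the other area, whatever the number of waves.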
Another area to be explored by future studies refers to the conscious or unconscious informal/intuitive “cost-effectiveness” decisions taken by interviewees every time they distribute coupons to their peers. Considering the pervasive nature of open drug scenes and of cocaine- and crack-consuming people in large Brazilian cities such as Rio de Janeiro, it seems logical that interviewees will seek the easiest alternatives. If interviewees can easily deliver their coupons to their closest friends and buddies, why should they risk their lives seeking bridges to other drug scenes alien to their concrete experience, and spend hours moving back and forth across an enormous city looking for hard-to-reach people like themselves?
The very different urban structures of San Francisco (United States) and Rio de Janeiro (Brazil) may explain the marked contrasts between the findings of Kral et al,13 in which an RDS-based study was found to reach an optimal geographic coverage (similar to the one obtained by other research methods), and those of the present study. San Francisco is a much smaller city than Rio de Janeiro, with <1 million inhabitants. From the point of view of sociodemographic parameters (income, access to proper housing, transportation, sanitation, etc.), San Francisco is a much more homogeneous city than Rio, and even its most deprived neighborhoods cannot be compared with Rio's favelas. Another key difference is the absence of a drug scene with heavily armed factions or militias in San Francisco, and, last but not least, the usefulness of zip codes as a geocoding unit in San Francisco, but not in Rio. In Brazil, zip codes have a key role in the context of the postal system but are not included in any major health database in a systematic way and are useless in areas like favelas, where correspondence is delivered through community centers or public facilities, instead of to households, due to the absence of formal addresses.23
Common sense suggests that drug users would waste their time and risk their lives seeking distant acquaintances in favelas controlled by rival factions to deliver a coupon only if their closer networks became saturated. That such saturation could happen in the context of a modest sample of 600 users in a universe of drug users roughly estimated to include thousands of occasional and regular consumers (as inferred, for instance, from the huge amounts of drugs seized almost every day, the number of daily arrests linked to illicit drugs, and other social and medical data from clinics, health centers, etc.)24 seems unlikely. Unlike the study by Kral et al,13 our study was not restricted to injecting drug users, so their sample, of a size comparable to our own, could perhaps be viewed as more comprehensive in the sense of targeting a narrowly defined population in a much smaller and less populated area. One must observe, however, that the study by Burt et al12 cross-compared data on injecting drug users and revealed discrepancies between one RDS-based study and two other studies. Obviously, Seattle is much closer to San Francisco in terms of size and socioeconomic parameters than to Rio de Janeiro.
The absence of valid information on the place of residence for a substantial proportion of the drug users interviewed in our study is a limitation whose origin remains to be fully discerned: it may be secondary to operational difficulties (eg, low literacy levels, chaotic addressing in most Rio de Janeiro slums, especially in recently and rapidly expanding areas), to the very characteristics of this population, or to both.
Besides the abovementioned operational problems, drug users have been shown to be a very unstable and mobile population,25 a relevant aspect in the context of violent drug scenes, where death threats, mass executions, and armed conflicts take place every single day.20,21 In this sense, a yet-to-be-determined fraction of the missing information is actually impossible to obtain because some interviewees have no stable residence and move back and forth across Rio. To what extent the lost information on the place of residence of approximately 23% of our sample corresponds to information that cannot be obtained by any means remains to be better understood.
New methods to improve data collection should be sought to minimize missing information. Audio Computer-Assisted Self Interview (ACASI) had been shown to be not only reliable but also well accepted in previous Brazilian studies,26 and these promising findings led to the choice of this method in the present study. However, previous studies used ACASI in a randomized trial (versus face-to-face interviews), and new options should be tailored to the specific characteristics of RDS studies. To what extent any interviewing method can minimize missing information remains an open question.
Despite such limitations, our study assessed for the first time the geographic dimension of an RDS-based study in a large, heterogeneous urban area plagued by structural violence. The complete absence of recruitees from favelas that are well known for their unusually high levels of violence, absence of community policing, and disquieting levels of impunity,21 such as Rocinha, Cidade de Deus, Complexo do Alemão, and Complexo da Maré, suggests that there are additional challenges to be addressed by RDS studies besides those that have been traditionally assessed by previous studies.
The goal of reaching equilibrium after a given number of waves may correspond to an impossible goal in a context very far from any equilibrium in the broader sense of social life and public health. Some hard-to-reach populations seem to be harder to reach than the worst case scenario elaborated after the use of standard methods. Paraphrasing Shakespeare, “there are more things in heaven and earth, Horatio, than are dreamt of in your”… methodology.
Thanks are due to Dr M.A. de Sá and G. Pereira (BMoH) for their support and advice during the multicity study and to Drs A.R. Pati Pascom (BMoH) and A. Barbosa (CDC, Brazil office) for their contribution to the implementation of RDS and other methods aiming to sample hard-to-reach populations in Brazil.
1. Brazil. Brazilian Ministry of Health. Secretaria de Vigilância em Saúde. Departamento de DST, Aids e Hepatites virais. “UNGASS - HIV/Aids, Resposta Brasileira 2008-2009. Relatório de Progresso do País” [in Portuguese]. Brazil; 2010.
2. Malta M, Magnanini MMF, Mello MB, et al. HIV prevalence among female sex workers, drug users and men who have sex with men in Brazil: A Systematic Review and Meta-analysis. BMC Public Health
3. Barbosa Júnior A, Szwarcwald CL, Pascom ARP. Transferência de métodos para estudos em populações sob maior risco à infecção pelo HIV no Brasil. Cad Saude Publica. 2010;27(suppl 1):S36-S44.
4. Magnani R, Sabin K, Saidel T, Heckathorn D. Review of sampling hard-to-reach and hidden populations for HIV surveillance. AIDS
5. Heckathorn DD. Respondent-driven sampling: a new approach to the study of hidden populations. Soc Probl
6. Malekinejad M, Johnston LG, Kendall C, et al. Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: a systematic review. AIDS Behav. 2008;12(suppl 4):105-130.
7. Strathdee SA, Lozada R, Ojeda VD, et al. Differential effects of migration and deportation on HIV infection among male and female injection drug users in Tijuana, Mexico. PLoS One
8. Kendall C, Kerr LR, Gondim RC, et al. An empirical comparison of respondent-driven sampling, time location sampling, and snowball sampling for behavioral surveillance in men who have sex with men, Fortaleza, Brazil. AIDS Behav. 2008;12(suppl 4):S97-S104.
10. Goel S, Salganik MJ. Respondent-driven sampling as Markov chain Monte Carlo. Stat Med
11. Goel S, Salganik MJ. Assessing respondent-driven sampling. Proc Natl Acad Sci
12. Burt RD, Hagan H, Sabin K, et al. Evaluating respondent-driven sampling in a major metropolitan area: Comparing injection drug users in the 2005 Seattle area national HIV behavioral surveillance system survey with participants in the RAVEN and Kiwi studies. Ann Epidemiol
13. Kral AH, Malekinejad M, Vaudrey J, et al. Comparing respondent-driven sampling and targeted sampling methods of recruiting injection drug users in San Francisco. J Urban Health
14. Pan American Health Organization (PAHO). Encuestas de comportamiento en CODAR. Herramientas básicas para la vigilancia de segunda generación de VIH y otras infecciones en Consumidores de drogas con Alto Riesgo [in Spanish]. Resumen. 2004. Available at: http://www.paho.org/spanish/ad/fch/ai/Resumen%20CODAR.pdf. Accessed July 10, 2010.
18. Heckathorn D. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. Soc Probl
19. Semaan S. Time-space sampling and respondent-driven sampling with hard-to-reach populations. Methodol Innov Online
20. Bastos FI, Caiaffa W, Rossi D, et al. The children of mama coca: coca, cocaine and the fate of harm reduction in South America. Int J Drug Policy
21. Zaluar AM. Youth and Drug Trafficking in the City of Rio de Janeiro. In: Emmanuel R, George H, eds. New Approaches to Public Security and Drug Policy. Petrópolis and London: Editora Vozes and International Council on Security and Development (ICOS); 2009:161-172.
22. Emmanuel F, Blanchard J, Zaheer HA, et al. The HIV/AIDS Surveillance Project mapping approach: an innovative approach for mapping and size estimation for groups at a higher risk of HIV in Pakistan. AIDS. 2010;(24 suppl 2):S77-S84.
23. Gracie R. Assessing Slum Health in Rio de Janeiro Using a Geographic Information System (GIS). Presentation delivered at the meeting on slum health and related issues organized by the University of Rio de Janeiro and UC Berkeley, Rio de Janeiro, July 28-30, 2010; Rio De Janeiro, Brazil.
25. Hahn JA, Page-Shafer K, Ford J, et al. Travelling young injection drug users at high risk for acquisition and transmission of viral infections. Drug Alcohol Depend
26. Simoes AA, Bastos FI, Moreira RI, et al. A randomized trial of audio computer and in-person interview to assess HIV risk among drug and alcohol users in Rio De Janeiro, Brazil. J Subst Abuse Treat