Incorporating the Service Multiplier Method in Respondent-Driven Sampling Surveys to Estimate the Size of Hidden and Hard-to-Reach Populations: Case Studies From Around the World

Johnston, Lisa G. PhD*; Prybylski, Dimitri PhD†‡; Raymond, H. Fisher DrPH; Mirzazadeh, Ali MD, PhD; Manopaiboon, Chomnad MA; McFarland, Willi MD, PhD

Sexually Transmitted Diseases: April 2013 - Volume 40 - Issue 4 - p 304–310
doi: 10.1097/OLQ.0b013e31827fd650
Original Study

Background: Estimating the sizes of populations at highest risk for HIV is essential for developing and monitoring effective HIV prevention and treatment programs. We provide several country examples of how service multiplier methods have been used in respondent-driven sampling surveys and provide guidance on how to maximize this method’s use.

Methods: Population sizes were estimated in 4 countries (Mauritius: intravenous drug users [IDU] and female sex workers [FSW]; Papua New Guinea: FSW and men who have sex with men [MSM]; Thailand: IDU; United States: IDU) using adjusted proportions of population members reporting attending a service, project, or study listed in a respondent-driven sampling survey, and the estimated total number of population members who visited one of the listed services, projects, or studies, collected from the providers.

Results: The median population size estimates were 8866 for IDU and 667 for FSW in Mauritius. Median point estimates in Papua New Guinea were 4190 for FSW in Port Moresby, 8712 for FSW in Goroka, and 2126 for MSM in Port Moresby. Median estimates for IDU in Thailand were 4200 in Bangkok and 1050 in Chiang Mai; in San Francisco, they were 15,789 in 2005 and 15,554 in 2009.

Conclusion: Our estimates for almost all groups in each country fall within the range of other regional and national estimates, indicating that the service multiplier method, provided its assumptions are met, can produce informative estimates. We suggest using multiple multipliers whenever possible, garnering program data from the widest possible range of services, projects, and studies. A median of several estimates is likely more robust to potential biases than a single estimate.

Findings and suggestions from population size estimations in 4 countries among HIV high-risk populations using the service multiplier method with respondent-driven sampling are presented.

From the *Global Health Sciences, University of California, San Francisco, San Francisco, CA; †Centers for Disease Control and Prevention, Global AIDS Program, Asia Regional Office, Nonthaburi, Thailand; ‡Centers for Disease Control and Prevention, Division of Global HIV/AIDS, Atlanta, GA; §San Francisco Department of Public Health, San Francisco, CA; and ∥Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran

Disclaimer: The findings and conclusions in this paper are those of the authors and do not necessarily represent those of the Centers for Disease Control and Prevention.

Correspondence: Lisa G. Johnston, PhD, Nieuwezijds Voorburgwal 64F, 1012 SC Amsterdam, the Netherlands. E-mail:

Received for publication June 23, 2012, and accepted November 21, 2012.

Estimating the sizes of populations at highest risk for HIV such as injecting drug users (IDU), female sex workers (FSW), and men who have sex with men (MSM) is essential for developing and monitoring effective HIV prevention and treatment programs. These populations are disproportionately affected by the epidemic, even in countries where generalized epidemics prevail.1 An effective public health response requires knowing the true population size at risk for HIV to understand the burden of disease, gauge service coverage, guide resource allocation, and advocate for programs to reach high-risk populations.

Estimating the population sizes of hidden and hard-to-reach populations is extremely challenging for numerous reasons. First, the behaviors practiced by these populations are often illegal and subject to high levels of stigma and discrimination, resulting in their preference to avoid being identified or counted. Severe underreporting is likely in household surveys that include behavioral questions on stigmatized behaviors because many members of these populations are unwilling to report accurately and also because they may not reside in typical household settings. In addition, the hidden nature of these populations makes direct counts inaccurate because the counts may only be conducted at locations where populations are visible and not always easily identifiable by observation. Moreover, certain segments of these populations may be more hidden or diffuse than others, resulting in the exclusion of important subpopulations in size estimations.

Currently, there are several methods being adapted to estimate population sizes of HIV high-risk populations. Some use data collected from the general population such as the network scale-up method,2,3 whereas others use data collected directly from the population such as mapping and enumeration, capture-recapture, and nomination methods.4–10 There is a need to identify effective and practical low-cost methods that can be used in routine settings to provide data-driven population size estimates rather than anecdotal estimates or pure guesses.1 A promising and apparently simple and inexpensive population size estimation technique for HIV high-risk populations is the service multiplier method (SMM).1,11 Briefly, this method uses 2 overlapping data sources specific to the population being estimated, one being a count of clients accessing a service (e.g., drug treatment, HIV testing, or participation in other research projects or studies) and the other a representative survey (described in greater detail in the “Methods” section).1 For simplicity of illustration, if a sexually transmitted infection (STI) screening program provided service to 1000 FSW in 2010 and a survey of all FSW in the same area finds 10% using the service that year, then the total number of FSW is 10,000. The strong appeal of the SMM is the application of routine surveys (e.g., integrated bio-behavioral surveillance surveys [IBBS]) that include questions on service use, project, and study participation.1 Thus, for little additional energy and cost, size estimations of HIV high-risk populations can be easily obtained.

The SMM is highly dependent on the availability and quality of data collected for other purposes (i.e., participation in a project, service data, and participation in an IBBS) and subject to meeting several challenging assumptions. The major assumptions are that (1) the population being counted has a non-zero probability of inclusion in both sources, (2) no individual is counted more than once in each multiplier (nonduplicated data), (3) the 2 data sources are independent of each other (i.e., inclusion in one source is not related to inclusion in the other), and (4) one data source (the survey) is representative of the sampled population.1

The last assumption is a broader challenge for research in HIV high-risk populations. At present, the representativeness and rigor of surveys among these populations have been improved beyond convenience samples at venues or facilities. One common probabilistic sampling method used worldwide purported to obtain representative samples of these populations is respondent-driven sampling (RDS).12,13 Respondent-driven sampling has the advantage of obtaining samples of hidden and hard-to-reach populations not necessarily identifiable at venues. Respondent-driven sampling uses the recruitment efforts and social network sizes of sample participants to derive estimates about the target population and is most useful for sampling socially networked populations who have no sampling frame and are difficult to recruit into surveys due to their stigmatized and often illegal behaviors. The recruitment process begins with a convenience sample of population members (“seeds”) who use a fixed number of coupons to recruit members of their social network. The objective is to manage a peer-to-peer recruitment process whereby the generation of long recruitment chains results in a final sample that is independent of the initial convenience sample of seeds. Data collected with RDS methods are adjusted based on each participant’s social network size (e.g., the number of people they know who fulfill the RDS survey eligibility) and similarities in characteristics of persons who associate with each other.13
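The network-size adjustment described above can be illustrated with a deliberately simple sketch. The surveys in this article used RDSAT, whose estimator also accounts for recruitment patterns; the code below is not that estimator. It uses inverse-degree weighting (often called the RDS-II or Volz-Heckathorn estimator), in which each participant is weighted by the reciprocal of his or her reported network size. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def inverse_degree_proportion(attended: Sequence[bool],
                              network_sizes: Sequence[int]) -> float:
    """Estimate the proportion with a trait, weighting each respondent by 1/network size.

    This is a simplified stand-in for the RDSAT adjustment, not a reimplementation of it.
    """
    if len(attended) != len(network_sizes) or len(attended) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(size <= 0 for size in network_sizes):
        raise ValueError("network sizes must be positive")
    weights = [1.0 / size for size in network_sizes]
    weighted_yes = sum(w for w, yes in zip(weights, attended) if yes)
    return weighted_yes / sum(weights)


# Hypothetical sample of 4 respondents: whether each reported using a listed
# service, and the network size each reported.
attended = [True, False, True, False]
network_sizes = [2, 4, 4, 10]

crude = sum(attended) / len(attended)
adjusted = inverse_degree_proportion(attended, network_sizes)
print(f"crude proportion = {crude:.3f}, adjusted proportion = {adjusted:.3f}")
```

In this made-up sample the service users report smaller networks, so they receive larger weights and the adjusted proportion is higher than the crude one.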

With the global scale up of HIV prevention and care services and the expansion of data collection among populations at high risk for infection using RDS,14 the ingredients for the SMM for size estimation are present throughout the world. Unfortunately, few applications of the SMM using RDS have been published in the scientific literature.11,15 In this study, we provide several country examples of how the SMM has been used in RDS surveys, share lessons learned, and provide guidance on how to maximize this method’s use. Specifically, we present data from RDS-SMM studies conducted in Mauritius, Papua New Guinea, Thailand, and the United States. The objective of this study is to increase awareness and to improve future HIV surveillance assessments that use RDS and the SMM.

Selection of Studies and Eligibility Criteria

Based on availability of data to the authors, diversity of the populations and locations, the existence of services, prior publication, and prior population size estimates, we purposely chose recent surveys conducted in 6 urban areas in 4 countries among 3 at-risk populations—FSW, IDU, and MSM (Table 1). Each of these surveys was conducted with the involvement of 1 or more of the authors of this article and was selected to provide examples from different geographical regions, cultural contexts, and population types. Specifically, the locations and the populations are as follows: Mauritius—IDU15 and FSW16; Papua New Guinea, Port Moresby—FSW and MSM and Goroka—FSW17,18; Thailand—Bangkok, IDU and Chiang Mai—IDU19; and United States, San Francisco—IDU.20 For San Francisco, data from 2 rounds of the National HIV IBBS for IDU in 2005 and 2009 were used to show size estimations of the same population at 2 time points.21

In Mauritius, IDU and FSW were 15 years or older and living in Mauritius. In addition, IDU were males and females injecting drugs in the previous 3 months, and FSW were females having vaginal, anal, or oral sex in the last 6 months with a male in exchange for money or goods. In Papua New Guinea, FSW and MSM were 16 years or older and living in the respective survey city. In addition, FSW were females having exchanged sex for money or goods and services in the past 12 months, and MSM were defined as males having had sex with another man in the past 12 months. In Thailand, IDU were defined as males or females 18 years and older injecting drugs in the previous 6 months and living or working in Bangkok or Chiang Mai. In San Francisco, IDU were defined as males or females 18 years or older who had injected illicit drugs in the past 12 months and were residents of San Francisco. All RDS surveys required that participants, with the exception of seeds, present a coupon to enroll.

IBBS Sampling Methods

Each of the RDS surveys presented here included essential methodological requirements such as recruitment initiation with a diverse set of seeds identified through local contacts and well connected to and trusted by the target population, the use of a quota of recruitment coupons per participant, incentives for participation and peer recruitment, collection of social network size data from each participant, tracking who recruited whom, facilitation of long recruitment chains and consequent attainment of equilibria for key variables of interest, and inclusion of a projected design effect of at least 2 in the sample size calculation.14 The interviewing methods, questionnaires, incentives, and specimen collection varied by location. Respondent-driven sampling surveys were approved through respective ethical review committees and fulfilled a consent process before recruits could enroll.

Service Multiplier Data

The SMM used 2 sources of information. For the first source, institutions or organizations provided aggregate count data of the number of population members (e.g., MSM) who met the eligibility criteria of the corresponding RDS survey and who had received services or participated in a project or study during a specific period. Organizations were asked to estimate or provide an exact count in which each person who received the service or participated in a project or study was counted only one time, regardless of whether individuals sought services or participated numerous times. For the second source, data were collected through the RDS survey by asking each participant whether he/she had attended a particular corresponding service, project, or study at least one time during a specific period matched to that provided by the institutions or organizations. For the RDS surveys conducted in Papua New Guinea and Thailand, the median interview date among respondents was used as the reference end date to match the period with the service data (Port Moresby FSW: November 5, 2006; MSM: November 5, 2006; Goroka FSW: January 12, 2006; Bangkok IDU: July 15, 2009; Chiang Mai IDU: May 31, 2009) (D. Prybylski, personal communication). For the RDS surveys in Mauritius (FSW: period from August 2009 to August 2010; IDU: period from October 2008 to October 2009) (L. G. Johnston, personal communication) and San Francisco (period from May 8, 2004, to December 11, 2005, for 2005 data and from July 26, 2008, to December 11, 2009, for 2009 data) (H. F. Raymond, personal communication), a specified timeframe previous to the start of the RDS survey was used to match with the service data. The survey data were analyzed using RDSAT to derive adjusted (and representative) population proportions, which were combined with the count from each provider to calculate the population size.

Multiplier Formula

The numerator for the multiplier formula is the count of persons who attended a service, project, or study during a specified period, and the denominator is the proportion provided through the RDS survey. The mathematical formula to calculate the total population size is N = (1/P) × M = M/P,1 where N is the estimated population size, P is the RDSAT-adjusted proportion of survey participants who reported attending a service, project, or study listed in the RDS survey, and M is the total number of population members counted by that service, project, or study.
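The multiplier calculation can be sketched in a few lines of Python. The first call reuses the illustration from the introduction (1000 FSW served, 10% of surveyed FSW reporting use of the service); the list of several multipliers is hypothetical and is included only to show how a median across multipliers would be taken. Function names are assumptions made for this sketch.

```python
from statistics import median
from typing import Iterable, Tuple


def multiplier_estimate(service_count: float, survey_proportion: float) -> float:
    """Return N = M / P for one service multiplier."""
    if service_count < 0:
        raise ValueError("service count M must be non-negative")
    if not 0 < survey_proportion <= 1:
        raise ValueError("survey proportion P must be in (0, 1]")
    return service_count / survey_proportion


def median_estimate(multipliers: Iterable[Tuple[float, float]]) -> float:
    """Return the median of N = M / P across several (M, P) pairs."""
    return median(multiplier_estimate(m, p) for m, p in multipliers)


# Illustration from the introduction: M = 1000 clients, P = 10%.
print(round(multiplier_estimate(1000, 0.10)))  # 10000

# Hypothetical (M, P) pairs for three services covering the same population.
hypothetical_multipliers = [(1000, 0.10), (500, 0.04), (300, 0.05)]
print(round(median_estimate(hypothetical_multipliers)))
```

Taking the median rather than the mean keeps a single extreme multiplier from dominating the summary.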

Confidence Intervals Around the Population Size Estimates

To calculate confidence intervals (CIs) around the population size estimates, we used the RDSAT standard errors (SEs) for the previously mentioned P. The CI for P was calculated using a bootstrap procedure, with replicate estimates giving the lower and upper limits of a 95% CI (percentiles of 2.5 and 97.5). The CIs also accounted for the uncertainty around the number of individuals who actually received services or participated in a project or study: M was treated as Poisson distributed, with variance equal to its mean, and a normal distribution was used as a good approximation of that Poisson distribution. Here, M is the number of individuals who received services or participated in a project or study from the listed services, projects, or studies; α is the type I error (set at a maximum of 0.05); and Z_(1 − α/2) is the corresponding standard normal quantile. When the type I error is 0.05, Z_(1 − α/2) is equal to 1.96.

The variances for M and P were combined by using the following formula (delta method)22:
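A standard first-order delta-method approximation for the ratio N = M/P, assuming M and P are independent, is Var(N) ≈ Var(M)/P² + M²·Var(P)/P⁴. The sketch below applies it with Var(M) = M (the Poisson assumption stated above) and Var(P) equal to the squared SE of P. Whether this matches the authors' exact expression is an assumption, and the function name and input values are hypothetical.

```python
import math
from typing import Tuple


def smm_confidence_interval(service_count: float,
                            survey_proportion: float,
                            se_proportion: float,
                            z: float = 1.96) -> Tuple[float, float, float]:
    """Return (N, lower, upper) for N = M / P using a first-order delta method.

    Assumes M ~ Poisson (variance equal to M) and that M and P are independent.
    """
    if service_count < 0 or not 0 < survey_proportion <= 1 or se_proportion < 0:
        raise ValueError("invalid inputs")
    n_hat = service_count / survey_proportion
    var_m = service_count  # Poisson: variance equals the mean
    var_p = se_proportion ** 2
    var_n = (var_m / survey_proportion ** 2
             + service_count ** 2 * var_p / survey_proportion ** 4)
    half_width = z * math.sqrt(var_n)
    return n_hat, n_hat - half_width, n_hat + half_width


# Hypothetical inputs: M = 1000 clients, P = 0.10 with SE 0.02.
n_hat, lower, upper = smm_confidence_interval(1000, 0.10, 0.02)
print(f"N = {n_hat:.0f}, 95% CI {lower:.0f} to {upper:.0f}")
```

Because the interval is symmetric around N, a small P with a large SE can push the lower limit below the service count M itself; the function returns the limits unclipped so that such cases stay visible.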

Population size estimate results are shown in Table 2. In Mauritius, 5 service multipliers were available for IDU, including substance use treatment, outreach and education, and needle exchange programs.15 Population size estimates ranged from 5699 to 10,444 (median, 8866). Four multipliers were available for FSW in Mauritius, with corresponding population size estimates ranging from 254 to 945 (median, 667).

In Papua New Guinea, one service data source (Save the Children’s Sapot Intervention Project providing peer outreach, voluntary testing, and counseling and STI clinic services) included all 3 samples, resulting in population size estimates of 4212 for FSW in Port Moresby, 8712 for FSW in Goroka, and 2126 for MSM in Port Moresby.

In Thailand, one multiplier (Tenofovir intervention study) was available for the IDU population in Bangkok, producing an estimate of 4200. In Chiang Mai, 2 multipliers were available: enrollment in the Suboxone randomized controlled trial and in the Famai methadone treatment clinic. Respective population size estimates were 600 and 1500.

For estimates of the population size of IDU in San Francisco, 4 multipliers were available in 2005 and 6 in 2009. The 2005 multipliers included HIV cases for IDU reported to the health department and alive at the end of 2005, participants of an ongoing cohort study, the number arrested in San Francisco county in 2004, and clients of Walden House (substance use treatment center) at the time of the RDS survey. These estimates ranged from 10,130 to 23,779 IDU (median, 15,334). The 2009 multipliers included HIV cases among IDU reported and alive at the end of 2009, current clients of Walden House, emergency room treatment for overdose and surviving in 2008, using any public anonymous HIV testing site in 2008, testing for HIV at the municipal STI clinic in 2008, and being screened for an STI at the clinic in 2008. Population point estimates for IDU in 2009 ranged from 5200 to 81,500 (median, 15,554).

This article describes the application of the SMM in conjunction with RDS surveys to estimate the size of 3 hard-to-reach populations at elevated risk for HIV infection in 6 areas on 3 different continents. Implementing RDS and finding service multipliers proved feasible for all target populations. More than 1 service multiplier was found for 4 of the 8 different target populations. For San Francisco IDU, the median point estimate calculated in 2005 was comparable with that in 2009 using 2 separate RDS surveys and several multipliers. Nevertheless, there was a considerable variation by different service sources in the same year and across years using the same service source.

In the absence of a gold standard, the plausibility of our estimates can be assessed only by comparison with estimates in other contexts using different methods (Table 3) and by judging the face validity of the estimate as a percent of the total adult population. According to our median estimates, IDU make up approximately 2.5% of the 2005 adult population of San Francisco (18 years and older, 2005: 610,497), 2.2% of the 2009 adult population of San Francisco (18 years and older, 2009: 697,947), 0.95% of the adult population in Mauritius (15–49 years old, 2009: 930,000), 0.11% of Chiang Mai (15–49 years old, 2009: 938,827), and 0.13% of Bangkok (15–49 years old, 2009: 3,116,474). Our estimate for San Francisco is close to the 2.35% estimated by Brady et al.,23 falling above the median of 0.96% but well within the range for 96 US metropolitan areas of 0.37% to 3.36%. A review of the prevalence of IDU in the adult population of developing regions of the world projects a range of 0.004% to 1.47% for Asia and the Pacific and 0.0003% to 0.35% for the Middle East and Africa.4,24 Although these yardsticks have large ranges, our estimates for Chiang Mai and Bangkok fall in the regional range, whereas our estimate for Mauritius seems above the range for Africa and above an independent estimate of 0.12%.4,24 However, Mauritius may not be comparable with other African countries because of its long history of drug injection and its standing as having the second highest per-capita rate of illicit opiate use in the world, after Iran.25 Our estimate of 4200 IDU in Bangkok is also comparable with the 3595 estimated by multiplier method using 17 methadone clinics.26 The population size of IDU in Western Europe has been estimated from 0.04% to 1.06% on a national level.4 In addition, the Joint United Nations Programme on HIV/AIDS Workbook method for estimating HIV prevalence and incidence notes that few countries outside highly urban developed countries will have more than 0.7% of the adult population as IDU. Thus, the numbers of IDU in San Francisco seem relatively high compared with a national estimate, but may be plausible for a small but densely populated urban area with a historical concentration of drug users.
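The face-validity check described in this paragraph amounts to dividing each median estimate by the adult population and comparing the result with a published reference range. A minimal sketch follows, using only figures quoted in this article; the pairing of each site with a reference range follows the discussion above, and the function and variable names are assumptions made for this sketch.

```python
def percent_of_adults(size_estimate: float, adult_population: float) -> float:
    """Express a population size estimate as a percent of the adult population."""
    return 100.0 * size_estimate / adult_population


def in_range(value: float, low: float, high: float) -> bool:
    """Return True when value lies inside the closed reference range."""
    return low <= value <= high


# (label, median IDU estimate, adult population, reference range in percent)
cases = [
    ("San Francisco 2009", 15554, 697947, (0.37, 3.36)),   # 96 US metropolitan areas
    ("Mauritius 2009", 8866, 930000, (0.0003, 0.35)),      # Middle East and Africa
    ("Chiang Mai 2009", 1050, 938827, (0.004, 1.47)),      # Asia and the Pacific
    ("Bangkok 2009", 4200, 3116474, (0.004, 1.47)),        # Asia and the Pacific
]

for label, estimate, adults, (low, high) in cases:
    pct = percent_of_adults(estimate, adults)
    verdict = "within" if in_range(pct, low, high) else "outside"
    print(f"{label}: {pct:.2f}% of adults, {verdict} {low}% to {high}%")
```

Under these inputs only the Mauritius estimate falls outside its regional range, which is the case the text goes on to discuss.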

Our estimates for FSW fall within the range of other estimates from Africa (0.1%–12.0% of adult women)10 in the case of Mauritius (0.17%; female, 15–49 years old; 2009: 388,796) and are high in the case of Port Moresby (4.77%; female, 15–49 years old; 2006: 87,678). Of note, the prevalence of FSW was 1.5% for the whole neighboring Indonesian-controlled province of Papua,10 which is relatively high for the nation; therefore, our estimate may be plausible given that Port Moresby is an urban port in the independent country of Papua New Guinea, on the same island of New Guinea.

Finally, the world literature cites the proportion of adult men who practice male-to-male sex as 2% to 5%,27 which is consistent with our estimate for Port Moresby (2.06%; male, 15–49 years old; 2006: 103,336). We propose that our current estimates can form a data-based starting point upon which additional, more rigorous data can be added and to which other population size estimation methods can be compared.

Aside from the plausibility of results, in the absence of a gold standard, the SMM is difficult to validate as a method apart from judging the soundness of the underlying assumptions, whether they can be met, and the likely impact of biases to which the method is vulnerable. The underlying assumption most often and most severely violated is the independence of the 2 data sources. It is plausible that the subgroups of the target population most likely to use services or participate in projects or studies are also those most likely to participate in health surveys (i.e., by virtue of self-identifying as part of the population, by visibility in the area, and by increased access to services and relevant information). This positive correlation will result in an underestimation of the total population size (i.e., the overlap between service use and RDS survey participation is exaggerated). This may have been the case for IDU in Chiang Mai, where the RDS survey was conducted in the same location as the Suboxone study, and may explain the substantially lower population size estimate using this multiplier compared with the methadone clinic. This type of bias is further compounded when the RDS survey is conducted in collaboration with service provider staff, at the program site, or initiated with service client seeds. On the positive side, this bias may be mitigated by the RDS methodology in that most samples strive to have long chains of recruitment, whereby the longer the chains, the higher the chances that less accessible members of the population are recruited into the RDS survey. Moreover, the proportion of the population estimated to use services or participate in projects and studies is adjusted to account for recruitment patterns. As a case in point, consider the likely greater bias with venue-based surveys conducted in settings where the target population is visibly concentrated and services provide outreach education and other programs. 
Facility-based sampling designs (e.g., at substance use treatment centers) may be hopelessly biased by self-selection of the 2 sources and therefore unsuitable for the SMM size estimation method. Although the multiple-sample capture-recapture method can model nonindependence and therefore adjust estimates,6 the SMM cannot because individuals are not linked across data sources. The use of a total count and history of service use (rather than individual data linkage) has the strength of preserving anonymity but carries the limitation of inability to adjust for nonindependence.

Another source of bias arises when specific population definitions between the 2 data sources do not match (i.e., the eligibility criteria of the RDS survey and service data source classifications of their clients, the timeframe of the behaviors in question). Of note, the SMM does not assume that the service is provided to a representative “sample” of the target population (in fact, service provision is usually not equally distributed), but rather, the service population is entirely contained within the RDS survey target population. Thus, a service may only be provided to one part of the population (e.g., brothel-based FSW) as long as that segment of the total population is contained within the RDS survey target population and estimated within the multiplier (e.g., a service reaching 15% of brothel-based FSW who make up 10% of all FSW would reach 1.5% of the total FSW population). As long as the eligibility criteria include those counted in the service, the assumption is not violated. However, if the opposite occurs that part of the service area falls outside the catchment of the RDS survey or the survey eligibility criteria exclude the clients in the service count, by geography or definition, then biases will result with respect to the population of inference. For example, if an RDS survey is planned for a defined city using residence as an eligibility criterion, then a service including persons outside the city will be upward biased (i.e., the count is exaggerated relative to the proportion estimated in the survey). The geographic boundaries of the RDS survey (either as planned or actually occurs in practice) and the service also carry uncertainty as to how far inference can be extrapolated. For example, it is likely that both service users and survey participants were concentrated in Port Louis, the capital of Mauritius, although in principle, the target area was the entire main island. 
Similarly, Bangkok is a vast and populous city creating unclear boundaries for the extent of recruitment for the Tenofovir study and the RDS survey.
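The two situations in this paragraph can be made concrete with a small numerical sketch. It reuses the 15% and 10% figures from the brothel-based example; the true population size of 10,000 and the 30 clients from outside the survey area are hypothetical values chosen only to make the arithmetic visible.

```python
def multiplier_estimate(service_count: float, survey_proportion: float) -> float:
    """Return N = M / P."""
    return service_count / survey_proportion


# Hypothetical true size of the whole FSW population in the survey area.
true_total = 10_000

# Figures from the text: brothel-based FSW are 10% of all FSW, and the
# service reaches 15% of brothel-based FSW.
brothel_share = 0.10
reach_within_brothels = 0.15

# Case 1: the service population is fully contained in the survey population.
service_count = true_total * brothel_share * reach_within_brothels  # 150 clients
survey_proportion = brothel_share * reach_within_brothels           # 1.5% of all FSW
contained = multiplier_estimate(service_count, survey_proportion)

# Case 2: the service count also includes clients from outside the survey
# area (hypothetical 30), who cannot appear in the survey proportion.
outside_clients = 30
inflated = multiplier_estimate(service_count + outside_clients, survey_proportion)

print(f"service contained in survey population: N = {contained:.0f}")
print(f"service count includes outsiders:       N = {inflated:.0f}")
```

The first case recovers the assumed total even though the service covers only one segment of the population; the second overstates it by the same ratio in which the count is inflated.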

Other sources of bias result from the quality of the service and RDS survey data, with myriad possible errors. Common service data errors include multiple counts of the same client (e.g., counts of service provision rather than individuals), lack of ascertainment of population membership in the program data (e.g., not all women are asked if they are FSW), and the assumption that clients are members of the population (e.g., all women are assumed to be FSW). Common RDS survey data errors include poor recall of the particular service (e.g., obtaining HIV testing at a different venue than the source of the multiplier data) and of the dates of the service. The last issue may be complicated by the recall of RDS survey participants and question clarity. For example, the timeframe used to match data should be specified (e.g., June 1, 2010–December 31, 2010) rather than referred to as the “preceding 6 months,” which could result in a mismatch of timeframe, especially if the RDS survey recruits participants over a long period. Given the diverse possibilities for error, the direction of these above biases may be unpredictable.

Finally, accurate estimates are dependent on the quality of the RDS sampling method. One underlying assumption common to the points above and essential to the overall approach is that the RDS survey is representative of the target population. This assumption is in itself difficult to verify, even for purposes of measuring and tracking HIV prevalence and risk indicators. At present, obtaining representative samples of hidden populations is an area of active research also operating without a gold standard in most instances. The persistence of stigma, discrimination, and criminalization of drug use, commercial sex, and male-male sex precludes having complete sampling frames and accurate data for these populations. Furthermore, RDS can produce large CIs around an estimate, resulting in population size estimations that are difficult to interpret. A design effect of 2 or lower has been most commonly used.14 However, using design effects larger than 2, perhaps closer to 3 or 4, as has been suggested, could improve the accuracy of estimates by increasing sample sizes.28,29

Although the previously mentioned considerations are related to systematic errors, there is also uncertainty generated by random error. We present an approach to calculating standard errors that takes into account the variance from the prevalence of service use from RDS data and from the service multiplier source. However, the method produced some implausible CIs. For example, the lower bound for IDU in Chiang Mai was 40, whereas the same service multiplier source counted 69. Several estimates were too imprecise to be practically useful or credible. For example, the number of IDU in San Francisco based on 1 estimate had a CI from 658 (fewer than known living IDU registered in HIV clinics) to 162,343 (one fourth the adult population).

To be fair, other methods to estimate the size of hidden populations are also subject to potentially severe biases and uncertainties. Methods that rely on mapping and visual counts are likely to miss the more hidden segments of the population, probably to a greater extent than RDS surveys. True capture-recapture studies are also required to meet several assumptions that are difficult to verify.1 The network scale-up method is vulnerable to several biases related to what information persons have (level of transparency) or can recall about their social networks.2,3 As stressed previously, until we have a true census of the populations at the highest risk for HIV, we will not be able to fully assess the validity of the methods or accuracies of their results. In the meantime, we can bolster our confidence in population size estimates by implementing existing methods as rigorously as possible, applying and comparing multiple methods, understanding their limitations, reconciling differences, searching for agreement between them, and qualitatively gauging their plausibility with stakeholders who are members of the target population or providers working with the population. Because the SMM makes use of existing data and RDS has rapidly become a sampling method widely used for IBBS,14 the approach presented in this study has strong appeal, low cost (the indirect cost of time for adding additional questions to an existing RDS survey and obtaining service data to calculate estimates), and high feasibility in many settings.

Several conclusions and suggestions are evident in the lessons learned in SMM experiences presented here. Given the primary threat to validity through nonindependence of the RDS survey and service use, we hold RDS to be superior to facility-based samples for estimating population size when using the SMM. The limitation can be further addressed by explicitly delinking RDS implementation from service sites and personnel, choosing diverse seeds with respect to service sites and users and nonusers, tracking equilibrium with respect to service use, and using estimates adjusted for network size and recruiter-recruit similarities. We further suggest using multiple multipliers whenever possible, garnering service data from the widest possible range of sources. A median of several estimates is likely more robust to potential biases than a single estimate. That said, service data sources that are known to be biased or of poor data quality should not be used. Although 95% CIs need to account for multiple sources of statistical uncertainties, we suggest the use of “plausibility bounds” for practical applications of size estimates. Without a gold standard, the quality of the data from which the individual data points are derived, combined with the judgment of the stakeholders, helps set the lower and upper limits of what are plausible and useful population size estimates. For example, plausible lower bounds for estimates should not approach the complete count of a registry or client census, and upper bounds should not tax common sense as a very large proportion of the total adult population. A practical set of rules for plausibility bounds can be developed using data, to the greatest extent possible, to estimate population sizes using multiple multipliers and other methods.30 Lastly, the highest quality of service data possible should be selected for multipliers. 
Rules on the quality of service, project, or study data should be established before RDS survey implementation, including ability to unduplicate clients, availability of cleaned data for the most recent complete calendar year, geographic catchment area, definitions of participant or client risk populations, and completeness of ascertaining and recording population membership of participants or clients.
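
As an illustration of the suggestions above, the following minimal sketch computes one size estimate per service multiplier (service count divided by the RDS-adjusted proportion reporting use of that service), takes the median, and screens it against simple plausibility bounds. All service names, counts, proportions, and the `max_fraction` threshold are hypothetical values chosen for illustration; they are not data from the surveys described here, and the bounds check is only one possible way to encode the rule of thumb in the text.

```python
from statistics import median


def multiplier_estimate(service_count, proportion):
    """Return the size estimate N = M / P for a single service multiplier.

    service_count: unduplicated number of population members served (M).
    proportion: RDS-adjusted proportion of survey participants reporting
        use of that service (P).
    """
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return service_count / proportion


def within_plausibility_bounds(estimate, largest_registry_count,
                               adult_population, max_fraction):
    """Check an estimate against illustrative plausibility bounds.

    The estimate should exceed the largest complete registry or client
    count and should not exceed an assumed share (max_fraction) of the
    total adult population.
    """
    upper_limit = max_fraction * adult_population
    return largest_registry_count < estimate <= upper_limit


# Hypothetical inputs: service name -> (service count M, adjusted proportion P)
multipliers = {
    "needle_exchange": (1200, 0.25),
    "drop_in_center": (450, 0.125),
    "methadone_program": (2000, 0.5),
}

estimates = {
    name: multiplier_estimate(count, proportion)
    for name, (count, proportion) in multipliers.items()
}
median_estimate = median(estimates.values())

for name, value in estimates.items():
    print(f"{name}: {value:.0f}")
print(f"median: {median_estimate:.0f}")
print("plausible:", within_plausibility_bounds(
    median_estimate,
    largest_registry_count=2000,
    adult_population=100_000,
    max_fraction=0.1,
))
```

The median is used here rather than the mean because a single multiplier based on a biased service count or proportion can produce an extreme estimate; the threshold values passed to the bounds check would in practice be agreed with stakeholders.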

We suggest triangulating different methods (capture-recapture, mapping, network scale-up, etc.) and SMM to the extent possible in estimating population sizes. Each method has its own particular strengths and weaknesses and triangulating multiple population size estimates will allow for cross-checks and validation, especially if arrived at using different methods or independent survey data sources. If independent results are in a similar range, they increase confidence in the estimate. We also suggest presenting the range of point estimates (and 95% CIs) generated from different service multipliers to reflect the level of uncertainty inherent in the size estimation process.
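
One simple way to present point estimates with intervals across several multipliers is sketched below. It assumes the interval for each size estimate is obtained by dividing the service count by the upper and lower confidence limits of the RDS-adjusted proportion, which treats the service count as known without error; this is an illustrative convention, not necessarily the calculation used for the estimates in this study, and all names and numbers are hypothetical.

```python
def multiplier_interval(service_count, p_hat, p_lower, p_upper):
    """Return (lower, point, upper) size estimates for one multiplier.

    The interval is obtained by inverting the confidence limits of the
    proportion: a larger proportion implies a smaller population size.
    Uncertainty in the service count itself is ignored in this sketch.
    """
    if not 0 < p_lower <= p_hat <= p_upper <= 1:
        raise ValueError("require 0 < p_lower <= p_hat <= p_upper <= 1")
    return (service_count / p_upper,
            service_count / p_hat,
            service_count / p_lower)


# Hypothetical inputs: service -> (count M, proportion P, 95% CI limits of P)
services = {
    "service_a": (1000, 0.25, 0.125, 0.5),
    "service_b": (600, 0.125, 0.0625, 0.25),
}

results = {name: multiplier_interval(*args) for name, args in services.items()}

# Range of point estimates across multipliers, reported alongside each interval
point_estimates = [point for _, point, _ in results.values()]
estimate_range = (min(point_estimates), max(point_estimates))

for name, (lower, point, upper) in results.items():
    print(f"{name}: {point:.0f} (95% CI {lower:.0f}-{upper:.0f})")
print(f"range of point estimates: {estimate_range[0]:.0f}-{estimate_range[1]:.0f}")
```

Reporting the range together with each interval makes visible both the sampling uncertainty within a multiplier and the disagreement between multipliers.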

We hold that the quality of service data can be improved by sharing results of the size estimation calculations with the collectors, owners, and users of these data. For instance, population size estimations from the surveys presented here have been shared through dissemination meetings, reports, and presentations at conferences. In addition to understanding the importance of complete population membership ascertainment and accurate client counts, such stakeholders can corroborate or refute the plausibility of results. Most importantly, the feedback loop will foster greater use of data—including other results of RDS surveys, service data, and the resulting size estimations. Population size estimates are needed to advocate for and plan prevention and care services, to set targets for delivering programs, and to assess their reach and coverage. Broadly, the size of the population at risk is fundamental to understanding the distribution and determinants of any disease and therefore to crafting an effective response.

1. UNAIDS/WHO Working Group on Global HIV/AIDS and STI Surveillance. Estimating the Size of Populations Most at Risk to HIV Infection: Participant Manual. 2010. Available at: http:// Accessed January 3, 2013.
2. Bernard RH, Hallet T, Iovita A, et al. Counting hard-to-count populations: The network scale-up method for public health. Sex Transm Infect 2010; 86: ii11–ii15 doi:10.1136/sti.2010.044446.
3. Salganik MJ, Fazito D, Bertoni N, et al. Assessing network scale-up estimates for groups most at risk of HIV/AIDS: Evidence from a multiple-method study of heavy drug users in Curitiba, Brazil. Am J Epidemiol 2011; 174: 1190–1196.
4. Aceijas C, Stimson GV, Hickman M, et al. United Nations Reference Group on HIV Prevention and Care Among IDU in Developing and Transitional Countries. Global overview of injecting drug use and HIV infection among injecting drug users. AIDS 2004; 18 (17): 2295–2303.
5. Heckathorn DD, Semaan S, Broadhead RS, et al. Extensions of respondent-driven sampling: A new approach to the study of injection drug users aged 18–25. AIDS Behav 2002; 6: 55–67.
6. Hickman M, Hope V, Platt L, et al. Estimating prevalence of injecting drug use: A comparison of multiplier and capture-recapture methods in cities in England and Russia. Drug Alcohol Rev 2006; 25: 131–140.
7. Paz-Bailey G, Jacobson JO, Guardado ME, et al. How many men who have sex with men and female sex workers live in El Salvador? Using respondent-driven sampling and capture-recapture to estimate population sizes. Sex Transm Infect 2011; 87: 279–282.
8. Uusküla A, Rajaleid K, Talu A, et al. Estimating injection drug use prevalence using state wide administrative data sources: Estonia 2004. Addiction Res Theory 2007; 16: 411–424.
9. Vadivoo S, Gupte MD, Adhikary R, et al. Appropriateness and execution challenges of three formal size estimation methods for high-risk populations in India. AIDS 2008; 22 (suppl 5): S137–S148.
10. Vandepitte J, Lyerla R, Dallabetta G, et al. Estimates of the number of female sex workers in different regions of the world. Sex Transm Infect 2006; 82: iii18–iii25 doi:10.1136/sti.2006.020081.
11. Heimer R, White E. Estimation of the number of injection drug users in St. Petersburg, Russia. Drug Alcohol Depend 2010; 109: 79–83.
12. Heckathorn D. Respondent driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Soc Probl 2002; 49: 11–34.
13. Salganik MJ, Heckathorn DD. Sampling and estimation in hidden populations using respondent-driven sampling. Sociol Methodol 2004; 34: 193–239.
14. Malekinejad M, Johnston LG, Kendall C, et al. Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: A systematic review. AIDS Behav 2008; 12 (suppl 1): 105–130.
15. Johnston LG, Saumtally A, Corceal S, et al. High HIV and hepatitis C prevalence amongst injecting drug users in Mauritius: Findings from a population size estimation and respondent driven sampling survey. Int J Drug Policy 2011; 22: 252–258.
16. Johnston LG, Corceal S. Unexpectedly high injection drug use, HIV and hepatitis C prevalence among female sex workers in the Republic of Mauritius. AIDS Behav 2012. [Epub ahead of print].
17. Maibani-Michie G, Kavanamur D, Jikian M, Siba P. Evaluation of the Poro Sapot Project: Baseline and Post Intervention Studies. An HIV prevention program among FSWs in Port Moresby and Goroka and among MSM in Port Moresby, Papua New Guinea. Prybylski D, Yeka W, Mateo R, eds. PNG Institute of Medical Research, FHI Technical Report to USAID, Port Moresby 2007.
18. Yeka W, Maibani-Michie G, Prybylski D, et al. Application of respondent driven sampling to collect baseline data on FSWs and MSM for HIV risk reduction interventions in two urban centres in Papua New Guinea. J Urban Health 2006; 83: 60–72.
19. Yongvanitjit K, Manopaiboon C, Aramrattana A, et al. Risk behaviors and high HIV prevalence among injecting drug users in respondent driven sampling surveys in Bangkok and Chiang Mai, Thailand. International Conference on AIDS (18th: 2010: Vienna). 2010 July 18–23: abstract no. TUPE0311.
20. Malekinejad M, McFarland W, Vaudrey J, et al. Accessing a diverse sample of injection drug users in San Francisco through respondent-driven sampling. Drug Alcohol Depend 2011; 118: 83–91.
21. Gallagher KM, Sullivan PS, Lansky A, et al. Behavioral surveillance among people at risk for HIV infection in the U.S.: The National HIV Behavioral Surveillance System. Public Health Rep 2007; 122 (suppl 1): 32–38.
22. Hoel PG. Introduction to Mathematical Statistics. Ann Arbor, MI: Wiley Series in Probability and Statistics, the University of Michigan, 1984. ISBN 0471890456.
23. Brady JE, Friedman SR, Cooper HLF, et al. Estimating the prevalence of injection drug users in the U.S. and in large U.S. metropolitan areas from 1992 to 2002. J Urban Health 2008; 85: 323–351. doi:10.1007/s11524-007-9248-5.
24. Mathers B, Degenhardt L, Phillips B, et al. The global epidemiology of injecting drug use and HIV among people who inject drugs: A review. Lancet 2008; 372: 1733–1745. Epub 2008 Sep 23.
25. UNODC (2009). World drug report 2009. Geneva, Switzerland. Available at: Accessed June 11, 2010.
26. Wattana W, van Griensven F, Rhucharoenpornpanich O, et al. Respondent-driven sampling to assess characteristics and estimate the number of injection drug users in Bangkok, Thailand. Drug Alcohol Depend 2007; 90: 228–233.
27. Caceres CF, Konda K, Segura ER, et al. Epidemiology of male same-sex behaviour and associated sexual health indicators in low- and middle-income countries: 2003–2007 estimates. Sex Transm Infect 2008; 84 (suppl 1): i49–i56.
28. Johnston LG, Chen Y, Silva-Santisteban A, et al. An empirical examination of respondent driven sampling design effects among HIV risk groups from studies conducted around the world. AIDS Behav. In Press.
29. Wejnert C, Pham H, Krishna N, et al. Estimating design effect and calculating sample size for respondent-driven sampling studies of injection drug users in the United States. AIDS Behav 2012; 16 (suppl 1): 797–806. doi:10.1007/s10461-012-0147-8.
30. Grassly NC, Morgan M, Walker N, et al. Uncertainty in estimates of HIV/AIDS: The estimation and application of plausibility bounds. Sex Transm Infect 2004; 80 (suppl 1): i31–i38.
© Copyright 2013 American Sexually Transmitted Diseases Association