Three large-scale HIV-related datasets were generated in 2006–2007 in India: National Sentinel Surveillance (which is a routine data collection exercise carried out every year), the National Family Health Survey, and the Integrated Behavioural and Biological Assessment (IBBA). Each survey had separate objectives and methodologies, and covered different sets of participants. The IBBA was unique in India, and in the world, because of its size (over 25 000 respondents from populations most at risk of HIV), its diversity (in terms of sampling methods, questionnaires, languages and regions), and the network of institutions involved in completing it. The survey was spread out over 29 districts in six high-prevalence states and faced numerous logistical, technological, managerial and ethical challenges. Whereas the group-specific results are discussed in other papers, the overall design of the IBBA and some of the major methodological challenges and how they were addressed are described separately in this paper.
The IBBA round 1 was the first large-scale probability sample survey in India that included both behavioural and biological indicators among populations most at risk of the transmission of HIV, including female sex workers (FSW) and their clients, high-risk men who have sex with men (MSM), hijras (transgender individuals), injecting drug users (IDU), and long-distance truck drivers (LDTD). Round 1 of the IBBA was carried out over a 19-month period between November 2005 and June 2007, and was intended to serve as a baseline for evaluating the impact of the Avahan India AIDS initiative, a large HIV prevention programme supported by the Bill and Melinda Gates Foundation in six of the highest prevalence states in India, i.e. Andhra Pradesh, Karnataka, Maharashtra, Manipur, Nagaland and Tamil Nadu.
IBBA round 1 was implemented by five institutes of the Indian Council of Medical Research (the National Institute of Nutrition in Andhra Pradesh, the National Institute of Epidemiology in Tamil Nadu, the Regional Medical Research Centre in Manipur and Nagaland, the National Institute of Medical Statistics on the national highway), and was coordinated by the National AIDS Research Institute (NARI), Pune, the Indian Council of Medical Research's (ICMR) nodal institute for research in HIV/AIDS. Fieldwork was conducted by selected Indian professional research agencies. The Karnataka Health Promotion Trust (KHPT) was responsible for the IBBA in Karnataka and followed a similar methodology with some modifications (which are highlighted in the text of this paper). Overall technical support for the IBBA was provided by Family Health International (FHI).
The purpose of this paper is to describe the design and subsequent modifications to the IBBA protocol necessitated by field conditions and the large scale of the survey, and to discuss some of the methodological considerations when using these data to evaluate programme impact or to contribute more broadly to the characterization of the HIV epidemic in India.
Objectives of the integrated biobehavioural assessment
Consistent with HIV epidemic patterns in Asia, the majority of HIV transmission in India is driven by three high-risk behaviours: unprotected commercial sex, sharing of injection equipment, and unprotected sex between men. The majority of infections continue to occur among populations who engage in these behaviours and their immediate sexual partners, with men who buy sex and their female partners constituting the largest group of people living with HIV [3,4]. The Avahan India AIDS initiative of the Bill and Melinda Gates Foundation began in 2003 with one of its major goals being to reverse the trajectory of HIV in India by averting HIV infections in high-risk groups, bridge groups, and ultimately in the general population. To achieve this Avahan is implementing HIV prevention interventions in 83 districts among FSW and their clients, high-risk MSM, LDTD and IDU [5,6]. The geographical and target population focus of these interventions complements government's and other donors' interventions with the goal of providing HIV prevention services to over 80% of the high-risk populations in the districts.
The main objectives of the IBBA for programme evaluation purposes were threefold: (1) to measure the major outcomes of the Avahan India AIDS initiative by collecting behavioural, biological and programme coverage trend data in populations targeted by the interventions; (2) to provide an additional source of size estimates for populations targeted by the project in IBBA districts; and (3) to make information available for use in transmission dynamics models, and provide evidence of Avahan's impact [7–9].
Uniqueness of integrated biobehavioural assessment in relation to other data collection in India
Target populations studied in the IBBA were similar to those included in the national HIV serosurveillance and behavioural surveillance system for most at-risk populations [1,10–13]. The IBBA, however, collected a more comprehensive set of information at the district level and used community-based probability sampling methods to maximize external validity (i.e. representativeness) of the samples for the source populations in the concerned districts covered by the survey. In contrast, HIV serosurveillance is performed by sampling drop-in clients at sentinel sites, and the national behavioural surveillance system uses community-based probability sampling methods for behavioural data among high-risk groups at the state level. The sentinel sites for HIV serosurveillance are targeted intervention sites (i.e. sites where there are programmes for high-risk groups implemented by local non-governmental organizations; NGO). These sites, by definition, are limited to the subset of the source population who are beneficiaries of the NGO programme. They may therefore be less representative of the larger source community at a district level. The IBBA was heavily resourced to provide unbiased measures of the risk behaviours and levels of HIV and sexually transmitted infections (STI) in survey areas that may greatly influence the continuing spread of HIV in the country.
Materials and methods
The IBBA was intended to be a repeat cross-sectional survey designed to measure changes in key behavioural and biological indicators among selected populations over the life of the Avahan project. The first round of the IBBA covered 29 of the 83 districts where Avahan is being implemented and covered 25 162 respondents including FSW in 25 districts, high-risk MSM/hijras in 11 districts (plus one sample over five districts of hijras in the state of Tamil Nadu), IDU in five districts, clients in 12 districts, and LDTD on four route categories of the national highway (see Tables 1 and 2 and Figs 1 and 2 – survey groups by state and district).
Indicators measured by the integrated biobehavioural assessment
Round 1 of the IBBA focused on measuring proximate determinants (i.e. those factors most directly linked to HIV transmission), to contribute to the eventual impact evaluation of the programme. The survey differed from the usual second generation surveillance model [17,18], in that it collected both behavioural data and biological specimens on the same sample of individuals [19,20]. The proximate determinants studied included numbers and types of sexual partners, the number of sex acts by partner type, condom use by partner type, some measures of partner concurrency (multiple types of sex partners in the past one year), injection drug use, needle sharing, needle cleaning, medical injections (for some groups), other STI [syphilis, Neisseria gonorrhoeae, Chlamydia trachomatis, herpes simplex virus type 2 (HSV-2), on 10% of the samples], genital ulcers (on the subset with external ulcers who consented to examination), biological susceptibility (HIV serology indicating the proportion of HIV-negative individuals), and circumcision [21–23]. Hepatitis B and hepatitis C in IDU were also measured, and BED assays for early HIV infection were performed to estimate the incidence of HIV. Given the tendency of this assay to overestimate HIV incidence, however, which became more evident after the IBBA started, use of the incidence estimates was considered limited. The laboratory working group for the Office of Global AIDS Coordinator, a United States-based organization charged with leading the President's Emergency Plan for AIDS Relief, has recommended that BED capture enzyme immunoassay can be used to estimate HIV-1 incidence in cross-sectional serosurveys. The data must, however, be adjusted to account for the misclassification of individuals with long-term infection as recent infections. The group also recommends the use of the BED assay to study trends in the same populations. Sociodemographic variables (e.g. age, education, marital status) and other variables (e.g. sex work history, mobility and violence), were measured to help Avahan implementing agencies with programme fine tuning. An extensive section of the questionnaire was dedicated to measuring exposure to various intervention components (both Avahan and non-Avahan) to be used in estimating intervention coverage, and also to be used as multipliers for size estimation (see separate paper in supplement on this topic). Finally, because of the emphasis that Avahan interventions placed on addressing high-risk population vulnerability and related impact on risk behaviour, there was a limited section for measuring variables related to the strength of community mobilization.
Unit of analysis
The primary unit of analysis for each survey group in the IBBA was the district. There were several reasons for this. First, it offered the greatest utility for evaluation purposes as well as for characterizing the HIV epidemic in the districts where the Avahan intervention took place. The district corresponds (as well as possible) to the basic area of influence of an NGO (or group of NGO) implementing targeted interventions (both Avahan and non-Avahan). Second, being the main administrative/political unit within the states in India, the district was thought to be more manageable than larger geographical entities (such as states), in terms of sampling. Finally, districts are more likely to be epidemiologically cohesive units, with respect to factors such as the start time of the epidemic and pattern of transmission, which figure significantly in the ability to interpret trends.
Avahan operated under the assumption that they, along with other implementers, would achieve saturated (i.e. 80% or more) programme coverage of high-risk populations, at scale (across large geographical areas), and that the achievements of overlapping programmes would be synergistic, and would in fact go beyond areas of direct programme coverage. For these reasons, and in spite of large variations in the level of intended Avahan coverage in any given district, the IBBA was designed to capture samples that would be representative of entire districts. There were some exceptions to this, when the mapping and sampling of districts covering several hundred kilometres, with many small villages between the larger towns and cities with larger concentrations of high-risk populations, were not feasible. In these cases, the sampling universe was ‘truncated’ to represent those areas with the largest concentrations of high-risk groups (irrespective of intended coverage).
State-level estimates through the random selection of districts within a state or state-wide sample were not specifically considered in the sampling design of the IBBA. District-level point estimates (and trends) are intended to be the main units for which the estimates will be provided. With appropriate weighting and statistical adjustments, however, point estimates for larger epidemiologically cohesive areas are also possible.
Selection of districts
Districts for the IBBA were not selected randomly. The district selection was based on two key criteria: sociocultural region and the size of the FSW population, or, in the case of Manipur and Nagaland, the size of the IDU population. The sociocultural region is a categorization developed through the People of India Project, designed to reflect important cultural and ethnographic differences across the regions of India. The requirement was to have representation from each of the different sociocultural regions within a state to ensure heterogeneity in terms of social, economic and cultural characteristics. This was necessary to help set parameters for transmission dynamics modelling in different types of districts, an important consideration given that modelling was a key element of the overall evaluation design and the cost of collecting comprehensive data in all Avahan districts would have been prohibitive. The decision not to use probability sampling to select districts was in recognition of the enormous diversity across districts and states, with respect to the nature and timing of the epidemic, and the potentially misleading information that would be generated from generalizing across all of Avahan. Within each sociocultural region the criterion was to select the districts with the highest number of persons at high risk (FSW or IDU), based on whatever information was available from various estimates, including mapping by the Avahan partners. It is recognized that given the mobility in these groups, the size of the risk populations (FSW or IDU) is fluid, and attempts to enumerate them at any given point in time will yield different results. As a probability-based method was not being used to select districts, however, the primary concern was to select districts with the highest concentrations of at-risk populations.
Apart from the selections within sociocultural regions in each state, all state capitals were included as IBBA districts, because of their importance in terms of concentrations of at-risk populations. East Godavari (in Andhra Pradesh) was also included because of the Avahan-funded model programme on techniques of community mobilization, and the desire to have extra data there.
Before the start of the IBBA, a presurvey assessment was conducted for each survey group in each state to gather information to help establish the appropriate definitions of the survey group, help establish sampling procedures (i.e. determine types of sites where members of the population congregated, and whether they could be identified and sampled from those locations), investigate available mapping information, learn about ‘gatekeepers’ (e.g. brothel owners, pimps, police) and potential related challenges to survey implementation, and to establish contact with local NGO for their input on these issues. This information was needed to finalize the methodology, especially eligibility criteria and the sampling approach, but also community preparation activities. Particularly important was the understanding about whether location-based sampling would be possible, or if not, whether respondent-driven sampling (RDS) was an option. The assessments were conducted by researchers skilled in qualitative methods and experienced in working with at-risk populations. Avahan state managers and lead implementing partners (grantees) also helped facilitate these assessments. In Karnataka, where extensive site-specific high-risk group estimates, based on an initial mapping and subsequent revalidation, were available, these assessments were not part of the IBBA.
Survey populations for the IBBA followed the operational definitions and eligibility criteria described in Table 3. Definitions varied somewhat by state and sometimes by district. These differences were mainly to ensure that the source population captured in the survey corresponded as closely as possible to the local context and the definitions used by the Avahan lead implementing partners. Broadly speaking, FSW were defined as women aged 18 years or older who had exchanged sex for money at least once in the past one month. Sampling methods further refined this group to represent those who were accessible through specific types of venues (brothels or identifiable street-based sex solicitation points as well as lodges and homes). In districts where home-based sex work was common, considerable effort was put into mapping homes used as commercial sex venues. Women who solicited solely outside of these venues (e.g. in massage parlours, bars, via cell phones, through pimps or agents, etc.) may have been missed in the survey. Likewise, women who exchanged sex only for favours or gifts other than cash were not part of the target population for the survey. As bar-based sex workers were an important segment of the commercial sex industry in Mumbai, a separate survey of service bar women was conducted there. Given the sensitivities around the prohibition of bar-based solicitation, and the wave of bar closings during the year before the survey, the criterion of selling sex in the past one month was dropped, and the only criterion for participation was being a woman working in a service bar.
In the case of MSM, although the eligibility criterion was men aged 18 years or older who had sex (oral, anal or manual) with another man in the past one month (with some deviations by state), operationally this group represented the subset of men who were accessible at ‘cruising sites’, locations known to attract men in search of male sexual partners. These cruising sites included parks, beaches, public toilets, train stations, and other areas identified with the help of local MSM. The intention was to target the highest risk subset of MSM as the source population, but sampling in this way made it less likely that MSM who found partners only outside of the mapped ‘cruising sites’ (e.g. by internet, phone, newspaper, private party, or other) would be included in the survey. In Tamil Nadu, MSM were defined as only those who had exchanged sex for cash/kind with other men in the past one month, again sampled from ‘high-risk’ venues or ‘cruising sites’, but representing the highest risk subset of MSM targeted by the Avahan programme there. In Karnataka, MSM were defined as men/transgender individuals aged 18 years and over who had anal sex with another man/transgender individual in the past one month, and who were found at known MSM/transgender ‘cruising sites’. Hijras constituted a separate survey population in Tamil Nadu, whereas they were included in the general MSM samples in the districts of Andhra Pradesh and Maharashtra. In Mumbai, hijras were excluded from the MSM survey, the intent being to conduct a separate survey for this population. This survey could not be completed, however, because of difficulties gaining access to the hijra community.
The IBBA for clients of sex workers, according to the operational definition, included only those clients present at mapped sites from the FSW sampling frame, who had bought sex within the past one month. Although this does not represent all clients of sex workers, it does represent a sizeable subset that was feasible to sample, given the challenges of sampling this population at all, i.e. the difficulties in identifying clients, the reluctance of clients to acknowledge that they bought sex, and the objections of sex workers who were concerned about disruption to their business. An added complexity was the need to recruit men before they had sex (for the subset of respondents who had sex on the day of the survey), because of the potential for contamination of urine samples for STI testing if the specimens were collected after sex. More details about the process of piloting and selecting sampling strategies for clients are included elsewhere in this supplement.
The IBBA for the national highway was intended to capture a sample of all LDTD. Operationally, this definition was narrowed to the industry definition of LDTD (which was also the definition used by the lead implementing partner working with truck drivers), i.e. traveling distances of 800 km or more on selected route categories of the national highway. The drivers were selected from a subset of transshipment locations (TSL), at the point of origin of their route, through brokers and transporters (which functioned as sampling sites). The point of origin was chosen because of the organization of trucking routes, which have start and end points at major TSL at either end of the route. Most drivers travelling over 800 km on these particular routes originate at one of these locations (as opposed to shorter distance drivers who may pass through, but not originate at these TSL, or not pass through them at all). The large TSL were hypothesized to have large congregations of FSW, and the separate route categories potentially to have different patterns of risk behaviour. Although the definition was somewhat restrictive, it was chosen (in consultation with Avahan's implementing partner and members of the transport industry) because it would allow sampling from the majority of LDTD on the selected route categories, while also allowing for a well-documented and systematic sampling approach.
For the IBBA with IDU, the inclusion criterion of injecting at least once in the past 6 months cast a very ‘wide net’, and had the effect of including infrequent as well as frequent injectors. This broad definition, combined with the non-venue-based RDS approach, may have resulted in capturing less frequent (i.e. less at-risk) injectors than might have been the case if a venue-based sampling approach had been used, so this must be kept in mind when interpreting findings.
Sample size calculations
As sampling was performed separately for each district, sample sizes were calculated to provide accurate estimates for each survey group at the district level. The target sample size was 400 per group, per district for FSW, MSM, IDU and clients of FSW, and 500 per route category for LDTD. In the case of hijras in Tamil Nadu, there was one sample of 400 for all five IBBA districts combined. The sample sizes were calculated to detect changes in key behavioural determinants between survey rounds within districts, i.e. consistent condom use with all clients in the past one month for FSW, consistent condom use with all non-regular sex partners in the past 3/6 months for MSM, consistent condom use with all commercial partners in the past one month for clients of FSW, and needle sharing at last injection for IDU. The size of 400 allowed for the detection of an absolute difference of 15% or more from the assumed value of 50%, with 95% confidence and 90% power. A design effect of 1.7 was assumed for cluster sampling and 1.5 for RDS, based on the best available information at the time.
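The arithmetic behind a target of this kind can be sketched as follows. This is only an illustration under stated assumptions: it uses a standard two-proportion formula with a two-sided test, which the text does not specify, and the function name is hypothetical. The baseline of 50%, the 15-point change, 95% confidence, 90% power and the design effects of 1.7 and 1.5 are taken from the paragraph above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.90, deff=1.0):
    """Sample size per survey round to detect a change from p1 to p2.

    Assumption (not stated in the paper): two-sided test, standard
    two-proportion formula, inflated by the design effect `deff`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # confidence level
    z_beta = NormalDist().inv_cdf(power)           # power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(deff * numerator / (p2 - p1) ** 2)


# Values from the text: 50% baseline, 15-point absolute change.
n_cluster = sample_size_two_proportions(0.50, 0.65, deff=1.7)  # cluster/TLS
n_rds = sample_size_two_proportions(0.50, 0.65, deff=1.5)      # RDS
print(n_cluster, n_rds)
```

Under these assumptions both results fall below the target of 400 per group, which is consistent with the stated sample size leaving a small margin.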
Conventional cluster sampling, time-location sampling (TLS) and RDS were the methods used to select respondents. TLS and RDS have been used increasingly to obtain probability samples of hidden, hard-to-reach and/or mobile populations, especially in the context of HIV [20,30]. Conventional cluster sampling was used for brothel-based and home-based sex workers. TLS was used for non-brothel-based FSW (with the exception of FSW in Dimapur, Parbhani, and service bar-based FSW in Mumbai), MSM, clients of FSW, and truck drivers. These methods involved district-wide mapping of the sites where population members could be accessed, along with information about the hours of operation and a rough approximation of the number of eligible respondents available at different times of the day on different days of the week for the TLS samples. NGO were enlisted to assist with this process, and their site maps served as the starting point from which the research teams further evolved the list of sites for eventual sampling frame development. This was done through a process of site verification and extended listing. It should be noted that in Karnataka, the process was reversed, that is to say, the KHPT NGO verified the final list of sites and extended the list as appropriate (rather than the research teams doing so as in the other states).
The selection of clusters at the first stage of sampling was generally by probability proportional to size, based on expected measures of size from mapping to increase sampling efficiency, but with the recognition that time–location cluster sizes are not static, so adjustments for actual measures of size at the time of the survey would be factored into the analysis. Fieldwork was planned on the basis of cluster selection (by days of the week and hours of the day) taking into account logistical considerations, the number of available data collection teams, and placement of data collection points. For example, if a selected cluster was ‘weekday 15:00–17:00 hours’, the day of the week for that cluster would be assigned in a way that maximized field efficiency, in terms of travel of the data collection teams and placement of the data collection sites.
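A minimal sketch of first-stage selection by probability proportional to size is given below. It assumes the common systematic variant (random start, fixed interval over cumulative expected sizes); the text does not say which variant was used, and the cluster labels and sizes are hypothetical.

```python
import random


def pps_systematic(expected_sizes, n_clusters, rng=None):
    """Select clusters with probability proportional to expected size.

    `expected_sizes` maps a time-location cluster label to its expected
    number of eligible respondents from mapping. A cluster larger than
    the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(expected_sizes.values())
    interval = total / n_clusters
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n_clusters)]

    selected = []
    cumulative = 0.0
    idx = 0
    for label, size in expected_sizes.items():
        cumulative += size
        # Take every target that falls inside this cluster's size range.
        while idx < n_clusters and targets[idx] < cumulative:
            selected.append(label)
            idx += 1
    return selected


# Hypothetical sampling frame: (site, time slot) -> expected size.
frame = {
    "park, weekday 15:00-17:00": 10,
    "bus stand, weekday 19:00-21:00": 30,
    "market, weekend 19:00-21:00": 60,
}
chosen = pps_systematic(frame, n_clusters=2, rng=random.Random(1))
print(chosen)
```

Because expected sizes only approximate what is found in the field, the actual measure of size observed at each selected cluster still has to be recorded so that the analysis weights can be corrected.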
At the time of fieldwork, survey respondents were selected randomly from among all eligible respondents available during the fixed time interval specified for the selected cluster. This was generally done by making a quick listing at the site, using easily identifiable characteristics such as the colour of clothing. Supervisors were generally responsible for listing, selecting and approaching respondents for recruitment with the help of a community liaison. The community liaison staff was usually a member of the community being surveyed, who was not employed by any of the NGO working with the community, and who could help in establishing rapport with the community. He or she was not eligible to participate in the survey, but played an important role in establishing trust and rapport with respondents. This arrangement differed in Karnataka, where Avahan NGO outreach workers assisted with the listing process (e.g. the identification of respondents to be approached for potential recruitment into the IBBA), because they were seen to add value to the process and especially to increase the comfort level (and likely participation) of the respondents. Each of these approaches (i.e. integrating NGO workers or not) has merits and demerits. The exclusion of NGO workers as part of the survey team (outside Karnataka) was done in an attempt to minimize potential interdependence of the IBBA sample with NGO clientele. This approach may, however, have introduced other biases (e.g. the unwillingness of respondents to be identified or participate in the survey). In the case of Karnataka, the outreach workers were instructed to try to identify all eligible respondents, both those known to them and those not known to them. In fact the survey results did show that not all respondents were clients of NGO, but it is difficult to quantify the extent of potential bias involved with either of these approaches, because the IBBA did not look specifically at this question. 
After obtaining initial consent, the respondents were assigned to an interviewer for the more formal consent process (which generally took place after escorting the respondent to the data collection centre). In some cases, the interviewers remained at the data collection sites and respondents were escorted by either the supervisor or community liaison. The procedure varied depending on the field situation. All necessary information for the calculation of sampling probabilities and design effects was recorded during the fieldwork, so that appropriate weighting and calculation of standard errors could be done at the time of analysis (e.g. estimated and actual measures of size for clusters, number of respondents selected from each cluster, number of non-responders per cluster, etc.).
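How the recorded quantities could combine into an analysis weight is sketched below for a simple two-stage design. This is an assumed, simplified formulation rather than the IBBA analysis procedure: the function and parameter names are hypothetical, and non-response adjustment is left out.

```python
def sampling_weight(n_clusters, expected_size, total_expected_size,
                    actual_size, n_interviewed):
    """Inverse-probability weight for one respondent (two-stage sketch).

    Stage 1: cluster chosen with probability proportional to its
             expected size from mapping.
    Stage 2: respondents chosen at random from the eligible people
             actually present (the actual measure of size).
    """
    p_cluster = n_clusters * expected_size / total_expected_size
    p_person = n_interviewed / actual_size
    return 1.0 / (p_cluster * p_person)


# Hypothetical cluster: 20 clusters drawn from a frame with 600 expected
# respondents; this cluster was expected to hold 15 people, 25 were
# present during the time slot, and 10 were interviewed.
w = sampling_weight(n_clusters=20, expected_size=15,
                    total_expected_size=600, actual_size=25,
                    n_interviewed=10)
print(w)
```

When the actual size equals the expected size and a fixed number is interviewed per cluster, the weights are equal across clusters; departures from the mapped sizes are what make the weights vary.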
For populations in which the presurvey assessment suggested that the population did not congregate at identifiable locations, or the proportion of members accessible at identifiable locations was insufficient to represent the larger group, and that the population was sufficiently networked, RDS was used. This included all IDU groups, one FSW group in Nagaland and three FSW groups in Maharashtra. Important methodological considerations for RDS included seed selection and seed addition, duration of fieldwork, coupon management and tuning, amount of incentives, and the number and location of RDS centres. The number of seeds for each RDS sample in the IBBA ranged between six and eight initially, and went as high as 30–40 by the end of recruitment in some locations in Maharashtra (see Table 4). Seeds were added when recruitment was slow, when new data collection centres were opened, when existing chains were not productive, or when recruitment became unbalanced in terms of the type of respondent. The length of data collection for the RDS surveys ranged between 5 and 12 weeks, depending on the speed of recruitment, with the upper limit being 12 weeks because of time and cost constraints. One RDS survey among service bar-based FSW in Thane (Maharashtra) was stopped after efforts to increase productivity failed after 3 months of fieldwork. Based on the lessons from Thane (i.e. that the incentive offered by the RDS survey was too low to ensure high participation considering the earning potential of the women, and the reluctance of the women to acknowledge selling sex in the past month because of legal sensitivities and stigma), the Mumbai RDS survey among bar-based women was subsequently carried out successfully.
The two major changes that were put in place in Mumbai, which led to a more successful survey with much higher participation, were higher cash incentives (300 INR primary incentive and 100 INR secondary incentive versus 100 and 50, respectively, in the Thane survey), and relaxation of the eligibility criteria so that any woman working in a service bar was eligible to participate. The idea was to reduce the stigma of participation in the survey, by not requiring that the women acknowledge having sold sex in the past one month during the recruitment process, but with the understanding (from the NGO), that most women working in service bars do sell sex.
In terms of the representativeness of the RDS samples, RDS operates under the theory that recruitment chains, if allowed to run for a sufficiently long time, will attain a level of equilibrium that, once reached, will allow for a weighted analysis to be performed, to obtain valid estimates for the source population. When time is limited, the quickest way to make sure the sample reaches equilibrium is to use a priori knowledge about the network structures of the source population to guide the selection of a group of diverse seeds that will hasten the equilibrium process. Obtaining a group of seeds that would unequivocally represent the different factions of the IDU or sex worker networks would have required a study of IDU and FSW networks that was more comprehensive than what was done during the presurvey assessment. Seed selection did, however, take into account such factors as differences in geographical area, ethnic group, language, NGO involvement, occupation, length of drug use (for IDU) and type of sex worker (for FSW). Calculation of post-hoc selection probabilities allowed for the weighting and calculation of sampling errors and confidence intervals during the analysis, using the RDSAT analysis tool, version 5.6 (which uses data on cross-recruitment patterns and network sizes recorded through the RDS coupon system, and as part of the RDS questionnaire).
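The notion of equilibrium can be illustrated by treating cross-recruitment between respondent types as a Markov chain and iterating until the sample composition stops changing. The sketch below is not the RDSAT procedure (which also uses reported network sizes); the group labels and recruitment proportions are hypothetical.

```python
def equilibrium(transition, start=None, tol=1e-10, max_iter=10_000):
    """Equilibrium sample composition implied by recruitment patterns.

    `transition[a][b]` is the proportion of recruits of type b among
    people recruited by type a (each row sums to 1). `start` is the
    composition of the seeds; it defaults to equal shares.
    """
    groups = list(transition)
    dist = dict(start) if start else {g: 1.0 / len(groups) for g in groups}
    for _ in range(max_iter):
        new = {
            g: sum(dist[h] * transition[h][g] for h in groups)
            for g in groups
        }
        if max(abs(new[g] - dist[g]) for g in groups) < tol:
            return new
        dist = new
    return dist


# Hypothetical cross-recruitment proportions among IDU.
recruitment = {
    "frequent": {"frequent": 0.7, "infrequent": 0.3},
    "infrequent": {"frequent": 0.4, "infrequent": 0.6},
}
eq_default = equilibrium(recruitment)
# Same target even if every seed is a frequent injector.
eq_skewed = equilibrium(recruitment,
                        start={"frequent": 1.0, "infrequent": 0.0})
print(eq_default, eq_skewed)
```

In this toy example both starting compositions converge to the same shares, which is the property that makes the final sample composition independent of the seeds; a diverse set of seeds simply shortens the number of waves needed to get there.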
Community preparation was an integral part of the implementation of the IBBA and was carried out with the intent of understanding and addressing the concerns of stakeholders, gatekeepers and community members. The approach to community preparation arose from the general principle of community ownership promoted as a core Avahan value in programme implementation. Independent community advisory boards (CAB) and community monitoring boards (CMB) were formed for each survey group in each district. Typical CAB members included heads of NGO and NGO advisors, brothel owners, bar owners, community leaders and community members (of the group being surveyed), employers, police, and others. The CAB was convened in advance of the survey and provided a forum for local input to the survey teams on how to avoid adverse events. During survey implementation, the CAB helped address problems as and when they arose, in a community sensitive manner. The CMB comprised members of the communities being surveyed who visited the survey sites and reported any complaints, concerns, or problems to the CAB.
Whereas the primary purpose of the CAB and CMB was to protect the interests of the communities, the process of community preparation also played a role in quality assurance by helping remove obstacles to participation. The CAB and CMB functioned as a show of good faith to the community, which was presumed to increase trust and comfort levels without compromising the independence of the survey.
Ethical issues and consent process
Given the sensitivities surrounding the type of data collected in the IBBA, many measures were put in place to ensure the privacy, safety and protection of the respondents. Ethical clearances were obtained from FHI's Protection of Human Subjects Committee, and from the ethical committees of each participating ICMR institute. A comprehensive informed consent process was developed that allowed respondents to become fully informed and have all questions answered before agreeing to participate in the survey. Respondents were allowed to consent to the behavioural portion of the survey and opt out of the biological portion (with the exception of the RDS surveys) and, once the survey process began, they could discontinue at any time. For the RDS surveys, only respondents agreeing to both the biological and behavioural components of the survey were given coupons for recruitment in order not to break recruitment chains (with respect to the biological variables). Written consent was required before the interview process could begin. Other protective measures included oaths of confidentiality by all survey staff, and the development of harm minimization guidelines and specimen and data safety guidelines.
The opportunity to consult with a physician and receive an STI examination, syphilis results and treatment were the major benefits for individual respondents. The survey staff also provided referrals to STI clinics and voluntary testing and counselling (VCT) sites. Money was provided to most respondents as compensation for their time, and to cover transportation costs for obtaining syphilis results and follow-up care. ‘Gifts’ (compensation in kind instead of money) could, however, also be provided if more suitable (as suggested by the NGO or community). The incentive amounts were generally equivalent to a day's worth of lost wages for the respondent.
Laboratory issues and quality control
The handling of clinical specimens from the survey took place at several levels. At the data collection sites, blood, urine and genital ulcer swab specimens were collected, maintained at 4°C, and transported daily to district laboratories, where the serum was separated and aliquoted into three vials. Both urine and sera were stored at 4°C (assured with back-up generators) until being transported weekly to the state laboratories, which were located at each of the state ICMR institutes. Quantitative syphilis serology (rapid plasma reagin) was performed at the district laboratories and confirmatory testing (Treponema pallidum haemagglutination assay; TPHA) was performed on all reactive specimens at the state laboratory. After testing, syphilis results were returned to the research team for appropriate distribution to locally participating government and NGO clinics, which served as referral centres for IBBA respondents. All other laboratory tests were done at the state laboratories except for the BED capture enzyme immunoassay (EIA) HIV incidence assays (Calypte HIV 1 BED incidence EIA; Calypte Biomedical Corporation, Maryland, USA), and multiplex polymerase chain reaction for genital ulcers (chancroid, syphilis, HSV-2), which were performed at the central laboratory at NARI. All quality control testing was done at NARI. The tests at the state laboratories included TPHA, HSV-2 (antibody EIA), HIV (antibody EIA), hepatitis B (surface antigen EIA), hepatitis C (antibody EIA, RIBA) and N. gonorrhoeae and C. trachomatis (APTIMA nucleic acid amplification). Table 5 lists the laboratory test manufacturers. Randomly selected quality control aliquots of sera and urine were stored at −20°C at the state laboratories while awaiting transport to NARI, and subsequent storage at NARI was also at −20°C. Monitoring and checks were instituted to ensure that samples were transported in appropriate conditions.
The proficiency of the laboratories was monitored using a structured quality assessment scheme and supportive supervision, with each laboratory being monitored by the laboratory level above. All quality control tests were performed at NARI on 10% of randomly selected sera, and on all N. gonorrhoeae/C. trachomatis positive urines plus 5% of randomly selected negative urines. A flow diagram of the entire laboratory component can be seen in Fig. 3.
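The quality control sampling rule described above (10% of sera, all positive urines and 5% of negative urines) can be sketched as follows. This is only an illustration: the specimen identifiers, the function names and the choice to round sample sizes up are assumptions, and the source does not describe how the random selection was actually carried out.

```python
import random


def ceil_div(n, d):
    """Integer ceiling of n / d, avoiding floating-point rounding."""
    return -(-n // d)


def select_qc_specimens(sera_ids, urine_results, seed=None):
    """Pick specimens for retesting at the central laboratory.

    sera_ids: iterable of serum specimen identifiers.
    urine_results: dict mapping urine specimen id -> True if the
        N. gonorrhoeae/C. trachomatis test was positive.
    Returns (qc_sera, qc_urines).
    """
    rng = random.Random(seed)
    sera = sorted(sera_ids)
    qc_sera = rng.sample(sera, ceil_div(len(sera), 10))        # 10% of sera
    positives = sorted(s for s, pos in urine_results.items() if pos)
    negatives = sorted(s for s, pos in urine_results.items() if not pos)
    qc_negatives = rng.sample(negatives, ceil_div(len(negatives), 20))  # 5%
    return qc_sera, positives + qc_negatives                   # all positives


# Hypothetical batch: 200 sera, 100 urines of which 7 tested positive.
sera = [f"S{i:04d}" for i in range(200)]
urines = {f"U{i:04d}": i < 7 for i in range(100)}
qc_sera, qc_urines = select_qc_specimens(sera, urines, seed=1)
```

With this hypothetical batch, 20 sera and 12 urines (7 positive plus 5 of the 93 negative) are selected for retesting.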
Because this was a multisite survey carried out by multiple research institutions, generic versions of the protocols and questionnaires for each survey group were developed by a central team from ICMR and FHI with input from the Bill and Melinda Gates Foundation, Delhi, KHPT and the Avahan evaluation modelling team, and relying in large part on the results of the presurvey assessment. Field guidelines and training materials for both the behavioural and biological components were also developed by the central team. These were later adapted within each state for each survey group by the research agencies, who were responsible for translating, pretesting and revising the questionnaires with oversight by the responsible ICMR institute. In general, fieldwork proceeded group-wise within states, with FSW surveys being completed first, followed by MSM surveys and then client surveys. IDU and LDTD surveys were conducted in different locations and were carried out on separate timelines. Research agencies were responsible for hiring, training, and managing the field teams, which usually consisted of a supervisor, three to four interviewers, a laboratory technician, a doctor and a community liaison. This structure varied slightly in the case of the RDS surveys, in which a screener and a coupon manager were added to the teams. Sampling and recruitment for TLS surveys were generally handled by the team supervisors with help from the community liaison in identifying eligible respondents (see section on sampling). Once recruited, respondents underwent the following steps: informed consent, interview, collection of biological specimens (blood, urine and genital ulcer swabs if ulcers were present), voluntary examination by a doctor for clinical assessment and syndromic STI management (following Avahan STI treatment guidelines whenever possible), and STI and VCT referral. 
Syndromic treatment was provided on-site at the time of the survey and syphilis results and treatment were made available to respondents approximately 7–10 days after their participation in the survey at locally participating government or NGO clinics. This was done through a referral card system involving preprinted labels (with unique study numbers) that would link the participant to the appropriate syphilis test result. No names were recorded. Visits to VCT centres and the collection of HIV test results were facilitated for those respondents desiring assistance.
Data collection sites for TLS were set up in reasonable proximity to where sample recruitment was taking place (usually within 0.5 km), and consenting respondents were escorted to the sites by the survey team. Survey sites were equipped with secluded spaces for interviews, as well as for specimen collection and physical examinations. All sites were equipped with running water and electricity as well as toilet facilities. Often the data collection sites for FSW were in the brothels and lodges where sex took place. When this arrangement was not possible, the research agencies rented temporary space in nearby locations. For RDS surveys, data collection centres were placed in locations that would maximize convenience and ease of access for potential respondents.
Other quality control measures
Given the scale of the IBBA and the large number of partners involved, it was necessary to enact an elaborate system of supervision and monitoring with scheduled and unscheduled visits from all levels (FHI, ICMR institutes and research agency professionals). In the field, there was a system of checklists to make sure that all required tasks were completed on a daily basis. All interview schedules were checked by supervisors intermittently throughout the day and each night. Labels with unique study numbers for linking consent forms, questionnaires, biological specimens, etc. were preprinted. Logs for recording pick-up and delivery times as well as arrival temperature for all biological specimens were also kept. Doctors on each team oversaw the laboratory technicians and were responsible for all clinical and laboratory aspects of the survey. To document the field experience of each survey group in each district, team supervisors completed standardized process documents to describe local deviations from the generic protocol, the occurrence of adverse events, or other external events that may have impacted the survey results.
Data entry, data management and programmes used for analysis
All data were entered twice using CSPro (version 3.1), first by the research agency and then by the state-level ICMR institute. Double data entry for Karnataka was carried out by KHPT. Data reconciliation and initial cleaning were done by the ICMR institutes. A central level data management group was established at the National Institute of Epidemiology, which compiled the raw data from all the districts and prepared it for further analysis, including additional data cleaning, merging of behavioural and biological data, development of programmes to recode data, and calculation of standardized weights for use in weighted analysis. In the case of Karnataka, data reconciliation, cleaning and merging were done by KHPT. Preliminary analysis for TLS data was done with the Statistical Package for Social Sciences version 14, using the complex sampling module for cluster analysis. RDSAT (version 5.6.0) was used to analyse RDS data.
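The reconciliation step that follows double data entry amounts to a field-by-field comparison of the two passes. The sketch below shows the idea in plain Python. It does not reproduce CSPro functionality, and the record identifiers and field names are hypothetical.

```python
def compare_double_entry(first_pass, second_pass):
    """List discrepancies between two independent data-entry passes.

    Each argument maps a record id (e.g. a study number) to a dict of
    field -> value. Returns tuples (record_id, field, first, second);
    a record missing from one pass is reported with field None.
    """
    discrepancies = []
    for rec_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(rec_id)
        b = second_pass.get(rec_id)
        if a is None or b is None:
            discrepancies.append((rec_id, None, a, b))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((rec_id, field, a.get(field), b.get(field)))
    return discrepancies


# Hypothetical entries keyed by study number.
agency_entry = {
    "A001": {"age": 27, "condom_last_sex": "yes"},
    "A002": {"age": 34, "condom_last_sex": "no"},
}
institute_entry = {
    "A001": {"age": 27, "condom_last_sex": "yes"},
    "A002": {"age": 43, "condom_last_sex": "no"},   # transposed digits
    "A003": {"age": 19, "condom_last_sex": "yes"},  # missing from first pass
}
mismatches = compare_double_entry(agency_entry, institute_entry)
```

Each reported discrepancy would then be resolved against the paper questionnaire before the dataset is passed on for cleaning and merging.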
The results of the IBBA are presented elsewhere in this supplement [29,33–38]. The discussion here pertains to the methodological challenges involved in implementing the survey and using the data.
Large network of participating institutions and agencies
The sheer scale of the IBBA, a multisite undertaking implemented across diverse geographical, cultural and linguistic settings, under diverse epidemic conditions, with highly mobile and socially vulnerable populations, and under the management of a range of collaborating institutions, made its implementation extremely challenging. The Avahan strategy of community involvement in all aspects of the programme including monitoring and evaluation, and the mandate for an independent evaluation of which the IBBA was a major component, presented a potential conflict of interest that had to be managed throughout the process. As in any survey, the research teams had first to overcome the natural distrust of communities to outsiders implementing the survey. Clarifying the role of the community and NGO in an independent survey required numerous discussions and meetings to arrive at a mutually satisfactory level of NGO and community ownership. The CAB were instrumental in this process.
Conducting the biological component of the survey without the benefit of a facility-based setting involved extensive laboratory and cold chain management. Transport, storage and testing of specimens required skilled manpower and involved the training and retraining of staff. Other operational issues included the assurance of an uninterrupted power supply to maintain the cold chain in settings with little infrastructure and substantial effort to procure, install and maintain instruments. The logistical challenges were especially difficult in the north-eastern states, as a result of the mountainous terrain, limited communication facilities and an unstable political environment.
Standardization and deviations from the protocol
One of the most difficult challenges of implementation was developing a set of survey protocols, guidelines and tools that was sufficiently standardized to alleviate central level quality concerns, while still being tailored enough to fit diverse local conditions. The decision to evaluate at the district level, which was built into the design of the IBBA, sometimes necessitated deviations from the standard protocol. These deviations were, however, more the exception than the rule, and the increased degree of internal validity and programme relevance afforded by these limited modifications was one of the strengths of the survey. The staggered nature of implementation in the first round also provided an opportunity to correct mistakes and incorporate lessons learned as the survey evolved.
Examples of these deviations included differences in the definition and eligibility criteria for high-risk MSM across states, minor changes to generic questionnaires after translation and pretesting in each state, the decision not to give cash incentives for some groups in some states, the decision to override the requirement to cover the entire district when it was not logistically feasible or practical, downsizing of questionnaires in the case of clients, and an increase in compensation and eligibility criteria for Mumbai bar-based sex workers. One area in which increased customization would have been very helpful was the intervention/exposure section of the questionnaire. This part of the questionnaire was critical for quantifying exposure to HIV prevention interventions (Avahan and non-Avahan), and also for obtaining NGO multipliers to be used for size estimation. The IBBA provided a unique opportunity to use the multiplier method to obtain size estimates, but only if the instruments were sufficiently tailored, and the IBBA samples sufficiently independent of the NGO to avoid biases that could result in over/underestimation of population sizes. This meant customizing the instruments to be able to measure exposure to specific services offered by specific programmes in different districts. For example, when the multiplier was to be based on the number of persons who had visited a particular clinic for STI services, or the number of persons registered with a particular programme, it was critical to know exactly how the programme recorded clinic visits by individuals, and how they defined registration, so that an appropriately matched question could be included in the IBBA. The same set of information was also needed to measure programme coverage accurately. This was therefore an area that required more flexibility in the instruments than was achieved during the first round.
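The arithmetic behind the multiplier method can be made concrete with a short sketch. In its simplest form, the population size is estimated as the programme count divided by the proportion of survey respondents who report being included in that count. The function name and all figures below are hypothetical and serve only to show why a mismatch between the survey question and the programme's own record-keeping biases the estimate.

```python
def multiplier_estimate(programme_count, survey_reporting, survey_total):
    """Basic multiplier size estimate: N = M / P.

    programme_count (M): unique individuals recorded by the programme,
        e.g. persons who visited a particular clinic in a reference period.
    survey_reporting / survey_total (P): proportion of survey respondents
        who report that same exposure.
    """
    if survey_total <= 0 or survey_reporting <= 0:
        raise ValueError("need a positive number of respondents and reports")
    proportion = survey_reporting / survey_total
    return programme_count / proportion


# Hypothetical district: the clinic recorded 1200 unique visitors and
# 150 of 400 respondents report a visit matching the clinic's definition.
matched = multiplier_estimate(1200, 150, 400)

# If a loosely worded question also captures visits to other clinics,
# more respondents answer "yes" and the estimate shrinks.
mismatched = multiplier_estimate(1200, 200, 400)
```

Here the matched question gives an estimate of 3200, whereas the loosely worded question gives 2400, an underestimate caused purely by the wording of the instrument.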
Data use challenges
Assessing adherence to protocols
The intentional deviations from the generic protocol discussed above were meant to improve the validity and utility of the IBBA within the districts. There were, however, other unintentional deviations that occurred when unexpected difficulties were encountered in the field. This was particularly true in the area of sampling, in which problems such as incomplete sampling frames, inability to follow sampling protocols as written, and challenges in identifying and verifying eligible respondents could have produced unknown biases in the results. As in all surveys, diligent supervision and monitoring were maintained to keep such unknown biases to a minimum, but avoiding all such problems is virtually impossible.
An important aspect of the IBBA was the documentation of implementation for each group in each district. The field teams were responsible for recording information on specific aspects of how the protocol was implemented, such as the final eligibility criteria, the precise boundaries of the survey universe, details of the sampling plan (method for selecting primary sampling units and respondents), and descriptive information on events taking place during the field operation, particularly those that might have affected participation in the survey. The team also recorded information about the proceedings of the community advisory board to explain concerns that arose during survey implementation, and subsequent actions taken by the survey teams to resolve any problems.
The main objective of the survey was to track changes in proximate determinants at the district level. Sample sizes were calculated to allow for precise estimates and measures of change in selected proximate determinants within districts. This was done keeping in mind the variation in programmes as well as in epidemic patterns across districts, which in some cases is quite large. Aggregation of data across districts and/or across states can be performed to gain statistical power, or to inform the programme at a macro level. If data are aggregated across districts, such aggregation is best limited to districts in similar epidemic phases. Inferences can be made to IBBA districts for the state, if done with appropriate weighting and with strong caveats about any differences that may be masked by the aggregation. Inferences to non-IBBA districts (either Avahan or non-Avahan) would be subject to potential bias because districts were not randomly selected.
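To illustrate why weighting matters when district results are aggregated, the sketch below pools district-level prevalence estimates using weights proportional to each district's estimated population size. The choice of weights, the function name and the numbers are assumptions made for illustration. The source does not specify the weighting scheme used for aggregation.

```python
def pooled_prevalence(districts):
    """Weighted average of district prevalence estimates.

    districts: list of (prevalence, weight) pairs, where weight is, for
    example, the estimated size of the surveyed population in the district.
    """
    total_weight = sum(weight for _, weight in districts)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(prev * weight for prev, weight in districts) / total_weight


# Hypothetical districts: (HIV prevalence, estimated population size).
example = [(0.10, 1000), (0.20, 1000), (0.30, 3000)]

weighted = pooled_prevalence(example)
unweighted = sum(prev for prev, _ in example) / len(example)
```

In this example the weighted figure is 0.24 and the simple average is 0.20. Neither number shows that individual districts range from 0.10 to 0.30, which is the kind of masked difference the caveats above refer to.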
There is scope for exploring relationships between variables in a single round (i.e. understanding factors that are associated with proximate determinants), through multivariate or multilevel modelling. It may be difficult, however, to interpret the relationships in a cross-sectional survey because of the unknown ‘direction of causality’, when linking biological outcomes and behavioural risk factors. The IBBA was designed to measure changes in proximate determinants over time at the community level, rather than to provide insights into the factors responsible for those changes (e.g. underlying determinants) in individuals. Although exploring those factors is very important, it requires a model describing the specific relationships between underlying and proximate determinants, and developing instruments designed to examine those relationships more specifically [16,39]. On the other hand, the IBBA is well equipped to measure levels of proximate determinants over time, and relate them to programme activities, as measured through process data, and corroborated by information from the exposure section of the IBBA. It is also well equipped to examine trend information on HIV prevalence and the behaviours that spread it (proximate determinants) to build up an informative picture of changes over time, and the factors that influence these changes, using all available data sources [40–42].
The IBBA constitutes a very rich source of data for characterizing the epidemic in India and evaluating the Avahan programmatic response over time. Even with its caveats, the IBBA still provides a picture of the HIV situation among populations that are so critical to the curtailment of HIV spread in India, that it is necessary to obtain data about them, even if it is imperfect. Avahan views itself as a ‘living’ programme whose evaluation activities are also living. In this spirit, the second round of the IBBA will probably include more specific measurements related to the programme's conceptual framework, thereby enhancing its evaluative capabilities to look at factors responsible for observed changes in risk behaviour.
Sponsorship: Support for this study was provided by the Bill and Melinda Gates Foundation by a grant to Family Health International.
The views expressed herein are those of the authors and do not necessarily reflect the official policy or position of the Bill & Melinda Gates Foundation.
Conflicts of interest: None.
1. National Institute of Health and Family Welfare, National AIDS Control Organization. Annual HIV sentinel surveillance country report. New Delhi: Ministry of Health and Family Welfare; 2006.
2. International Institute for Population Sciences, MACRO International. National family health survey (NFHS-3). Vol. 1. India: International Institute for Population Sciences, MACRO International; 2005–2006.
3. Commission on AIDS in Asia. Redefining AIDS in Asia: crafting an effective response. New Delhi: Oxford University Press; 2008.
5. Chandrasekaran P, Dallabetta G, Loo V, Rao S, Gayle H, Alexander A. Containing HIV/AIDS in India: the unfinished agenda. Lancet Infect Dis 2006; 6:508–521.
6. Bill & Melinda Gates Foundation. Avahan – the India AIDS initiative: the business of HIV prevention at scale. New Delhi, India: Bill & Melinda Gates Foundation; 2008.
7. Boily MC, Lowndes C, Vickerman P. Evaluating large-scale HIV prevention interventions: study design for an integrated mathematical modeling approach. Sex Transm Infect 2007; 83:582–589.
8. Williams JR, Foss AM, Vickerman P. What is the achievable effectiveness of the India AIDS initiative intervention among female sex workers under target coverage? Model projections from Southern India. Sex Transm Infect 2006; 82:372–380.
9. Chandrasekaran P, Dallabetta G, Loo V, Mills S, Saidel T, Adhikary R, et al. Evaluation design for large-scale HIV prevention programmes: the case of Avahan, the India AIDS initiative. AIDS 2008; 22(Suppl. 5):S1–S15.
10. National AIDS Control Organization. National baseline high-risk and bridge population behavioural surveillance survey. Part I: Female sex workers and their clients. New Delhi, India: NACO; 2001.
11. National AIDS Control Organization. National baseline high-risk and bridge population behavioural surveillance survey. Part II: Men who have sex with men and injecting drug users. New Delhi, India: NACO; 2001; 101.
12. National AIDS Control Organization. National behavioural surveillance survey (BSS): Female sex workers (FSWs) and their clients. New Delhi, India: NACO; 2006.
13. National AIDS Control Organization. National behavioural surveillance survey (BSS): Men who have sex with men and injecting drug users. New Delhi, India: NACO; 2006.
14. Family Health International. Repeated behavioral surveillance surveys: guidelines for repeated behavioral surveys in populations at risk of HIV. New Delhi, India: Family Health International; 2000.
15. National AIDS Control Organization. Operational guidelines for HIV sentinel surveillance. New Delhi, India: NACO; 2007.
16. Boerma JT, Weir S. Integrating demographic and epidemiological approaches to research on HIV/AIDS: the proximate determinants framework. J Infect Dis 2005; 191(Suppl. 1):S61–S67.
17. Joint United Nations Program on HIV and AIDS, World Health Organization. Second generation surveillance for HIV and AIDS: the next decade. Geneva: WHO Library Cataloguing-in-Publication Data; 2000.
18. Joint United Nations Program on HIV and AIDS, World Health Organization. Initiating second generation surveillance HIV surveillance systems: practical guidelines. Geneva: WHO Library Cataloguing-in-Publication Data; 2002.
19. Joint United Nations Program on HIV and AIDS, World Health Organization. The pre-surveillance assessment: guidelines for planning serosurveillance of HIV, prevalence of sexually transmitted infections and the behavioral components of second generation surveillance. Geneva: WHO Library Cataloguing-in-Publication Data; 2005.
20. Zaba B, Slaymaker E, Urassa M, Boerma JT. The role of behavioral data in HIV surveillance. AIDS 2005; 19(Suppl. 2):S39–S52.
21. Wasserheit JN. Epidemiological synergy. Interrelationships between human immunodeficiency virus infection and other sexually transmitted diseases. Sex Transm Dis 1992; 19:61–77.
22. Laga M, Manoka A, Kivuvu M. Non-ulcerative sexually transmitted diseases as risk factors for HIV-1 transmission in women: results from a cohort study [see Comments]. AIDS 1993; 7:95–102.
23. Garnett GP, Anderson RM. Strategies for limiting the spread of HIV in developing countries: conclusions based on studies of the transmission dynamics of the virus. J Acquir Immune Defic Syndr Human Retrovirol 1995; 9:500–513.
24. Parekh BS, Kennedy MS, Dobbs T. Quantitative detection of increasing HIV type 1 antibodies after seroconversion: A simple assay for detecting recent HIV infection and estimating incidence. AIDS Res Human Retroviruses 2002; 18:295–307.
25. UNAIDS Reference Group on Estimates Modelling and Projections. Statement on the use of the BED-assay for the estimation of HIV-1 incidence for surveillance or epidemic monitoring. Geneva: UNAIDS; 2006.
27. Singh KS. Conceptual framework and methodology. In: People of India series, 1993–1998. Anthropological survey of India. Calcutta, India: Anthropological Survey of India, 1992 (72 vols) (National Series Vol. 1); pp. 29–33.
28. Blankenship K, and the Parivatan Team. Results of a cross-sectional survey of female sex workers in Rajahmundry, Andhra Pradesh. Center for Interdisciplinary Research on AIDS; New Haven: Yale, 2007.
29. Subramanian T, Gupte MD, Paranjape RS, Brahmam GNV, Ramakrishnan L, Adhikary R, et al. HIV, sexually transmitted infections and sexual behaviour of male clients of female sex workers in Andhra Pradesh, Tamil Nadu and Maharashtra, India: results of a cross-sectional survey. AIDS 2008; 22(Suppl. 5):S69–S79.
30. Magnani R, Sabin K, Saidel T, Heckathorn D. Review of sampling hard-to-reach and hidden populations for HIV surveillance. AIDS 2005; 19(Suppl. 2):S67–S72.
31. Heckathorn D. RDS II: deriving valid population estimates from chain-referral samples of hidden populations. Social Problems 2002; 49:11–34.
33. Pandey A, Benara SK, Roy N, Sahu D, Thomas M, Joshi DK, et al. Risk behaviour, sexually transmitted infections and HIV among long-distance truck drivers: a cross-sectional survey along national highways in India. AIDS 2008; 22(Suppl. 5):S81–S90.
34. Ramesh BM, Moses S, Washington R, Isaac S, Mohapatra B, Sangameshwar BM, et al. Determinants of HIV prevalence among female sex workers in four south Indian states: analysis of cross-sectional surveys in 23 districts. AIDS 2008; 22(Suppl. 5):S35–S44.
35. Mahanta J, Medhi GK, Paranjape RS, Kholi A, Roy N, Akoijam B, et al. Injecting and sexual risk behaviours, sexually transmitted infections and HIV prevalence in injecting drug users in three states in India. AIDS 2008; 22(Suppl. 5):S59–S68.
36. Reza-Paul S, Beattie T, Syed HR, Venukumar KT, Venugopal MS, Fathima M, et al. Declines in risk behaviour and sexually transmitted infection prevalence following a community-led HIV preventive intervention among female sex workers in Mysore, India. AIDS 2008; 22(Suppl. 5):S91–S100.
37. Brahmam GNV, Kodavalla V, Rajkumar H, Rachakulla HK, Kallam S, Myakala SP, et al. Sexual practices, HIV and sexually transmitted infections among self-identified men who have sex with men in four high HIV prevalence states of India. AIDS 2008; 22(Suppl. 5):S45–S57.
38. Vadivoo S, Gupte MD, Adhikary R, Kohli A, Kangusamy B, Joshua V, et al. Appropriateness and execution challenges of three formal size estimation methods for high-risk populations in India. AIDS 2008; 22(Suppl. 5):S137–S148.
39. Diez-Roux A. Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am J Public Health 1998; 88:216–221.
40. Mills S, Saidel T, Magnani R, Brown T. Surveillance and modelling of HIV, STI, and risk behaviors in concentrated HIV epidemics. Sex Transm Infect 2004; 80:ii57–ii62.
41. Mills S. Back to behavior: prevention priorities in countries with low HIV prevalence. AIDS 2000; 14(Suppl. 3):S267–S273.
42. Rehle T, Lazarri S, Dallabetta G. Second-generation HIV surveillance: better data for decision-making. Bull WHO 2004; 82:121–127.
© 2008 Lippincott Williams & Wilkins, Inc.