The HIV-1 epidemic in China was initially introduced by means of injection drug users (IDUs) in 1989.1,2 Guangxi province did not report any cases of HIV infection until 1996.3 Since then, HIV prevalence has been increasing dramatically among IDUs within the province.3-6 Two separate epidemics of HIV-1 have converged in the southern province of Guangxi following the heroin traffic routes from the west through Yunnan province and from the south through Vietnam.5,7 In the city of Pingxiang, IDUs are infected with subtype E (CRF01_AE), and reports from Vietnam show a close relationship between these viruses and the increase of HIV-1 prevalence in IDU populations.5,8-10 Conversely, in the central cities of Nanning and Binyang and the western cities of Baise, Tianyang, and Tiandong, IDUs were infected predominantly with CRF08_BC.6,11,12
To gain a greater understanding of the evolution of the HIV-1 epidemic in Guangxi, we examined the virologic and epidemiologic data in IDUs in Binyang and Pingxiang, major urban areas along 2 separate drug routes in the province. Virologic markers included viral load, subtype, and intersubject diversity. Portions of the gag, pol, and env genes were amplified to look for possible recombinants and provide regions of low and high variation within the virus to aid in making phylogenetic inferences. These molecular epidemiology data provided an opportunity to describe how the virus was introduced and the direction of its dissemination among the IDUs in Guangxi.
MATERIALS AND METHODS
Subjects were recruited from local heroin detoxification centers in Binyang and Pingxiang (n = 317 and n = 265, respectively) and were enrolled into a longitudinal cohort between January 2000 (Binyang) and September 1999 (Pingxiang). Enrolled participants received pretest counseling, signed an informed consent form, and were interviewed before phlebotomy. Demographic and behavioral information was obtained during the interview. Blood samples were obtained from drug users between September (Binyang) and October (Pingxiang) of the year 2000 for serologic tests and were stored at −70°C. Subjects who seroconverted between an earlier visit (April 2000 for Pingxiang and January 2000 for Binyang) and the visit investigated here were classified as “recently infected.” This study was approved by institutional review boards in China and the United States.
HIV-1 Serology and Viral Load Testing
All samples were tested for HIV antibody by enzyme-linked immunosorbent assay (ELISA; Organon Teknika, Boxtel, The Netherlands). All HIV-1 ELISA-positive samples were confirmed using an HIV-1/2 Western blot immune assay (Gene Laboratory, Singapore). All available samples from seropositive subjects were tested for HIV-1 viral loads by the Amplicor HIV-1 monitor test, version 1.5 (Roche Diagnostics Corporation, Indianapolis IN) according to manufacturer's protocol.13
Hepatitis B and C Serology
Positivity for hepatitis B surface antigen (HBsAg) and antibody to hepatitis B surface antigen (HBsAb) was determined by a hepatitis B virus (HBV) ELISA (Xiamen Xinchung Scientific, Xiamen, China). Hepatitis C antibody was analyzed using the hepatitis C virus (HCV) ELISA Test System, version 3.0 (Ortho Diagnostic Systems, Raritan, NJ).
RNA Extraction, Purification, and Nested Reverse Transcriptase Polymerase Chain Reaction of gag-pol and env Regions
Total RNA was extracted from 30 μL of serum using the QIAamp Viral RNA Mini Kit (Qiagen, Valencia, CA) according to the manufacturer's protocol, with the following modifications. The serum sample was brought to a volume of 140 μL with the addition of 110 μL of phosphate-buffered saline (PBS) before its application to the column. The final elution was performed with 50 μL of diethyl pyrocarbonate-treated water in a tube that contained 1 μL of 100 U of RNase inhibitor (RNasin; Promega, Madison, WI). A reverse transcriptase polymerase chain reaction (RT-PCR) was performed using the Qiagen One Step RT-PCR system according to the manufacturer's protocol, with a 50-μL final mixture containing 10 μL of the purified RNA. The primers for gag-pol, GAG-POL-F1 5′-GTCCAAAATGCRAAYCCAGA-3′ (nt 1756-1775) and GAG-POL-R1 5′-TGGAGYTCATAHCCCATCCA-3′ (nt 3234-3253), or for env, C2V5-F1 5′-CTCCAGCTGGTTWTGCRATT-3′ (nt 6880-6899) and C2V5-R1 5′-GCCTGTACCGTCAGCGTTAT-3′ (nt 7827-7846), were used in separate reactions with the same conditions: 50°C for 30 minutes, 95°C for 15 minutes, and 25 cycles at 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1.5 minutes. Five microliters of the first-round PCR product was used in the second-round reaction in a 50-μL reaction mixture containing 2 mM of MgCl2, 0.4 mM of deoxyribonucleoside triphosphate, 1 × PCR buffer, 2.5 U of HotStar Taq DNA polymerase (Qiagen), and 0.8 μM of internested primers GAG-POL-F2 5′-ACAGCATGTCAGGGAGTGG-3′ (nt 1831-1849) and GAG-POL-R2 5′-ATTGCTGGTGATCCTTTCCA-3′ (nt 3006-3025) for the gag-pol region or C2-V5-F2(inter) 5′-CAGCTGGTTWTGCGATTCTAA-3′ (nt 6883-6903) and C2-V5-R2(inter) 5′-RTYYCCTCCTCCAGGTCTGA-3′ (nt 7627-7646) for the env product. Cycling conditions were 94°C for 15 minutes, followed by 35 cycles at 94°C for 30 seconds, 61.6°C (annealing temperature for primers GAG-POL-F2/R2) or 64°C (annealing temperature for primers C2-V5-F2/R2) for 30 seconds, and 72°C for 1.5 minutes.
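As an illustrative check that is not part of the original protocol, the first-round primers listed above can be tested for internal consistency: each primer length should equal the span of its stated HXB2 coordinates, and the IUPAC ambiguity codes (R, Y, W, H) determine how many distinct oligonucleotides each degenerate primer represents. The following Python sketch assumes the sequences exactly as printed (with the line-break hyphen in C2V5-F1 removed); the function names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# First-round primers as printed in the text: (sequence, HXB2 start, HXB2 end).
PRIMERS = {
    "GAG-POL-F1": ("GTCCAAAATGCRAAYCCAGA", 1756, 1775),
    "GAG-POL-R1": ("TGGAGYTCATAHCCCATCCA", 3234, 3253),
    "C2V5-F1": ("CTCCAGCTGGTTWTGCRATT", 6880, 6899),
    "C2V5-R1": ("GCCTGTACCGTCAGCGTTAT", 7827, 7846),
}


def expand(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def degeneracy(primer):
    """Number of distinct oligonucleotides a degenerate primer represents."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def length_matches(sequence, start, end):
    """True when the primer length equals the inclusive coordinate span."""
    return len(sequence) == end - start + 1


if __name__ == "__main__":
    for name, (sequence, start, end) in PRIMERS.items():
        print(name, len(sequence), degeneracy(sequence),
              length_matches(sequence, start, end))
```

A check of this kind only confirms that the printed lengths and coordinates agree; it does not verify that the primers anneal to any particular isolate.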
DNA Purification and Sequencing and Alignment
Polymerase chain reaction products were purified for sequencing using the QIAquick PCR purification kit (Qiagen) according to the manufacturer's protocols. Sequencing of PCR products was performed with an automated sequencer (PRISM automated sequencer, version 3100; ABI, Foster City, CA). For internal control, a random selection of 25% of the samples was reamplified, sequenced, and analyzed. Sequences were aligned with Clustal, version 1.81, and optimized by hand using the BioEdit program.14
Phylogenetic trees were generated with PHYLIP version 3.572c15 using DNADIST with the maximum likelihood model and a transition-to-transversion ratio of 2.0.16 Bootstrap confidence intervals were calculated by randomly permuting the sequence alignment 1000 times with SEQBOOT, trees were generated as described previously, and the consensus topology was derived by the use of CONSENSE.17 VarPlot was used to calculate maximum likelihood, nonsynonymous distance (dN), and synonymous distance (dS) in a “sliding window” of nucleotide sequence,18 incorporating the method of Nei and Gojobori to generate dN and dS.19 Regions of greatest diversity were analyzed separately by phylogenetic analysis, and outlying sequences were tested for recombination using SIMPLOT.20 Nucleotide positions are in relation to HXB2 using the HIV numbering engine, and reference sequences for different HIV-1 group M subtypes were obtained from Los Alamos (http://hiv-web.lanl.gov/seq-db.html). The sequences generated for this article have the accession numbers AY635691 to AY635762.
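The resampling step behind the bootstrap values can be illustrated with a minimal Python sketch. It assumes the standard bootstrap, in which alignment columns are drawn with replacement to build each pseudoreplicate; it is not the SEQBOOT implementation, and the function names and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping rows aligned."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Build n_replicates pseudoreplicate alignments from one alignment."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Toy alignment for illustration only; not data from this study.
    toy = {"s1": "ACGTAC", "s2": "ACGTTC", "s3": "ATGTAC"}
    replicates = bootstrap_replicates(toy, n_replicates=1000)
    print(len(replicates), replicates[0])
```

Each pseudoreplicate would then be passed through the same distance and tree-building steps, and the proportion of replicate trees containing a given clade gives its bootstrap support.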
The intersubject distances for each subtype were determined for the portions of the gag; protease; reverse transcriptase; and C2, V3, C3, and V4 envelope regions. All pair-wise distances were generated with DNADIST, using the maximum likelihood option. The lower triangular matrix was imported into Microsoft Excel to obtain median and standard deviation (SD) values.
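The summary step described above was carried out in Microsoft Excel; an equivalent calculation can be sketched in Python for readers who wish to reproduce it. The sketch assumes a square, symmetric distance matrix and uses the sample standard deviation (an assumption, because the text does not state which form was used). The function names and the toy matrix are hypothetical.

```python
import statistics


def lower_triangle(matrix):
    """Return the entries below the diagonal, one per pair of subjects."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def summarize(matrix):
    """Median and sample standard deviation of the pair-wise distances."""
    distances = lower_triangle(matrix)
    return statistics.median(distances), statistics.stdev(distances)


if __name__ == "__main__":
    # Toy 4 x 4 distance matrix for illustration only; not study data.
    toy = [
        [0.00, 0.01, 0.03, 0.05],
        [0.01, 0.00, 0.02, 0.04],
        [0.03, 0.02, 0.00, 0.06],
        [0.05, 0.04, 0.06, 0.00],
    ]
    print(summarize(toy))
```

Because each subject contributes to several pairs, the pair-wise distances are not independent observations, which is why the summary values are descriptive only.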
Using a Student t test, Microsoft Excel was used to compare the log-transformed viral loads between cities and subtypes. Univariate logistic regression models utilizing a Fisher exact test were used to estimate demographic differences between cities and HIV status. Differences in HIV incidence rates were calculated using a normal theory test. Linear regression using STATA was used on the intersubject distances because of the interdependence of the pair-wise comparisons generated by the distance matrix.
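The viral load comparison can be illustrated with a short Python sketch of a pooled-variance Student t statistic computed on log10-transformed values. This is a sketch under stated assumptions rather than the Excel procedure used in the study: the function names are hypothetical, the example values are invented for illustration, and the P value (which requires the t distribution with the returned degrees of freedom) is left to a statistical table or library.

```python
import math
import statistics


def log10_loads(copies_per_ml):
    """Log10-transform viral loads given in copies/mL."""
    return [math.log10(x) for x in copies_per_ml]


def student_t(sample_a, sample_b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical viral loads (copies/mL); not data from this study.
    city_a = log10_loads([1.2e4, 5.0e4, 8.1e3, 2.3e5])
    city_b = log10_loads([3.4e4, 9.9e3, 6.0e4, 1.5e4])
    print(student_t(city_a, city_b))
```

The log transformation is applied first because viral loads span several orders of magnitude and are closer to normally distributed on the log scale.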
RESULTS
Heroin users (N = 582) were prospectively enrolled between September and October 2000; they were mostly male (96%), single (75%), and uneducated (>91% without a high school education), with a median age of 25 years (SD = 5.9 years, range: 19-49 years). The distribution of ethnicities differed by city; in Pingxiang, 81% were Zhuang, and in Binyang, 94% were Han. HCV seroprevalence was 75%, and HBV positivity (HBsAg and anti-HBsAg) was 60%. In September and October 2000, 126 (22%) subjects were HIV-positive, including 32 recently infected individuals. The HIV-1 prevalence was 25% in Pingxiang and 19% in Binyang, with the ethnicities in each city being proportionally represented. Although incidence rates differed between Binyang and Pingxiang (8.0 vs. 5.2 infections per 100 person-years, respectively), these findings were not significant (P = 0.363).
Viral Load Testing
Viral load testing was performed on 106 of 126 HIV-1-seropositive subjects. Ninety-nine of 106 samples tested had detectable viral loads. The median viral load for the cohort was 3.4 × 10^4 copies/mL. There was no statistically significant difference in viral loads between cities or by subtype. Removal of the recent seroconverters (n = 22) from the analysis did not change this finding. Finally, there was no statistical difference in viral load (P = 0.33) between those with risks for sexually acquired HIV (n = 12) versus those with purely parenteral risks for having acquired the infection (n = 86).
Reverse Transcriptase Polymerase Chain Reaction of gag/pol and env
Ninety-two of 99 subjects with detectable viral loads had sufficient volumes for RNA extraction. Of the 92 samples extracted, 72 were positive for gag/pol and 75 were positive for env. Sixty of 92 samples were concordantly positive by both sets of primers. Fifteen samples were env-positive but gag/pol-negative, and 12 samples were gag/pol-positive but env-negative. Five samples were negative for both primer sets, and these had viral loads in the range of 10^3 copies/mL.
Phylogenetic Analysis of gag/pol and env Regions
All positive PCR products were sequenced. Of the 72 gag/pol sequences and 75 env sequences, only 58 env and 66 gag/pol sequences provided data for the entire amplicon. A total of 47 subjects had sequence data for both gag/pol and env, whereas 19 had sequence data for gag/pol alone and 11 had sequence data for env alone. Partial sequence data, which were used to assign a subtype to a subject, were obtained from an additional 2 gag-pol and 8 env amplicons (data not shown but submitted to GenBank). A total of 84 subjects were assigned a subtype. The distribution of subtypes was regionally based (Fig. 1). Thirty of 34 subjects infected with CRF01_AE were from Pingxiang, whereas 48 of 50 CRF08_BC-infected subjects were from Binyang. Interestingly, the 3 CRF08_BC-infected subjects from Pingxiang were recently infected. The 2 subjects from Binyang infected with CRF01_AE were chronically infected individuals who had a motif sequence more closely related to Thai sequences from the early 1990s (92TH0015 and THCM240) than to the rest of the sequences found in Pingxiang. Subject 90 from Pingxiang presented with a possible recombinant virus having a CRF01_AE gag/pol and a CRF08_BC env sequence.
Amino Acid Analysis of Protease and Reverse Transcriptase
Consistent with a population naive to antiretroviral therapy, protease and reverse transcriptase regions did not demonstrate sequence motifs indicative of antiretroviral resistance, except for the following mutations. All subtype CRF01_AE sequences contained a methionine-to-isoleucine mutation at position 36, a minor ritonavir resistance mutation. Isoleucine at this position is seen frequently in CRF01_AE sequences, however. One CRF01_AE sequence contained a valine-to-alanine mutation at position 82, which is associated with resistance to multiple protease inhibitors.
Amino Acid Analysis C2-V4 Envelope
The translated amino acid sequences of envelope C2-V4 are shown in Figure 2. The CRF08_BC sequences are consistent with data previously generated on this epidemic, showing little variation in V3 and a great deal of length variation in V4.11 The median length of the amino acid length polymorphism in the CRF08_BC sequences was 6 (range: 0-15), whereas that of the CRF01_AE sequences was 0 (range: 0-5). The V3 loop of the CRF08_BC sequences was conserved with the dodecapeptide RIGPGQTFYATG21 present in 97% of samples. The CRF08_BC envelope sequences are clearly of Indian origin, with a frequency of the residues associated with Indian subtype C22 (HXB2 location, amino acid residue, frequency) as follows: 290 Q 0.79, 335 K 0.31, 336 D 0.59, 340 E 0.92, 363 S 1.00, 415 G 0.95, 429 E 0.97, and 440 E 0.97. All CRF01_AE samples sequenced bore the EV phenotype (a valine residue 12 amino acids downstream from the V3 loop) consistent with the previous data from Guangxi, China and North Vietnam.5,10 The EV phenotype associated with IDUs from Pingxiang and the North Vietnamese provinces of Quang Ninh and Lang Son increased from 92% in 199810 to 100% in the current study. None of the CRF01_AE sequences from Pingxiang bore the EM phenotype (a methionine immediately before the GPGQ core of the V3 loop). The frequency of the EMV phenotype (a combination of EM and EV phenotypes) in Pingxiang increased to 34% from the 4% seen in a previous study.10
A trend (P = 0.06) was seen in the number of N-linked glycosylation sites in the V4 region of env between those CRF01_AE-infected individuals in Pingxiang with risk for sexually acquired HIV (n = 5, median = 4, range: 3-4) compared with those with purely parenteral risks (n = 5, median = 3, range: 2-4). Risks for sexually acquired HIV were defined as being HIV-positive but HCV-negative and/or those who admitted to having sex “for reasons other than love.” Because parenteral transmission of HCV is 10-fold higher than that of HIV23 and HCV-negative HIV-positive subjects are almost nonexistent in other intravenous drug-using cohorts,24 these individuals are far more likely to have obtained their HIV infection through sexual transmission. Of those with sexual risks, 4 of 5 individuals had 4 N-linked glycosylation sites, whereas 7 of 15 parenterally infected subjects had 4 N-linked glycosylation sites in the V4 loop (see Fig. 2).
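Counting potential N-linked glycosylation sites in a translated envelope sequence can be sketched as follows. The sketch assumes the conventional sequon definition, N-X-S/T with X being any residue except proline, and removes alignment gap characters before searching. The text does not state the exact counting rule used in the study, so this is an illustration only; the function name and the toy sequences are hypothetical.

```python
import re

# Sequon N-X-[S/T], where X is any residue except proline. The lookahead
# keeps the match to the asparagine alone, so overlapping sequons
# (for example N-N-T-S) are each counted.
SEQUON = re.compile(r"N(?=[^P][ST])")


def count_sequons(protein):
    """Count potential N-linked glycosylation sites in a protein string."""
    ungapped = protein.upper().replace("-", "")
    return len(SEQUON.findall(ungapped))


if __name__ == "__main__":
    # Toy peptides for illustration only; not sequences from Figure 2.
    for peptide in ("NNTS", "NPTS", "ANGTNPS", "N-GT"):
        print(peptide, count_sequons(peptide))
```

Gaps are stripped first because a sequon interrupted by an alignment gap is still contiguous in the actual protein.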
The protease and reverse transcriptase regions consistently showed greater variation in CRF01_AE versus CRF08_BC (data not shown). Analysis of the envelope gene demonstrated greater variation for CRF01_AE than for CRF08_BC, with the most pronounced difference being at V3. The C3 region held as much variation as V4 and V3. Sequence data from the C3 (aa codons 100-130) and V4 (aa codons 150-180) regions were analyzed separately by phylogenetic analysis to look for potential recombinant sequences that could contribute to the high degree of variation found in these regions. Subjects 419 and 430 were outliers in the V4 tree, but review of the sequence data revealed a large number of indeterminate bases in subject 430 and a repeated-sequence motif in subject 419, although the sequence data were otherwise consistent with CRF08_BC sequences.
The Δd or differences in dN and dS distances (dN−dS) were mostly negative for protease and reverse transcriptase consistent with regions of biologic constraint (Fig. 3A, B). On the other hand, the envelope gene displayed a number of regions with a positive Δd associated with immune selection (see Fig. 3C). For AE, these regions included V3 and C3 and were most pronounced in V4. In contrast, the CRF08_BC sequences only had a positive Δd in C3 and V4.
Median intersubject variation for CRF01_AE and CRF08_BC differed by genetic region (Fig. 4). For gag, protease, and reverse transcriptase, the median variations were 2.05 (±1.78), 1.37 (±1.43), and 1.16 (±1.27) for CRF01_AE and 1.02 (±0.81), 0.34 (±0.68), and 0.69 (±0.55) for CRF08_BC, which were significantly different for all 3 regions (P = 0.001, P = 0.002, and P < 0.001, respectively). The median intersubject distances from the envelope regions C2, V3, C3, and V4 were 3.23 (±1.26), 4.99 (±2.22), 7.91 (±4.14), and 7.78 (±5.10) for CRF01_AE and 1.61 (±0.99), 0.00 (±0.12), 4.95 (±2.18), and 4.77 (±7.25) for CRF08_BC, which were significantly different for C2, V3, and C3 but not for V4 (P < 0.001, P < 0.001, P < 0.001, and P = 0.280, respectively). These distances were substantiated by greater branch lengths between CRF01_AE sequences in both the gag/pol and env trees.
Statistical Differences Between Cities and HIV Serostatus
HIV-positive drug users differed from HIV-uninfected drug users in Binyang, because all HIV-positive individuals were HCV-positive (100% vs. 74%; P < 0.0001) and admitted to more injection drug use (93% vs. 73%; P < 0.0001). HIV-positive drug users from Pingxiang were also more likely to be infected with HCV (96% vs. 76%; P < 0.0001) and admitted to more intravenous drug use (98% vs. 86%; P < 0.005) than HIV-negative individuals. Additionally, the HIV-positive subjects from Pingxiang were more sexually active (90% vs. 72%; P = 0.005), purchased their drugs from nonlocal locations (33% vs. 20%; P < 0.05), and had been injecting heroin for a longer period of time than HIV-uninfected individuals (52% vs. 26% for >4 years; P < 0.0001; data not shown). HIV-positive drug users from Pingxiang differed from HIV-infected drug users in Binyang in that they were younger (66% vs. 41% <35 years old; P < 0.01), had injected heroin longer (52% vs. 25% for >4 years; P < 0.05), had sex for reasons other than love (36% vs. 6%; P < 0.001), and did not purchase their heroin locally (31% vs. 6%; P < 0.001; Table 1).
DISCUSSION
This study reveals 2 distinct HIV epidemics in Guangxi province in China. Centrally in Binyang, the HIV epidemic is dominated by CRF08_BC, which is spreading in a manner consistent with a parenterally based epidemic. The intersubject variation of V3 is extremely low, and epidemiologic evidence supports confined routes of transmission in this epidemic. All HIV-infected subjects are infected with HCV. There is low admission to sex for reasons other than love, indicative of low to infrequent use of commercial sex workers. Almost all the drug users obtain their drugs locally. Although there were 2 subjects infected with a CRF01_AE strain in Binyang, these sequences were different from the CRF01_AE epidemic in Pingxiang.
The HIV epidemic in the city of Pingxiang is not consistent with a strictly parenterally based epidemic. The intersubject viral variation in V3 is greater than that seen in other IDU-predominant epidemics.9,25 Epidemiologic findings support the molecular evidence, with the potential for sexual transmission and multiple introductions of the HIV-1 epidemic into the drug users of Pingxiang. Additionally, the increase of the EMV phenotype from 4% in 19985 to 34% suggests the migration or expansion of this southern variant in the north. Finally, further evidence of multiple introductions of the HIV-1 epidemic into the drug users in Pingxiang comes from the presence of the CRF08_BC strain in 3 newly infected individuals and the presence of a recombinant (CRF01_AE gag-pol and CRF08_BC env) in a chronically infected individual.
The extreme lack of variation in V3 that is seen in the CRF08_BC-infected subjects in Guangxi described here and in a previous study11 gives rise to the possibility that this region is a genetic bottleneck constrained for functional reasons as a result of cell tropism or other factors. Previous studies of a subtype B epidemic in the Netherlands25 demonstrated a log lower dN in the parenterally infected subjects compared with the homosexually infected subjects. Less selection pressure during transmission could contribute to a slower increase of population-based heterogeneity over time.
In summary, 2 different HIV-1 epidemics converge in Guangxi province in China. In the centrally located city of Binyang, the CRF08_BC epidemic is similar to that described in the Yunnan province. This epidemic is characterized by a lack of variation in V3 and epidemiologic data that support dissemination from a single source through parenteral transmission. In the southern city of Pingxiang, the CRF01_AE epidemic is spread parenterally and sexually. Three recently infected individuals harboring the CRF08_BC subtype and a chronically infected individual infected with a novel recombinant AE-BC strain suggest that the CRF08_BC subtype has been introduced into the drug-using population of Pingxiang.
REFERENCES
1. Sun X, Nan J, Guo Q. AIDS and HIV infection in China. 1994;8(Suppl 2):S55-S59.
2. Zheng X, Tian C, Choi KH, et al. Injecting drug use and HIV infection in southwest China. 1994;8(Part 1):1141-1147.
3. Yu ES, Xie Q, Zhang K, et al. HIV infection and AIDS in China, 1985 through 1994. Am J Public Health.
4. Yu XF, Chen J, Shao Y, et al. Two subtypes of HIV-1 among injection-drug users in southern China.
5. Yu XF, Chen J, Shao Y, et al. Emerging HIV infections with distinct subtypes of HIV-1 infection among injection drug users from geographically separate locations in Guangxi Province, China. J Acquir Immune Defic Syndr.
6. Yu XF, Liu W, Chen J, et al. Rapid dissemination of a novel B/C recombinant HIV-1 among injection drug users in southern China.
7. Beyrer C, Razak MH, Lisam K, et al. Overland heroin trafficking routes and HIV-1 spread in south and south-east Asia. AIDS.
8. Quan VM, Chung A, Long HT, et al. HIV in Vietnam: the evolving epidemic and the prevention response, 1996 through 1999. J Acquir Immune Defic Syndr.
9. Kato K, Shiino T, Kusagawa S, et al. Genetic similarity of HIV type 1 subtype E in a recent outbreak among injecting drug users in northern Vietnam to strains in Guangxi Province of southern China. AIDS Res Hum Retroviruses.
10. Kato K, Kusagawa S, Motomura K, et al. Closely related HIV-1 CRF01_AE variant among injecting drug users in northern Vietnam: evidence of HIV spread across the Vietnam-China border. AIDS Res Hum Retroviruses.
11. Yu XF, Liu W, Chen J, et al. Maintaining low HIV type 1 env genetic diversity among injection drug users infected with a B/C recombinant and CRF01_AE HIV type 1 in southern China. AIDS Res Hum Retroviruses.
12. Piyasirisilp S, McCutchan FE, Carr JK, et al. A recent outbreak of human immunodeficiency virus type 1 infection in southern China was initiated by two highly homogeneous, geographically separated strains, circulating recombinant form AE and a novel BC recombinant. J Virol.
13. Michael NL, Herman SA, Kwok S, et al. Development of calibrated viral load standards for group M subtypes of human immunodeficiency virus type 1 and performance of an improved AMPLICOR HIV-1 monitor test with isolates of diverse subtypes. J Clin Microbiol.
14. Hall T. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser.
15. Felsenstein J. PHYLIP: phylogeny inference package (version 3.2). Cladistics.
16. Carr JK, Salminen MO, Koch C, et al. Full-length sequence and mosaic structure of human immunodeficiency virus type 1 isolate from Thailand. J Virol.
17. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution.
18. Ray SC, Wang Y-M, Laeyendecker O, et al. Acute hepatitis C virus structural gene sequences as predictors of persistent viremia: hypervariable region 1 as a decoy. J Virol.
19. Nei M, Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol.
20. Lole KS, Bollinger RC, Paranjape RS, et al. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol.
21. Tripathy S, Renjifo B, Wang WK, et al. Envelope glycoprotein 120 sequences from primary HIV-1 isolates from Pune and New Delhi, India. AIDS Res Hum Retroviruses.
22. Shankarappa R, Chatterjee R, Learn GH, et al. Human immunodeficiency virus type 1 env sequences from Calcutta in eastern India: identification of features that distinguish subtype C sequences from India from other subtype C sequences. J Virol.
23. Shiao J, Guo L, McLaws M-L. Estimation of the risk of bloodborne pathogens to health care workers after a needlestick injury in Taiwan. Am J Infect Control.
24. Thomas DL, Astemborski J, Rai RM, et al. The natural history of hepatitis C virus infection: host, viral, and environmental factors. JAMA.
25. Goudsmit J, Lukashov VV, Van Ameijden EJC, et al. Impact of sexual versus parenteral transmission events on the evolution of gag and env genes of HIV type 1. AIDS Res Hum Retroviruses.