The genetic diversity of HIV-1 is extensive, as a result of an error-prone reverse transcriptase and recombination between the two copies of the RNA genome packaged in the virion. Worldwide, HIV-1 can be classified into at least 10 genetic subtypes. Recombination between subtypes has led to the generation of many circulating recombinant forms (CRF), which are recombinants of defined structure spreading in populations. Already 11 different CRF have been identified. The complexity of the HIV-1 epidemic, with 10 subtypes and at least 11 CRF, is evident.
The HIV-1 epidemic in South America has been concentrated largely in high-risk groups, including injecting drug users (IDU) and homosexual men. The distribution of HIV-1 genetic subtypes in South America has been thought to resemble that of North America, with a predominance of subtype B [4,5]. Unlike North America, however, a small proportion of subtype F has been consistently found in Brazil and Argentina since the 1980s [2,4]. One full genome of subtype F from Brazil has been sequenced (93BR020), but other samples reported as subtype F were characterized in only one segment of envelope.
This report, the first to apply full genome sequencing to a large number of strains from South America, suggests that most of the viruses thought to be subtype F in South America are actually recombinants between subtypes F and B. The structure and distribution of BF recombinants are complex and do not resemble patterns seen in any other region of the world. Understanding the distinctive molecular epidemiology of HIV-1 in South America will be critical to the implementation of a sound vaccine strategy.
HIV-seropositive subjects were sampled after informed consent in the conduct of four different studies. Twelve were women or their sexual partners in Buenos Aires, Argentina, from a study described elsewhere (Avila et al., in preparation); one was an adult woman from the northern province of Misiones. Three were vertically infected children in Buenos Aires (Gomez-Carrillo et al., in preparation), and five were samples from ongoing surveillance in Bolivia and Uruguay. Peripheral blood mononuclear cells (PBMC) were separated by Ficoll–Hypaque and maintained at −70°C. DNA was extracted from the PBMC of seropositive participants using the QIAamp DNA extraction kit (Qiagen, Valencia, CA, USA).
Most DNA samples were first genotyped using the envelope heteroduplex mobility assay (HMA) as described. Ten HMA subtype F, seven HMA subtype B, and four unscreened samples were subjected to virtually full genome amplification by nested polymerase chain reaction as described [8,9]. The amplicon was sequenced in the Applied Biosystems 3100 automated sequencer using Big Dye terminators (Applied Biosystems Inc., Foster City, CA, USA). To complete the full genome sequence, a small amplicon containing the long terminal repeat (LTR) and the gag leader region was amplified separately by nested polymerase chain reaction using LTR1 (5’–CACACAAGGCTAYTTCCCTGA–3’) and JL17 (5’–CATTCTGCAGCTTCCTCATTGAT–3’) for the first round, and LTR2 (5’–TCCYCYTGGCCTTAACCGAAT–3’) and LTR3 (5’–TGGATGGTGCTWCAAGYTAGT–3’) for the second round. The cycling conditions for both rounds were 94°C for 12 min, then 35 cycles of 94°C for 45 s, 55°C for 30 s and 72°C for 2.5 min, followed by 72°C for 10 min. The full genome sequence was assembled from the 9 kb amplicon and the LTR amplicon if the overlapping regions matched (genetic distance < 5%).
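The assembly criterion above can be illustrated with a short sketch. The text does not state which distance measure was used for the overlap, so the example assumes an uncorrected p-distance (the proportion of differing sites) computed over aligned overlap regions; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions with a gap ('-') or an unresolved base ('N') in either
    sequence are skipped. This uncorrected measure is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites in the overlap")
    return mismatches / compared


def overlaps_match(overlap_9kb, overlap_ltr, max_distance=0.05):
    """Accept the assembly only if the overlap distance is below 5%."""
    return p_distance(overlap_9kb, overlap_ltr) < max_distance


# Toy overlap of 40 sites with a single mismatch (distance 0.025).
overlap_a = "ACGT" * 10
overlap_b = "ACGT" * 9 + "ACGA"
print(p_distance(overlap_a, overlap_b), overlaps_match(overlap_a, overlap_b))
```

A strict inequality is used because the criterion is stated as "< 5%"; an overlap at exactly 5% would be rejected under this reading.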
A multiple alignment of the newly derived full genome sequences with selected reference sequences was constructed. Phylogenetic trees were generated and the consistency of branching order was evaluated using SEQBOOT, DNADIST, NEIGHBOR, CONSENSE and DNAPARS modules of the Phylip Package (V3.52c) and TREETOOL.
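The idea behind the bootstrap step (the role played by SEQBOOT in the pipeline above) can be sketched as follows. This is a deliberately simplified stand-in, not the Phylip workflow: instead of building neighbor-joining trees and a consensus, it resamples alignment columns with replacement and counts how often a chosen reference is the single nearest sequence to the query by uncorrected p-distance. All names and the toy alignment are hypothetical.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two equal-length sequences."""
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def resample_columns(alignment, rng):
    """Build one pseudo-replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def bootstrap_support(alignment, query, candidate, replicates=100, seed=0):
    """Fraction of replicates in which `candidate` is the unique nearest
    sequence to `query` (a nearest-neighbour proxy for tree clustering)."""
    rng = random.Random(seed)
    others = [name for name in alignment if name != query]
    wins = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        dists = {name: p_distance(rep[query], rep[name]) for name in others}
        best = min(dists.values())
        nearest = [name for name, d in dists.items() if d == best]
        if nearest == [candidate]:
            wins += 1
    return wins / replicates


# Toy alignment: the query is identical to ref_B and differs from ref_F
# at every site, so ref_B is nearest in every replicate.
toy_alignment = {
    "query": "AAAAAAAAAA",
    "ref_B": "AAAAAAAAAA",
    "ref_F": "CCCCCCCCCC",
}
print(bootstrap_support(toy_alignment, "query", "ref_B"))
```

Ties are counted as no support, which is a conservative choice made for this sketch only.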
Distance scanning, bootscanning, and visual inspection of the alignment were used to determine the presence of recombination and to locate breakpoints [13,14]. After identification of the breakpoints, each segment was extracted and subjected to phylogenetic analysis to confirm the assignment of subtype. A bootstrap value joining the query sequences with a particular subtype was considered to be significant if it exceeded 70%. Recombinant breakpoint locations were designated relative to HXB-2 (GenBank accession no. K03455).
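A distance scan of the kind mentioned above can be sketched as a sliding window along the alignment in which the query is compared with one reference per subtype. The window and step sizes, the use of uncorrected p-distance, and the toy sequences are assumptions made for illustration; windows used on real genomes span hundreds of nucleotides, and the procedures actually used are those of references [13,14].

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two equal-length sequences."""
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def distance_scan(query, references, window=8, step=4):
    """Slide a window along the alignment and record, for each window,
    the distance from the query to every reference and the closest one.

    Returns a list of (window_start, closest_reference, distances).
    """
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        dists = {name: p_distance(query[start:end], ref[start:end])
                 for name, ref in references.items()}
        closest = min(dists, key=dists.get)
        results.append((start, closest, dists))
    return results


# Toy references that differ at every site, and a query that switches
# from the B-like to the F-like sequence at position 12.
ref_b = "ACGTACGTACGTACGTACGTACGT"
ref_f = "TGCATGCATGCATGCATGCATGCA"
query = ref_b[:12] + ref_f[12:]

for start, closest, dists in distance_scan(query, {"B": ref_b, "F": ref_f}):
    print(start, closest, dists)
```

The window in which the closest reference changes brackets the candidate breakpoint; in the study each resulting segment was then confirmed by separate phylogenetic analysis.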
Nucleotide sequence accession numbers
Sequences were submitted to GenBank under the accession numbers listed in Table 1.
Information about the 21 individuals included in this study is presented in Table 1. The majority were from Argentina, were asymptomatic at the time of sampling, and were infected heterosexually. Three were children vertically infected either at birth or during breast-feeding. Three Uruguayans were infected homosexually. Two individuals were Bolivian. Samples ARMA036 and ARMA037 were transmission linked.
Phylogenetic analysis of the virtually complete genomes of the 21 HIV-1 samples is shown in Fig. 1. Five samples clustered with subtype B (bootstrap 81%), and were confirmed to be non-recombinant by bootscanning and distance scanning: one was from Bolivia and four were from Argentina. The remaining 16 samples were BF recombinants. No subtype C samples were encountered in this sample set. Fig. 1 shows that the BF recombinants did not form a single genetic cluster as do other CRF (bold line); CRF01_AE, CRF02_AG and CRF03_AB illustrate this in Fig. 1. The many forms of BF recombinant were not joined by a significant bootstrap value. Because the majority of the genetic material in the recombinants derives from subtype F1, they are close to the F1 samples in the phylogenetic tree. The lengths of the branches in this neighbor-joining tree are proportional to the genetic distance, shown at the bottom of the figure, and it is clear that the genetic distances between the BF recombinants are quite large. Also shown is one full BF recombinant genome, BR029, collected in 1992 from Brazil, which is not closely related to any of the other BF recombinants.
An archive of samples from the past was not available, but it was possible to approximate a time series by studying viruses of children of varying ages who were infected at birth. Three children were available for study, born in 1984, 1986 and 1987, respectively. The child born in 1984, ARCH054, was infected with non-recombinant subtype B, but the children born in 1986 and 1987, ARCH014 and ARCH003, were both infected with BF recombinants. As seen in Fig. 1, they were genetically distinct from each other.
The detailed subtype structure of all of the virtually full-length genomes is shown in Fig. 2. The four samples with the same subtype structure at the top represent a new CRF, ‘CRF12_BF’, prototypic strain ARMA159. The next 12 samples were other BF recombinants found in Argentina, Uruguay and Bolivia. Whereas most had a structure related to the CRF, each had one or more additional segments from subtype B, as though they were back-crossed with subtype B. Each was unique, except for the two samples that were transmission-linked sexual partners, ARMA036 and ARMA037; these had identical structures.
The subtype structure of the new BF CRF was confirmed by phylogenetic analysis of the different segments of the genome (Fig. 3), using the full, two-LTR, proviral genome. The majority of the genome was subtype F1 but there were five segments of subtype B: (i) the gag leader and the beginning of the gag gene (HXB-2 nt 739-951); (ii) a short segment spanning the end of protease and the beginning of reverse transcriptase (RT) (HXB-2 nt 2473-2640); (iii) the region from the active site of the RT to its carboxy terminus (HXB-2 nt 3026-3688); (iv) a segment defined by the beginning and end of the vpu gene (HXB-2 nt 5944-6209); and (v) a small segment located in the intracellular part of the gp41 protein (HXB-2 nt 8483-8660).
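The mosaic structure described above can be written down as a small coordinate table. The sketch below records the five subtype B segments by their HXB-2 positions and treats every other position as subtype F1; it assumes the reported coordinates are inclusive at both ends, and the constant and function names are hypothetical.

```python
# Subtype B segments of CRF12_BF in HXB-2 coordinates, as listed in the text.
# Inclusive endpoints are an assumption of this sketch.
CRF12_BF_SUBTYPE_B_SEGMENTS = [
    (739, 951, "gag leader and beginning of gag"),
    (2473, 2640, "end of protease and beginning of RT"),
    (3026, 3688, "RT active site to carboxy terminus"),
    (5944, 6209, "vpu"),
    (8483, 8660, "intracellular part of gp41"),
]


def subtype_at(position):
    """Return 'B' if the HXB-2 position lies in a subtype B segment,
    otherwise 'F1' (the remainder of the genome)."""
    for start, end, _label in CRF12_BF_SUBTYPE_B_SEGMENTS:
        if start <= position <= end:
            return "B"
    return "F1"


def total_subtype_b_length():
    """Total number of nucleotides assigned to subtype B."""
    return sum(end - start + 1
               for start, end, _label in CRF12_BF_SUBTYPE_B_SEGMENTS)


print(subtype_at(800), subtype_at(2000), total_subtype_b_length())
```

Under the inclusive-coordinate assumption the subtype B segments sum to 1488 nucleotides, consistent with the statement that the majority of the genome is subtype F1.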
It is characteristic of most CRF that they are descended from one specific parent from each subtype. The subtype A regions of CRF02_AG form a separate cluster within the subtype A lineage, for example. This pattern is not seen in the phylogenetic trees for CRF12_BF; it neither forms a separate cluster as a full-length genome (Fig. 1) nor do the subregions of subtype B or F specifically cluster (Fig. 3).
HIV-1-seropositive samples collected in a variety of research projects in three South American countries were studied by phylogenetic analysis of the full HIV-1 genomes. Most of the 21 samples were recombinants between subtypes B and F. Non-recombinant subtype F viruses were not found, even though HMA was used to pre-select for subtype F in 10 out of 21 cases. BF recombinants were found in Argentina, Uruguay and Bolivia. Not all of the HIV-positive samples from the various studies mentioned were included in this full genome analysis, and so the proportion of BF recombinants in the sample set does not necessarily reflect the population-based prevalence. It demonstrates, however, that BF recombinants are not uncommon in the regions studied.
The BF recombinants present a surprisingly diverse collection of genetic structures. One particular structure, found in four samples, two from Argentina and two from Uruguay, has been designated CRF12_BF (prototype strain ARMA159) as a result of this study. It is largely subtype F, with segments of subtype B in gag, pol, vpu and env. Unlike CRF in other geographical locations, however, CRF12_BF does not represent the majority of the BF recombinants found. The remaining 12 BF recombinants were similar to the CRF in some recombinant breakpoints, but had additional subtype B segments. They may have arisen by different recombination events between the CRF and subtype B in various individuals, generating a large population of unique BF recombinants. In Asia, where CRF01_AE and subtype B co-circulate, or in West Africa, where CRF02_AG is the predominant form, a variety of other recombinants of similar structure related to the ‘main’ CRF have not been seen.
Assessing the epidemic history of these BF recombinants in South America is challenging because of the lack of archival samples. Two BF recombinants found in vertically infected children, ARCH014 and ARCH003, neither of which was CRF12_BF, suggest the circulation of these forms since the mid-1980s. The genetic diversity between samples also suggests an early spread of the recombinants. In general, the longer the period of time since a common ancestor, the greater is the genetic distance. Fig. 3 shows that the diversity between the CRF12_BF recombinants within each region of the genome is similar to the diversity within the other subtypes. This suggests that the time since a common CRF12_BF ancestor is not dramatically different from that of the other subtypes. On the basis of these two lines of evidence, it is unlikely that BF recombinants are of recent origin in South America.
The HIV-1 epidemic in several other locations is dominated by CRF. In Southeast Asia, the most common form of HIV-1 is CRF01_AE. In west and west central Africa, CRF02_AG is the most common genetic form [18,19]. Several IDU populations have been found to have CRF that are quite distinctive for each drug-using network [20,21]. Each of these IDU CRF is of recent origin and has not yet spread to the general population. The molecular epidemiology of CRF12_BF fits none of these patterns. It is not the most common genetic form in circulation in this study, although the other BF recombinant forms are apparently related to it. CRF12_BF is not of recent origin and is not restricted to any one risk group, but may represent the most common BF recombinant in South America. Any of the other ‘unique’ BF recombinants may also be circulating at some level, but were not detected as a CRF because of limited sampling.
HIV-1 molecular epidemiology in South America was once thought to be simple: the majority form was subtype B, with approximately 10% subtype F and a smattering of subtype C [4,5]. Full genome sequencing and analysis have shown that this conceptualization was incorrect. It is likely that most samples characterized as subtype F by HMA or partial sequencing were actually BF recombinants; pure subtype F is apparently rare. Several lines of evidence suggest that BF recombinants have been in wide circulation since the 1980s, and will constitute a significant proportion of the incident strains during future vaccine trials. It will become increasingly valuable to know, in full molecular detail, the circulating strains of HIV-1 in South America. The interpretation of vaccine coverage against different subtypes depends critically on this information.
1. Coffin JM. The virology of AIDS: 1990. AIDS 1990, 4 (Suppl. 1) : S1–S8.
2. McCutchan FE. Global diversity in human immunodeficiency viruses. In: Molecular evolution of HIV. Crandall KA (editor). Baltimore: The Johns Hopkins University Press; 1999.
3. HIV Sequence Database Web Site. Theoretical biology and biophysics. Los Alamos National Laboratory, http://hiv-web.lanl.gov. Last accessed April 2001.
4. Morgado MG, Sabino EC, Shpaer E, et al. V3 region polymorphisms in HIV-1 from Brazil: prevalence of subtype B strains divergent from North American/European prototype and detection of subtype F. AIDS Res Hum Retroviruses 1994, 10: 569–576.
5. Russell KL, Carcamo C, Negrete M, et al. Emerging genetic diversity of HIV-1 in South America. AIDS 2000, 14: 1785–1791.
6. Gao F, Robertson DL, Carruthers CD, et al. A comprehensive panel of near-full-length clones and reference sequences for non-subtype B isolates of human immunodeficiency virus type 1. J Virol 1998, 72: 5680–5698.
7. Delwart EL, Shpaer EG, Louwagie J, et al. Genetic relationships determined by a DNA heteroduplex mobility assay: analysis of HIV-1 env genes. Science 1993, 262: 1257–1261.
8. Salminen MO, Koch C, Sanders-Buell E, et al. Recovery of virtually full-length HIV-1 provirus of diverse subtypes from primary virus cultures using the polymerase chain reaction. Virology 1995, 213: 80–86.
9. Carr JK, Laukkanen T, Salminen M, et al. Characterization of subtype A HIV-1 from Africa by full genome sequencing. AIDS 1999, 13: 1819–1826.
10. Carr JK, Foley BT, Leitner T, Salminen M, Korber B, McCutchan F. Reference sequences representing the principal genetic diversity of HIV-1 in the pandemic. In: Human retroviruses and AIDS. Korber B, Kuiken CL, Foley B, et al. (editors). Los Alamos, NM: Theoretical Biology and Biophysics Group; 1998.
11. Felsenstein J. phylip – phylogenetic inference package (version 3.2). Clad 1989, 5: 164–166.
12. Maciukenas S. Treetool. Ribosomal Database Project, University of Illinois; 1994.
13. Carr JK, Salminen MO, Koch C, et al. Full-length sequence and mosaic structure of a human immunodeficiency virus type 1 isolate from Thailand. J Virol 1996, 70: 5935–5943.
14. Salminen MO, Carr JK, Burke DS, McCutchan FE. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res Hum Retroviruses 1995, 11: 1423–1425.
15. McCutchan FE, Hegerich PA, Brennan TP, et al. Genetic variants of HIV-1 in Thailand. AIDS Res Hum Retroviruses 1992, 8: 1887–1895.
16. Ellenberger DL, Pieniazek D, Nkengasong J, et al. Genetic analysis of human immunodeficiency virus in Abidjan, Ivory Coast reveals predominance of HIV type 1 subtype A and introduction of subtype G. AIDS Res Hum Retroviruses 1999, 15: 3–9.
17. McCutchan FE, Viputtigul K, de Souza MS, et al. Diversity of envelope glycoprotein from human immunodeficiency virus type 1 of recent seroconverters in Thailand. AIDS Res Hum Retroviruses 2000, 16: 801–805.
18. Nkengasong JN, Luo CC, Abouya L, et al. Distribution of HIV-1 subtypes among HIV-seropositive patients in the interior of Cote d’Ivoire. J Acquir Immune Defic Syndr 2000, 23: 430–436.
19. Carr JK, Wolfe ND, Eitel M, et al. The AG recombinant IbNG and novel strains of group M HIV-1 are common in Cameroon. Virology 2001, 286: 168–181.
20. Liitsola K, Tashkinova I, Laukkanen T, et al. HIV-1 genetic subtype A/B recombinant strain causing an explosive epidemic in injecting drug users in Kaliningrad. AIDS 1998, 12: 1907–1919.
21. Piyasirisilp S, McCutchan FE, Carr JK, et al. A recent outbreak of human immunodeficiency virus type 1 infection in Southern China was initiated by two highly homogeneous, geographically separated strains, circulating recombinant form AE and a novel BC recombinant. J Virol 2000, 74: 11286–11295.