HIV-1 exhibits an extraordinarily high genetic diversity, which derives from high mutation, recombination, and turnover rates. By means of these mechanisms, viruses of HIV-1 group M (which is responsible for the global pandemic) have diversified extensively into numerous clades since their introduction from a chimpanzee reservoir in central Africa.1 The extensive genetic diversity of HIV-1 represents one of the greatest hurdles to the development of effective vaccines and therapies. According to phylogenetic analyses of full-length genomes, HIV-1 group M isolates are classified into 9 subtypes and numerous recombinant forms. The latter are generated in individuals infected with ≥2 viruses of different subtypes. Recombinant forms that have spread epidemically are known as circulating recombinant forms (CRFs),2 of which at least 18 are currently recognized.3,4 To define an HIV-1 CRF, at least 3 epidemiologically unlinked viruses with identical mosaic structures must be characterized, at least 2 of them in near full-length genomes (>8 kb).2 In addition to CRFs, numerous recombinant forms detected in a single individual or a single epidemiologically linked cluster have been identified. These are known as unique recombinant forms (URFs) and are common in areas where ≥2 HIV-1 genetic forms cocirculate in the same population.5 Genetic characterization of HIV-1 clades allows us to examine their relationship to virus biology, immune responses, and resistance to antiretroviral drugs; to track the origin and spread of HIV-1 variants into different geographic areas and populations; to analyze interclade recombination; and to detect dual infections, including superinfections.
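The criteria for defining a CRF stated above can be written as a simple check. The following Python sketch is illustrative only: the `Isolate` record, its field names, and the representation of a mosaic structure as an ordered tuple of subtype labels (without breakpoint positions) are assumptions made here, not part of the nomenclature proposal, and the isolates in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    name: str
    mosaic: tuple      # assumed encoding: ordered subtype labels, e.g. ("A", "D", "A", "G")
    genome_kb: float   # length of the characterized genome segment, in kb
    cluster: str       # epidemiologic cluster; unlinked isolates have distinct values


def qualifies_as_crf(isolates, min_unlinked=3, min_near_full=2, near_full_kb=8.0):
    """Return True if some mosaic structure is shared by >= 3 unlinked viruses,
    at least 2 of them characterized in near full-length genomes (> 8 kb)."""
    by_mosaic = {}
    for iso in isolates:
        by_mosaic.setdefault(iso.mosaic, []).append(iso)
    for members in by_mosaic.values():
        # Keep one representative per epidemiologic cluster (the longest sequence).
        longest = {}
        for iso in members:
            seen = longest.get(iso.cluster)
            if seen is None or iso.genome_kb > seen.genome_kb:
                longest[iso.cluster] = iso
        unlinked = list(longest.values())
        near_full = [iso for iso in unlinked if iso.genome_kb > near_full_kb]
        if len(unlinked) >= min_unlinked and len(near_full) >= min_near_full:
            return True
    return False


# Hypothetical isolates, for illustration only.
structure = ("A", "D", "A", "G")
panel = [
    Isolate("iso1", structure, 8.9, "c1"),
    Isolate("iso2", structure, 8.7, "c2"),
    Isolate("iso3", structure, 1.5, "c3"),
]
print(qualifies_as_crf(panel))  # True
```

If two of the three isolates belonged to the same epidemiologic cluster, the same structure would instead be treated as a URF candidate, since only 2 unlinked viruses would remain.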
We have detected an unusually high diversity of HIV-1 genetic forms in Cuba, among them several novel recombinant forms, 2 of which formed well-supported clusters in phylogenetic trees of partial sequences.6 One of them, present in 7% of samples, was recently identified as a novel complex CRF (CRF18_cpx) of central African origin by analysis of near full-length genomes4; the other, present in 21% of samples, was of subtype A in env and of subtype D in pol.6 Here we analyze near full-length genome sequences of 4 of these recombinants, to determine whether they represent a CRF, and to identify related viruses from other countries. The results indicate that 3 of the analyzed recombinants indeed represent a novel CRF (CRF19_cpx), whereas the 4th is a unique inter-CRF mosaic virus of Cuban origin (CRF18/CRF19). Putative parental strains of the newly defined CRF were also identified in central Africa.
MATERIALS AND METHODS
Peripheral blood mononuclear cell lysates from 4 Cuban HIV-1-infected individuals harboring Dpol/Aenv recombinant viruses (CU7, CU29, CU38, and CU64) were used for polymerase chain reaction (PCR) amplification. Two of the subjects were homosexual or bisexual men, and 2 were women infected via heterosexual contact. All had acquired HIV-1 in Cuba, and there were no known epidemiologic links between them. Data on the subjects are shown in Table 1.
PCR Amplification, Sequencing, and Phylogenetic Analysis
Near full-length genome amplification was done by nested PCR in 4 overlapping segments and subsequent direct sequencing of the amplified products, as previously described.7,8 Sequence electropherograms were assembled with Seqman (DNASTAR, Madison, WI). Alignments with HIV-1 subtype reference sequences were done with the Clustal X program9 and were manually edited with Bioedit (Tom Hall, http://www.mbio.ncsu.edu/BioEdit/bioedit.html), taking into consideration the predicted amino acid sequences. Analyses of the recombinant structures and of the phylogenetic relationships of the recombinants were done by bootscanning10 with the Simplot program (version 3.2 beta).11 In this analysis, the bootstrap values supporting the phylogenetic relationship of the recombinants with reference viruses within a window sliding along the sequence alignment are plotted against the nucleotide position of the window's midpoint in the genome. The subtype assignment of each possible recombinant segment supported by bootscan analysis was also examined by neighbor-joining trees, based on Kimura's 2-parameter distances, constructed with the MEGA 3 program.12 Bootstrap values ≥70% were considered definitive for subtype classification.13 To identify viruses from other countries related to the analyzed Cuban recombinants, Basic Local Alignment Search Tool (BLAST) searches14 with partial genome segments were performed using the software available at the Los Alamos HIV Sequence Database Web page.3 The relationship of the database sequences with the Cuban recombinants was subsequently analyzed with neighbor-joining phylogenetic trees constructed with the MEGA 3 program.
The newly derived sequences are deposited in GenBank under accession nos. AY588970, AY588971, AY894994, and AY894995.
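The sliding-window idea behind bootscanning and the Kimura 2-parameter distance can be illustrated with a short Python sketch. It is a deliberately simplified stand-in, not the Simplot or MEGA implementation: instead of building a bootstrapped tree in each window, it resamples alignment columns and records which reference is closest to the query by Kimura 2-parameter distance. The function names, window settings, and toy sequences are assumptions made for illustration.

```python
import math
import random

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]  # skip gaps and ambiguity codes
    if not pairs:
        return float("inf")
    p = sum((a, b) in TRANSITIONS for a, b in pairs) / len(pairs)   # transitions
    q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / len(pairs)  # transversions
    x, y = 1 - 2 * p - q, 1 - 2 * q
    if x <= 0 or y <= 0:
        return float("inf")  # saturated: the distance is undefined
    return -0.5 * math.log(x) - 0.25 * math.log(y)


def nearest_reference_bootscan(query, references, window=400, step=50,
                               replicates=100, seed=0):
    """For each window, return (midpoint, {reference: % of replicates in which
    that reference is the closest to the query})."""
    rng = random.Random(seed)
    results = []
    for start in range(0, len(query) - window + 1, step):
        columns = range(start, start + window)
        votes = dict.fromkeys(references, 0)
        for _ in range(replicates):
            # Bootstrap: resample alignment columns with replacement.
            sample = [rng.choice(columns) for _ in range(window)]
            q = "".join(query[i] for i in sample)
            dists = {name: k2p_distance(q, "".join(ref[i] for i in sample))
                     for name, ref in references.items()}
            votes[min(dists, key=dists.get)] += 1
        support = {name: 100.0 * v / replicates for name, v in votes.items()}
        results.append((start + window // 2, support))
    return results


# Toy alignment (hypothetical): the query matches ref1 in its first half
# and ref2 in its second half.
ref1 = "ACGT" * 50
ref2 = "ACGC" * 50
query = ref1[:100] + ref2[100:]
for midpoint, support in nearest_reference_bootscan(
        query, {"ref1": ref1, "ref2": ref2}, window=50, step=50):
    print(midpoint, support)
```

Plotting the per-window support against the window midpoint gives the kind of curve described above; a change in the best-supported reference along the genome marks a candidate breakpoint.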
RESULTS AND DISCUSSION
The bootscan analyses showed that the genomes of 3 of the analyzed recombinants (CU7, CU29, and CU38) comprised segments clustering alternately with references of subtypes A, D, and G (Fig. 1) and exhibited virtually coincident recombinant structures and uniform clustering with each other along the genome. Subtype assignments for each segment derived from the bootscan analyses were confirmed with neighbor-joining trees, in each case, with a bootstrap value of ≥70% supporting their relationship with subtype references (Fig. 1). In 2 short segments, in the 5′ portion of vif and in the cytoplasmic domain of gp41, respectively (Fig. 1, partitions V and IX), in which subtype affiliation of the recombinants was unresolved by bootscan analysis, references of subtypes G and A grouped in a multisubtype cluster, within which subtype G references formed a subcluster; the Cuban recombinants branched in the AG clusters apart from the subtype G subcluster, and therefore it was assumed that they were of subtype A in these segments. In subtype A segments in which sub-subtype references formed well-supported subclusters, the Cuban recombinants did not branch within the sub-subtype A2 or A3 subclusters; in 2 of the partitions (III and VII), they branched in a cluster formed by viruses of sub-subtypes A1 and A3. The trees of each segment also confirmed the common ancestry of CU7, CU29, and CU38, which formed a subcluster apart from the subtype references. These results indicate that the 3 viruses represent the same recombinant form, derived from subtypes A, D, and G, allowing us to define a new CRF, designated CRF19_cpx, a name that indicates the order of discovery and the complex recombinant structure involving >2 parental subtypes.2 The recombinant structure of CRF19_cpx inferred from bootscan analysis and neighbor-joining trees is shown at the top of Figure 1.
The subtype D segments comprise most of gag (except p17); the portion of pol coding for p6pol, protease, and reverse transcriptase; and nef. Five segments are of subtype A, corresponding to p17gag; the 5′ half of integrase; the 5′ portion of vif; most of gp120 and the external portion of gp41; and a short segment in the cytoplasmic domain of gp41. Three segments are of subtype G: the 3′ half of integrase; the midgenome region comprising a portion of vif, vpr, the first coding exons of tat and rev, vpu, and the 5′ end of env; and the transmembrane domain and the 5′ half of the cytoplasmic domain of gp41.
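A mosaic structure of this kind is obtained by collapsing window-by-window subtype support into contiguous segments. The Python sketch below shows one way to do so under stated assumptions: the input format (window midpoint plus percent support per subtype), the function name, and the numbers in the usage lines are hypothetical, and applying the ≥70% cut-off to individual windows is a simplification made here, since in this study that threshold was applied to neighbor-joining trees of each segment.

```python
def call_segments(scan, threshold=70.0):
    """Collapse per-window support into (first_midpoint, last_midpoint, label)
    segments; windows whose best support is below the threshold are 'unresolved'."""
    segments = []
    for midpoint, support in scan:
        best = max(support, key=support.get)
        label = best if support[best] >= threshold else "unresolved"
        if segments and segments[-1][2] == label:
            # Same label as the previous window: extend the current segment.
            segments[-1] = (segments[-1][0], midpoint, label)
        else:
            segments.append((midpoint, midpoint, label))
    return segments


# Hypothetical per-window support values, for illustration only.
example_scan = [
    (200, {"A": 95.0, "D": 5.0}),
    (400, {"A": 80.0, "D": 20.0}),
    (600, {"A": 50.0, "D": 50.0}),
    (800, {"A": 10.0, "D": 90.0}),
]
print(call_segments(example_scan))
# [(200, 400, 'A'), (600, 600, 'unresolved'), (800, 800, 'D')]
```

Segments labeled unresolved correspond to regions, such as partitions V and IX, where subtype affiliation has to be settled by inspecting the trees directly.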
The 4th analyzed virus, CU64, originally characterized as Dpol/Aenv, when examined by bootscanning in the near full-length genome, comprised multiple segments that clustered alternately with subtypes A, D, G, and H. When CRF18_cpx and CRF19_cpx viruses were included in the bootscan analysis, together with the subtype references, CU64 clustered alternately along the genome only with CRF18_cpx and CRF19_cpx (Fig. 2). The relationship of CU64 with CRF18_cpx and CRF19_cpx was confirmed in phylogenetic trees of each genome segment, defined according to the results of the bootscan analysis (Fig. 2). This indicates that CU64 was generated by secondary recombination between viruses of both CRFs. The finding of this inter-CRF mosaic virus is a further indication of the circulation of CRF18_cpx and CRF19_cpx in Cuba. In a previous study in Cuba, in which segments of pol and env, representing approximately 1.5 kb of the genome, were analyzed, we detected URFs of Cuban origin in 11 of 105 samples.6 Of these, 3 are CRF18/CRF19 recombinants, 2 CRF19/B, 1 CRF19/G, 3 CRF18/B, and 2 B/G. The CRF18/CRF19 recombinant identified here was not among those detected in the previous study, because all originally analyzed segments derive from CRF19_cpx. This suggests that the 10% proportion of URFs of local origin previously reported in Cuba could increase substantially with analysis of full-length genomes. Thus, Cuba can be defined as another HIV-1 geographic “recombination hotspot,” similar to other areas, such as central Myanmar,15 Yunnan province of China,16 Argentina,17 Brazil,18 or east Africa.19,20
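The URF proportion cited above follows directly from the reported counts. A minimal Python check, using only figures stated in the text; counting CU64 as one additional URF among the same 105 samples is an assumption made here for illustration:

```python
# URFs of Cuban origin detected in partial pol and env segments (counts from the text).
urf_counts = {"CRF18/CRF19": 3, "CRF19/B": 2, "CRF19/G": 1, "CRF18/B": 3, "B/G": 2}
samples = 105

n_urf = sum(urf_counts.values())
pct = 100 * n_urf / samples
# Assumption: CU64 belongs to the same sample set and adds one URF.
pct_with_cu64 = 100 * (n_urf + 1) / samples

print(f"{n_urf} URFs in {samples} samples = {pct:.1f}%")   # 11 URFs in 105 samples = 10.5%
print(f"counting CU64 as well: {pct_with_cu64:.1f}%")      # counting CU64 as well: 11.4%
```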
In phylogenetic trees of partial segments, the subtype G portions of CRF19_cpx failed to form a specific cluster with subtype G viruses identified in Cuba, and the subtype D pol segment failed to cluster with the only subtype D virus identified by us in Cuba (results not shown). This and the lack of evidence of circulation of subtype A in Cuba support the non-Cuban origin of CRF19_cpx, which probably originated in sub-Saharan Africa, where subtypes A, D, and G are circulating.
To identify non-Cuban viruses related to CRF19_cpx, we searched for sequences deposited in the Los Alamos HIV Sequence Database having high BLAST similarity scores with CRF19_cpx, with subsequent analysis with phylogenetic trees. We found 7 viruses from sub-Saharan Africa related in partial segments to CRF19_cpx, 2 from Cameroon, 2 from Gabon, and 1 each from Senegal, Chad, and Niger. One of the Cameroonian viruses, CM53392, an AG intersubtype recombinant virus characterized in the near full-length genome,21 showed sequence homology to CRF19_cpx along approximately 5 kb of the genome. In this segment, the bootscan analysis of CM53392 identified 4 intersubtype breakpoints mapped to sites coincident with those of CRF19_cpx (Fig. 3A). In each of 4 segments of uniform subtype delimited by intersubtype breakpoints, CRF19_cpx and CM53392 clustered in phylogenetic trees with 79%-100% bootstrap support (Fig. 3A). G109, a subtype D virus from Gabon characterized in gag and env,22,23 clustered with the subtype D gag segment of CRF19_cpx (approximately 1 kb) with 99% bootstrap support (Fig. 3B). VI525, an AGH recombinant virus from Gabon, characterized in the full-length gag and env genes,22,24 clustered in the env coding region with CRF19_cpx with 100% bootstrap value (Fig. 3C). Four other central or west African viruses (98CM_K1153 from Cameroon, 99TCD_MN002 from Chad, 00NE013 from Niger, and 97SE-1181 from Senegal) formed a cluster in the env V3 region with CRF19_cpx, CM53392, and VI525, supported by 92% bootstrap value (Fig. 3D). G109 is not a CRF19_cpx virus, because it is of subtype D in the 0.9 kb available env sequence (where CRF19_cpx is of subtype A), and VI525 differs from CM53392 and CRF19_cpx, because it is of subtype H in gag. The results therefore suggest that G109 (subtype D) and CM53392 (AG recombinant) may represent the parental strains from which the ancestor of CRF19_cpx originated.
If this is the case, then VI525 may have originated by recombination between a CM53392-like AG recombinant and a subtype H virus, pointing to the possible existence of a yet-unrecognized AG recombinant CRF in central Africa, represented by CM53392.
With the identification of CRF19_cpx, the number of genetic forms currently reported to be circulating in Cuba is 4, the other 3 being subtypes B and G and CRF18_cpx.4,6 Our preliminary data obtained from the analysis of recent samples indicate that, in addition to these 4, there are currently at least 4 other HIV-1 genetic forms circulating in Cuba, excluding infections acquired in Africa (unpublished data). The high HIV-1 genetic diversity in Cuba presumably originates from the presence during several decades of large numbers of Cubans in various sub-Saharan African countries.25 HIV-1 infections in Cuba directly linked to Africa represented 25% of cases in 1990.26
In conclusion, the results of this study allow us to identify a novel CRF of central African ancestry in Cuba and further confirm the unusually high genetic diversity of HIV-1 in this country, which derives from the introduction of multiple African genetic forms and from the generation of mosaic viruses by recombination between locally circulating variants. They also show the high recombinogenic potential of HIV-1, visible in areas where diverse clades cocirculate, such as central Africa or Cuba. There, successive rounds of recombination can generate highly diverse genetic forms, some of which can become circulating through the acquisition of novel biologic properties or through their introduction into certain transmission networks. The genetic characterization of HIV-1 variants is only an initial but necessary step for multiple studies, including their relationship to virus biology, immune responses,27-29 validity of tests used in clinical practice,30-32 and resistance to antiretroviral drugs.33,34
The authors thank Ignacio Ruibal for providing clinical samples and Alicia Ballester, Pablo Martínez, and Aurora de Miguel, at the Genomic Unit of the Centro Nacional de Microbiología, Instituto de Salud Carlos III, for technical assistance.
1. Gao F, Bailes E, Robertson DL, et al. Origin of HIV-1 in the chimpanzee Pan troglodytes troglodytes. Nature
2. Robertson DL, Anderson JP, Bradac JA, et al. HIV-1 nomenclature proposal. Science
3. HIV Sequence Database [database online]. Los Alamos, NM: Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, 2004. Available at: http://www.hiv.lanl.gov, accessed October 6, 2005.
4. Thomson MM, Casado G, Sierra M, et al. Identification of novel complex circulating recombinant form (CRF18_cpx) of Central African origin in Cuba
5. Nájera R, Delgado E, Pérez-Álvarez L, et al. Genetic recombination and its role in the development of the HIV-1 . 2002;16(Suppl 4):S3-S16.
6. Cuevas MT, Ruibal I, Villahermosa ML, et al. High HIV-1 genetic diversity in Cuba
7. Thomson MM, Delgado E, Herrero I, et al. Diversity of mosaic structures and common ancestry of human immunodeficiency virus type 1 BF intersubtype recombinant viruses from Argentina revealed by analysis of near full-length genome sequences. J Gen Virol
8. Sierra M, Thomson MM, Ríos M, et al. The analysis of near full-length genome sequences of human immunodeficiency virus type 1 BF intersubtype recombinant viruses from Chile, Venezuela and Spain reveals their relationship to diverse lineages of recombinant viruses related to CRF12_BF. Infect Genet Evol
9. Thompson JD, Gibson TJ, Plewniak F, et al. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res
10. Salminen MO, Carr JK, Burke DS, et al. Identification of breakpoints in intergenotypic recombinants of HIV type 1 by bootscanning. AIDS Res Hum Retroviruses
11. Lole KS, Bollinger RC, Paranjape RS, et al. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol
12. Kumar S, Tamura K, Nei M. MEGA3: integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform
13. Hillis DM, Bull JJ. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst Biol
14. Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol
15. Takebe Y, Motomura K, Tatsumi M, et al. High prevalence of diverse forms of HIV-1 intersubtype recombinants in Central Myanmar: geographical hot spot of extensive recombination. AIDS
16. Yang R, Xia X, Kusagawa S, et al. On-going generation of multiple forms of HIV-1 intersubtype recombinants in the Yunnan Province of China. AIDS
17. Thomson MM, Delgado E, Herrero I, et al. Diversity of mosaic structures and common ancestry of human immunodeficiency virus type 1 BF intersubtype recombinant viruses from Argentina revealed by analysis of near full-length genome sequences. J Gen Virol
18. Thomson MM, Sierra M, Tanuri A, et al. The analysis of near full-length genome sequences of HIV type 1 BF intersubtype recombinant viruses from Brazil reveals their independent origins and their lack of relationship to CRF12_BF. AIDS Res Hum Retroviruses
19. Dowling WE, Kim B, Mason CJ, et al. Forty-one near full-length HIV-1 sequences from Kenya reveal an epidemic of subtype A and A-containing recombinants. AIDS
20. Harris ME, Serwadda D, Sewankambo N, et al. Among 46 near full length HIV type 1 genome sequences from Rakai District, Uganda, subtype D and AD recombinants predominate. AIDS Res Hum Retroviruses
21. Carr JK, Torimiro JN, Wolfe ND, et al. The AG recombinant IbNG and novel strains of group M HIV-1 are common in Cameroon. Virology
22. Louwagie J, McCutchan FE, Peeters M, et al. Phylogenetic analysis of gag genes from 70 international HIV-1 isolates provides evidence for multiple genotypes. AIDS
23. Delaporte E, Janssens W, Peeters M, et al. 1996. Epidemiological and molecular characteristics of HIV infection in Gabon, 1986-1994. AIDS
24. Janssens W, Heyndrickx L, Fransen K, et al. Genetic and phylogenetic analysis of env subtypes G and H in central Africa. AIDS Res Hum Retroviruses
25. Torres-Anjel MJ. Macroepidemiology of the HIVs-AIDS (HAIDS) pandemic: insufficiently considered zoological and geopolitical aspects. Ann NY Acad Sci
26. Santana S, Faas L, Wald K. Human immunodeficiency virus in Cuba: the public health response of a Third World country. Int J Health Serv
27. Mascola JR, Louwagie J, McCutchan FE, et al. Two antigenically distinct subtypes of human immunodeficiency virus type 1: viral genotype predicts neutralization serotype. J Infect Dis
28. Thomson MM, Pérez-Álvarez L, Nájera R. Molecular epidemiology of HIV-1 genetic forms and its significance for vaccine development and therapy. Lancet Infect Dis
29. Binley JM, Wrin T, Korber B, et al. Comprehensive cross-clade neutralization analysis of a panel of anti-human immunodeficiency virus type 1 monoclonal antibodies. J Virol
30. Amendola A, Bordi L, Angeletti C, et al. Underevaluation of HIV-1 plasma viral load by a commercially available assay in a cluster of patients infected with HIV-1 A/G circulating recombinant form (CRF02). J Acquir Immune Defic Syndr
31. Antunes R, Figueiredo S, Bartolo I, et al. Evaluation of the clinical sensitivities of three viral load assays with plasma samples from a pediatric population predominantly infected with human immunodeficiency virus type 1 subtype G and BG recombinant forms. J Clin Microbiol
32. Gottesman BS, Grosman Z, Lorber M, et al. Measurement of HIV RNA in patients infected by subtype C by assays optimized for subtype B results in an underestimation of the viral load. J Med Virol
33. Kantor R, Katzenstein D. Drug resistance in non-subtype B HIV-1. J Clin Virol
34. Wainberg MA. HIV-1 subtype distribution and the problem of drug resistance. AIDS. 2004;18(Suppl 3):S63-S68.