Drug-resistance testing is important in the follow-up of HIV-1-infected patients. It is recommended in cases of treatment failure and in some situations also for treatment-naive patients [1,2]. Algorithms for interpretation of genotypic drug resistance results have been mainly developed based on subtype B resistance-associated mutation patterns. Resistance development in non-B subtypes can differ from that in subtype B. For example, some mutation patterns associated with protease inhibitor (PI) resistance, rarely found in subtype B, occur more frequently in subtypes G, C and CRF01_AE [3–7]. In patients infected with subtype C viruses, efavirenz therapy selects for a new reverse transcriptase (RT) mutation, V106M, at a position related to resistance in subtype B, conferring cross-resistance to other non-nucleoside reverse transcriptase inhibitors. Kantor et al. analysed mutation frequencies in several non-B HIV-1 isolates from treatment-naive and treatment-experienced patients and identified five novel non-B subtype-specific, treatment-related positions: PR6, PR64 and RT102 for subtype C, PR15 for CRF02_AG, PR19 for subtype F, PR37 for subtype A and PR64 for CRF01_AE. Although several other reports describe different mutational patterns in subtype B and non-B viruses [9–11], a mutation at protease codon 89 has only been reported in subtype G in a preliminary report by our group and, more recently, in subtype F by Calazans et al. [12,13]. Here, we report an extended analysis of the association of protease amino acid 89V/I with failure of a PI-containing regimen in some non-B subtypes.
Sample selection and sequencing
The 902 sequences selected for analysis were derived from patients referred for resistance testing between 2001 and 2004, either for therapy failure or for baseline genotyping in drug-naive patients, at the Egas Moniz Hospital in Lisbon and the University Hospitals in Leuven. In addition, 52 pure subtype sequences from drug-naive patients were obtained from the Los Alamos Database (http://www.hiv-web.lanl.gov/) to increase our information on drug-naive sequences for the subtypes considered. The sequences from Egas Moniz Hospital were obtained by population sequencing using the ViroSeq 2.0 kit (Abbott Laboratories, Abbott Park, Illinois, USA) and the sequences from the University Hospitals in Leuven were obtained by population sequencing using either the ViroSeq 2.0 kit or an in-house protocol, thus obtaining protease (AA 1-99) and partial RT (AA 1-335). Subtype and recombination were assessed as described elsewhere. Epidemiological and treatment history information was collected.
Finally, a phylogenetic tree, including 105 randomly selected subtype G Portuguese sequences and 61 subtype G sequences isolated worldwide (downloaded from the Los Alamos database), was built to verify whether the subtype G Portuguese epidemic was representative of subtype G infections worldwide. The tree was built using the neighbour-joining method based on a Kimura two-parameter distance matrix. A total of 1000 bootstrap replicates were run to evaluate the reliability of each cluster.
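As an illustration of the distance measure named above, the following Python sketch computes the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], between two aligned nucleotide sequences, where P and Q are the proportions of transitions and transversions. It is not the software used in this study; the function name and the example sequences are hypothetical, and gaps and ambiguity codes are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    arg = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if arg <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg)


# Hypothetical 10-nucleotide fragments differing by one transition (A -> G).
example = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

A pairwise matrix of such distances is the input that a neighbour-joining implementation would then cluster.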
Statistical analysis of the genotypic data
For each subtype separately, sequences were stratified into four groups according to treatment status and resistance profile: drug-naive, failing a non-PI regimen, failing a PI-containing regimen without major resistance mutations, and failing a PI-containing regimen with major resistance mutations (major mutations were as defined by Johnson et al., 2004). We identified the wild-type (WT) amino acid at codon 89 in drug-naive patients for each of the analysed subtypes separately, and subsequently investigated the prevalence of mutations at this position with respect to the WT thus identified. Finally, we compared the results between the four strata using Fisher's exact test. Separate Fisher's exact tests were performed for: (a) patients on therapy failure without PI-containing regimen; (b) patients on therapy failure with PI-containing regimen and no major PI mutations; and (c) patients on therapy failure with PI-containing regimen and major PI mutations. Each of these groups was considered separately and was always compared with the drug-naive group.
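To make the comparison concrete, the sketch below implements a two-sided Fisher's exact test for a 2 × 2 table (mutant versus WT at codon 89, in a treated stratum versus the drug-naive stratum) using only the Python standard library. It is an illustration, not the code used for the analysis, and the counts in the example are hypothetical rather than taken from Table 1.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: strata (e.g. treated, drug-naive); columns: mutant, WT.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts: 20 of 50 treated sequences carry 89I/V,
# against 2 of 100 drug-naive sequences.
p_value = fisher_exact_two_sided(20, 30, 2, 98)
```

Each of the three treated strata would be tested against the drug-naive stratum in this way, one table per subtype and stratum.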
Bayesian network analysis of the genotypic associations with M89I/V
Bayesian networks have recently been proposed as an approach to analyse the linkage between treatment experience and different mutations in the HIV-1 genome. A Bayesian network describes a set of direct dependencies that together explain the observed correlations in the data. The Bayesian networks were used to suggest robust interactions of mutations 89I/V with PI experience or with mutations at other positions that are associated with PI treatment, to explore their potential role in resistance development and consequent therapy failure. Furthermore, a bootstrapping method was used to measure the reliability of each of the arcs present in the networks.
We used subtype G sequences from drug-naive patients and subtype G sequences for which PI treatment experience was (a) only nelfinavir; (b) only indinavir; and (c) any, but only one PI. A first set of Bayesian networks was built as described elsewhere from data sets including those mutations that were found to be associated with treatment using Fisher's exact test, correcting for multiple comparisons using the Benjamini and Hochberg method with a false discovery rate of 0.05. The most probable Bayesian network was obtained using a simulated annealing heuristic to search the space of all possible Bayesian network structures. Using the same methodology, a fourth Bayesian network using only genotypic information was built to confirm the direct interactions of position 89 with PI-associated mutations. This network included all mutations that were associated with M89I/V, again with Fisher's exact test and Benjamini and Hochberg correction, in the third data set described above (treatment experience of only one PI). In all these networks we investigated the connectivity around position 89 and report common robust features, with bootstrap support over 65%. This analysis was only done for subtype G, for which the most data were available.
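The multiple-comparison step mentioned above can be sketched as follows. This is a generic implementation of the Benjamini and Hochberg step-up procedure in plain Python, given for illustration only; the function name and the example P values are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, P values sorted in
    ascending order) with p_(k) <= (k / m) * fdr, then reject the k
    hypotheses with the smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical P values from per-mutation Fisher's exact tests.
selected = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205])
```

Mutations flagged True would be the ones retained as variables for network learning.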
Sequences with different combinations of drug-resistance mutations, with and without the M89I/V mutation, were selected for phenotypic testing (n = 96). These sequences belonged to different non-B subtypes (subtype G: n = 55, subtype F: n = 9, CRF02_AG: n = 18, subtype C: n = 9, URFs: n = 5). The samples were phenotyped using the Antivirogram assay (Virco BVBA, Mechelen, Belgium; Virco Lab, Inc., Durham, North Carolina, USA). The phenotypic results were analysed with multiple regression to assess whether 89I/V was associated with reduced drug susceptibility. A generalized linear model with a gamma distribution and a logarithmic link function provided a good fit for atazanavir and saquinavir. For the remaining drugs, none of the generalized regression models tested fitted the data: the residuals (the differences between the observed responses and the responses predicted by the regression model) deviated from normality according to the Shapiro–Wilk test (P < 0.05), even after a log10 transformation of the data.
Finally, the median phenotype values of groups of sequences with different mutations were compared for: (a) WT versus M89I/V only; (b) L90M only versus L90M+M89I/V. The statistical test used was the Kruskal–Wallis test, a non-parametric test that does not assume a normal distribution of the data. The complete statistical analysis was done using the R software package.
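For readers who want to see what the two-group comparison computes, here is a minimal Python sketch of the Kruskal–Wallis H statistic with tie correction, restricted to two groups so that the P value can be taken from the chi-squared distribution with one degree of freedom. The analysis itself was run in R; this function and the fold-change values in the example are hypothetical. With groups as small as those compared here, the chi-squared approximation is only rough.

```python
import math


def kruskal_wallis_two_groups(x, y):
    """Kruskal-Wallis H (tie-corrected) and chi-squared P value for two groups."""
    data = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    n = len(data)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and data[j + 1][0] == data[i][0]:
            j += 1
        # Tied values share the mean of the 1-based ranks i+1 .. j+1.
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sums = [0.0, 0.0]
    for (_, group), r in zip(data, ranks):
        rank_sums[group] += r
    h = 12 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(x) + rank_sums[1] ** 2 / len(y)
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h = max(h / correction, 0.0)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2))
    return h, p


# Hypothetical fold-change values: L90M only versus L90M + M89I/V.
h_stat, p_value = kruskal_wallis_two_groups([2.1, 3.4, 2.8, 4.0, 3.1, 2.5], [9.5, 12.0, 7.8])
```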
Description of the study population
The analysed dataset included 848 sequences derived from patients tested in the Hospital de Egas Moniz and 54 sequences from patients tested at the University Hospitals of Leuven, either for reasons of therapy failure or for baseline genotyping in drug-naive patients, as well as 52 pure subtype sequences from drug-naive patients, downloaded from the Los Alamos Database. The subtype distribution was as follows: A (n = 23), B (n = 438), C (n = 69), D (n = 15), F (n = 26), G (n = 341), H (n = 19), J (n = 10), and K (n = 13). The median age of the studied population was 36 years (maximum = 74; minimum = 0). A total of 38.6% of the population was born in Europe, 7.6% in Africa, 0.5% in South America and 0.5% in Asia. For 52.8% of the population, this information was not available. The male/female ratio was 2.06. Median viral load was 4.26 log10 copies/ml plasma (minimum = 2.33 log10 copies/ml; maximum = 6.29 log10 copies/ml) and the median CD4 cell count was 342 cells/μl blood (minimum = 1; maximum = 2520). PI experience at the time of genotyping was none (14.0%), indinavir (30.9%), ritonavir (13.7%), saquinavir (12.0%), nelfinavir (24.1%), amprenavir (1.7%), lopinavir/ritonavir (3.4%) and atazanavir (0.2%).
The phylogenetic tree analysis showed that the Portuguese subtype G sequences form several clusters spread among other clusters from different countries, indicating that the genetic diversity among the Portuguese subtype G strains is high enough to suggest that our results are applicable in general to subtype G (data not shown).
Association of mutations at codon 89 and therapy experience
In our dataset, including sequences from our database and reference sequences from the Los Alamos database (Table 1), leucine was the WT amino acid at protease codon 89 in subtypes B and D. Methionine was WT in subtypes A and G (100 and 96.2% of the drug-naive patients, respectively). For subtypes C, F, J and K, methionine was predominantly observed in therapy-naive patients (≥ 60%), although leucine was also found. The WT amino acid for subtype H was isoleucine. Amino acids glycine, proline, serine and threonine were also seen in therapy-naive patients, although only rarely.
We compared frequencies of individual amino acids at codon 89 between samples from treatment-naive patients and from those who experienced antiretroviral treatment failure. Amino acids isoleucine and valine were more frequent in patients failing a PI-containing regimen than in naive patients (Table 1). In subtype B, leucine was highly conserved and no difference was seen between the four categories. However, in subtype G, the mutations M89I and M89V, and in subtypes C and F, the mutation M89I, occurred significantly more often in patients failing a PI-containing regimen with major PI resistance mutations than in drug-naive patients (P < 0.01, P < 0.05 and P < 0.0001 for subtypes C, F and G, respectively). These mutations did not occur significantly more often in patients on treatment with a non-PI-based regimen or in patients failing a PI-containing regimen but without major resistance mutations. Owing to the low number of sequences from treated individuals analysed for subtypes A, D, H, J and K, no claims can be made for those subtypes.
Bayesian network analysis of genotypic associations with M89I/V
The data sets for Bayesian network learning included 128 sequences from PI-naive patients and 173 from PI-treated patients. The data sets included 17, 8 and 13 variables, respectively, corresponding to the presence of mutations that were found to be associated with nelfinavir, indinavir, or PI treatment in each data set. The prevalence of the 89I/V mutations was 4.7% in sequences from drug-naive patients and 48.4% (nelfinavir), 38.2% (indinavir), and 41% (PI) in sequences from treated patients. For both nelfinavir and indinavir, the data sets were too small to make firm claims on whether 89I/V was a primary or a secondary mutation, based on robust (bootstrap > 65%) unconditional dependencies on treatment. When all the data were pooled into a PI experience network, divergent selection for different primary mutations (depending on which PI had been used) made the network unreliable for identifying primary or secondary PI mutations. However, all three networks indicated robust unconditional dependencies between L90M and M89I/V, and two networks (for nelfinavir and PI) showed robust unconditional dependencies between M89I/V and mutations A71T/V and 74S. These interactions were confirmed in a network that was built from the PI data set and that included all mutations (and not just PI-treatment-associated mutations) that were found to be associated with M89I/V. The most probable network is shown in Fig. 1. The network indicated protagonistic interactions between M89I and PI-associated mutations 74S and 90M and PI-non-associated mutations 35G and 20T. At the same time, protagonistic interactions were found between M89V and PI-associated mutations 71T and 82T and PI-non-associated mutation 13A. Finally, an antagonistic interaction was found between 89M and PI-associated mutation L46I.
Interestingly, most of the residues indicated by the network (except codons 35, 46 and 82) are closely positioned to residue 89 in the protease three-dimensional (3D) structure (Fig. 2).
The multiple regression analysis, which aimed to identify the effect of M89I/V on the phenotype of the samples, was done using the glm function (from the R statistical package), with a gamma distribution and a logarithmic link function. For simplicity, only the positions at which drug-resistance mutations existed in our dataset were included in the regression equation as predictor variables. For each resistance codon, a value of 0 or 1 was assigned if a WT or a mutant amino acid, respectively, occurred at that position (WT being defined with respect to the subtype of the analysed sequence; for example, 89M is WT for subtype G). The distribution of the residuals was normal only for atazanavir and saquinavir, and only for saquinavir was the M89I/V mutation considered significant for the regression model (P = 2.954 × 10⁻¹²). In both cases, the value of the regression coefficient of M89I/V was negative, indicating a reduction of the fold change in 50% inhibitory concentration (FC) after the selection of this mutation. To get an idea of the effect of the mutation on the phenotypic results of the other drugs, we compared the FC of different groups: in samples with no other mutations (n = 51), the presence of M89I/V (n = 7) was always associated with a lower median phenotype value than in sequences without this mutation, except in the case of lopinavir. The Kruskal–Wallis test showed a statistically significant higher susceptibility to atazanavir in samples with M89I/V than in those without M89I/V (P = 0.04889), which is in accordance with the above-described regression results (Table 2). In samples with L90M (n = 9: n(L90M) = 6 and n(L90M+M89I/V) = 3), a reduction of the median FC in the presence of M89I/V was again observed for amprenavir, atazanavir, and indinavir; however, for nelfinavir, lopinavir and ritonavir the samples with L90M + M89I/V had a higher median FC than the samples with only L90M.
This decreased susceptibility of samples with M89I/V in the presence of the L90M mutation was only statistically significant for nelfinavir (P = 0.03806). For saquinavir, the median did not change (Table 2).
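The subtype-aware 0/1 coding used for the regression predictors can be illustrated with a short Python sketch. The WT residues at protease codon 89 are those reported in this paper (for subtypes C, F, J and K the predominant residue, methionine, is taken as WT); the function name and the optional fixed-reference argument are hypothetical and only serve to show how coding against a single subtype B consensus would differ.

```python
# WT amino acid at protease codon 89 per subtype, as reported in the text.
# Assumption: the predominant residue (M) is used for subtypes C, F, J and K.
WT_89 = {
    "A": "M", "B": "L", "C": "M", "D": "L", "F": "M",
    "G": "M", "H": "I", "J": "M", "K": "M",
}


def codon89_indicator(subtype, residue, reference=None):
    """Return 0 if the residue is WT and 1 if it is mutant.

    By default WT is defined per subtype; passing `reference` forces a
    single reference residue (e.g. "L" for consensus B) for every subtype.
    """
    wt = reference if reference is not None else WT_89[subtype]
    return int(residue != wt)


# Subtype G sequence carrying methionine at codon 89:
subtype_aware = codon89_indicator("G", "M")               # coded as WT
consensus_b = codon89_indicator("G", "M", reference="L")  # coded as mutant
```

Against a consensus B reference, both 89M and 89I in a subtype G sequence are coded as 1, so the treatment-selected change from M to I/V cannot be distinguished from the natural polymorphism.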
These results reinforce our findings above and suggest a role for the M89I/V mutation (in subtypes C, F and G) as a secondary mutation with a marked effect on susceptibility to nelfinavir when associated with L90M.
Our results lead to the conclusion that the amino acids isoleucine and valine at protease codon 89 are selected under PI pressure in HIV-1 subtypes C, F and G but not in subtype B. In subtypes C, F and G, the mutation should be called M89V/I, rather than L89V/I, to reflect the fact that methionine is the WT amino acid at this codon in those subtypes. We would like to stress that the linkage of a mutation at the PRO89 codon with PI therapy failure would have gone unnoticed had we only discriminated between the consensus B WT and any other amino acid; the position would simply have stood out as polymorphic. This may account for the failure to identify position 89 as a therapy-associated position in a recent study of mutations selected after therapy in a large database of non-B sequences. The selection of the 89I/V mutations in subtypes C, F and G seems to be a consequence of the WT 89M: there is a lower genetic barrier for methionine (AUG) to mutate to isoleucine (AUH) (three possible substitutions, one transition and two transversions, at the third codon position) or valine (GUN) (one transition possible at the first codon position), than for leucine (CUN or UUR) to mutate to isoleucine or valine (one transversion possible at the first codon position in both cases). Still, this cannot explain why 89M is substituted by 89I/V but 89L is invariable under treatment. We have evidence that, at least for resistance to nelfinavir, 89L is the most advantageous amino acid for the virus. When it is the WT, it supports both the 30N and 90M mutational pathways (manuscript in preparation). However, when for some reason (which might be the polymorphisms present in subtype G that were shown to be associated with M89I/V: Fig. 1) the protease does not tolerate a mutation to 89L, it selects 89I/V, which leads to a different mutational pathway under PI therapy, dominated by 90M, 71T and 74S instead of by 30N.
The conclusion of Kantor's study, that resistance development in non-B subtypes uses primarily the same mutations as for subtype B, is thus weakened, since the analysis was strongly biased against finding differences arising by mutation from different WT codons.
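The codon-level argument about the genetic barrier at position 89 can be checked by enumeration. The Python sketch below lists every single-nucleotide substitution that converts a codon for one amino acid into a codon for another, using the standard genetic code, and labels each as a transition or a transversion. It is an illustration only; the function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def substitution_type(old, new):
    """Transition if both bases are purines or both pyrimidines, else transversion."""
    return "transition" if (old in PURINES) == (new in PURINES) else "transversion"


def single_step_paths(from_aa, to_aa):
    """All one-nucleotide changes turning a from_aa codon into a to_aa codon.

    Returns tuples (codon, mutant codon, 1-based codon position, type).
    """
    paths = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == to_aa:
                    paths.append(
                        (codon, mutant, pos + 1, substitution_type(codon[pos], base))
                    )
    return paths


met_to_ile = single_step_paths("M", "I")
met_to_val = single_step_paths("M", "V")
leu_to_ile = single_step_paths("L", "I")
leu_to_val = single_step_paths("L", "V")
```

The enumeration reproduces the counts given above: from methionine, isoleucine is reachable by one transition or two transversions at the third position and valine by one transition at the first position, whereas from any leucine codon both isoleucine and valine are reachable only by a transversion at the first position.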
In addition, our results suggest that this mutation is directly associated with other major PI mutations, pointing to a role as a secondary or compensatory mutation. The associations were further analysed for subtype G through Bayesian network analysis, which allows all dependencies observed in the data to be summarized. Some of the strongest interactions of the M89I/V mutations were with well-known PI resistance mutations at PRO46, PRO71, PRO82 and PRO90. Once again, this suggests that M89I/V is a secondary mutation. Furthermore, the network suggested interactions with mutation 74S, previously described only as associated with lopinavir resistance. Interestingly, almost all of the interactions found are with residues closely positioned to residue 89 in the protease 3D structure (Fig. 2). Codon 89 is located in an α-helix important for the closing of the flaps and the subunit rotation upon binding of the substrate [20–22]. It is of interest to note that, based on 3D modelling analysis, codon 89 was identified in 2001 as a position of potential interest for drug resistance because the interaction of the amino acid at this position could possibly be different for drugs than for the substrate. However, the authors could not show any implication of codon 89 in resistance upon analysis of subtype B sequences. Our analysis suggests that this codon is indeed important, as anticipated, but only in the context of some non-B subtypes.
Calazans et al. have recently suggested that the L89M polymorphism in subtype F causes resistance to PIs. However, this conclusion is corroborated by only two of the four isolates that they analysed. On the other hand, they suggest that the L89I mutation does not have a phenotypic effect and that it is probably only due to the G→A hypermutation rate. In contrast to Calazans et al., our results of the phenotypic analysis of clinical samples suggest that M89I/V has an important effect on phenotype in the analysed subtypes. Whereas it seems to have a sensitizing effect on drug susceptibility when present in otherwise WT strains, in samples with L90M the addition of M89I/V caused a statistically significant increase in nelfinavir resistance. These results suggest a role of M89I/V as a secondary PI mutation with phenotypic impact only in the presence of primary PI mutations, at least for subtypes C, F and G, similar to the effect of the currently known secondary PI mutations. Therefore, the presence of 89I/V should be considered in algorithms for the interpretation of drug-resistance testing.
The authors would like to sincerely thank Antónia Turkman and Wendim Ghidey for helpful advice on the statistical methodology used.
Sponsorship: A.A. was supported by Fundação para a Ciência e Tecnologia (Grant no. SFRH/BD/19334/2004), Associação Portuguesa para o Estudo Clínico da SIDA (APECS) and Gilead Sciences Lda. K.D. was funded by a PhD grant of the Institute for the Promotion of Innovation through Sciences and Technology in Flanders (IWT). This work was supported in part by FWO-Vlaanderen grant G.0266.04, and by the Katholieke Universiteit Leuven through Grant OT/04/43.
Presentation of preliminary results: A. Abecasis, et al. A novel mutation selected by protease inhibitor therapy in subtype G, but not in subtype B-infected patients. XII International HIV Drug Resistance Workshop, June 2003, Los Cabos, Mexico.
1. Vandamme AM, Sonnerborg A, Ait-Khaled M, Albert J, Asjo B, Bacheler L, et al. Updated European recommendations for the clinical use of HIV drug resistance testing. Antivir Ther 2004; 9:829–848.
2. Hirsch MS, Brun-Vezinet F, Clotet B, Conway B, Kuritzkes DR, D'Aquila RT, et al. Antiretroviral drug resistance testing in adults infected with human immunodeficiency virus type 1: 2003 recommendations of an International AIDS Society-USA Panel. Clin Infect Dis 2003; 37:113–128.
3. Gomes P, Diogo I, Gonçalves MF, Carvalho P, Cabanas J, Camacho R. Different pathways to nelfinavir genotypic resistance in HIV-1 subtypes B and G. Ninth Conference on Retroviruses and Opportunistic Infections, Seattle, WA, February 2002 [abstract 46].
4. Kantor R, Zijenah LS, Shafer RW, Mutetwa S, Johnston E, Lloyd R, et al. HIV-1 subtype C reverse transcriptase and protease genotypes in Zimbabwean patients failing antiretroviral therapy. AIDS Res Hum Retroviruses 2002; 18:1407–1413.
5. Ariyoshi K, Matsuda M, Miura H, Tateishi S, Yamada K, Sugiura W. Patterns of point mutations associated with antiretroviral drug treatment failure in CRF01_AE (subtype E) infection differ from subtype B infection. J Acquir Immune Defic Syndr 2003; 33:336–342.
6. Kantor R, Katzenstein DA, Efron B, Carvalho AP, Wynhoven B, Cane P, et al. Impact of HIV-1 subtype and antiretroviral therapy on protease and reverse transcriptase genotype: results of a global collaboration. PLoS Med 2005; 2:112.
7. Hsu LY, Subramaniam R, Bacheler L, Paton NI. Characterization of mutations in CRF01_AE virus isolates from antiretroviral treatment-naive and -experienced patients in Singapore. J Acquir Immune Defic Syndr 2005; 38:5–13.
8. Brenner B, Turner D, Oliveira M, Moisi D, Detorio M, Carobene M, et al. A V106M mutation in HIV-1 clade C viruses exposed to efavirenz confers cross-resistance to non-nucleoside reverse transcriptase inhibitors. AIDS 2003; 17:F1–F5.
9. Caride E, Hertogs K, Larder B, Dehertogh P, Brindeiro R, Machado E, et al. Genotypic and phenotypic evidence of different drug-resistance mutation patterns between B and non-B subtype isolates of human immunodeficiency virus type 1 found in Brazilian patients failing HAART. Virus Genes 2001; 23:193–202.
10. Gonzales MJ, Machekano RN, Shafer RW. Human immunodeficiency virus type 1 reverse-transcriptase and protease subtypes: classification, amino acid mutation patterns, and prevalence in a northern California clinic-based population. J Infect Dis 2001; 184:998–1006.
11. Montes B, Vergne L, Peeters M, Reynes J, Delaporte E, Segondy M. Comparison of drug resistance mutations and their interpretation in patients infected with non-B HIV-1 variants and matched patients infected with HIV-1 subtype B. J Acquir Immune Defic Syndr 2004; 35:329–336.
12. Abecasis A, Gomes P, Derdelinckx I, Carvalho AP, Diogo I, Gonçalves F, et al. A novel mutation selected by protease inhibitor therapy in subtype G, but not in subtype B infected patients. Preliminary results. XII International HIV Drug Resistance Workshop, Mexico, June 2003 [abstract 123].
13. Calazans A, Brindeiro R, Brindeiro P, Verli H, Arruda MB, Gonzalez LM, et al. Low accumulation of L90M in protease from subtype F HIV-1 with resistance to protease inhibitors is caused by the L89M polymorphism. J Infect Dis 2005; 191:1961–1970.
14. Vandamme A-M, Witvrouw M, Pannecouque C, Balzarini J, Van Laethem K, Schmit J-C, et al. Evaluating clinical isolates for their phenotypic and genotypic resistance against anti-HIV drugs. In: Kinchington D, Schinazi RF, editors. Methods in molecular medicine: antiviral methods and protocols. Totowa, NJ: Humana Press Inc; 1999. pp. 223–258.
15. Snoeck J, Van Dooren S, Van Laethem K, Derdelinckx I, Van Wijngaerden E, De Clercq E, et al. Prevalence and origin of HIV-1 group M subtypes among patients attending a Belgian hospital in 1999. Virus Res 2002; 85:95–107.
16. Johnson VA, Brun-Vezinet F, Clotet B, Conway B, D'Aquila RT, Demeter LM, et al. Update of the drug resistance mutations in HIV-1: 2004. Top HIV Med 2004; 12:119–124.
17. Deforche K, Camacho R, Grossman Z, Soares MA, Shafer RW, Van Laethem K, et al. Applying Bayesian networks to study nelfinavir resistance pathways in subtypes A, B, C, F and G. Third European Drug Resistance Workshop, Athens, Greece, March–April 2005 [abstract 6].
18. Heckerman D. A tutorial on learning with Bayesian networks. In: Jordan M, editor. Learning in graphical models. Cambridge, Massachusetts: MIT Press; 1999. pp. 301–354.
19. R Development Core Team (2004). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; URL http://www.R-project.org
20. Wang W, Kollman PA. Computational study of protein specificity: the molecular basis of HIV-1 protease drug resistance. Proc Natl Acad Sci USA 2001; 98:14937–14942.
21. Velazquez-Campoy A, Todd MJ, Vega S, Freire E. Catalytic efficiency and vitality of HIV-1 proteases from African viral subtypes. Proc Natl Acad Sci USA 2001; 98:6062–6067.
22. Rose RB, Craik CS, Stroud RM. Domain flexibility in retroviral proteases: structural implications for drug resistant mutations. Biochemistry 1998; 37:2607–2621.