INTRODUCTION
HIV gains entry into a cell through its envelope protein, gp120, which binds to the human CD4 receptor and a coreceptor, either CXCR4 (X4 HIV) or CCR5 (R5 HIV).1 More advanced disease progression is often associated with CXCR4 tropism and detectable X4 viral load.1 Further, with the emergence of the CCR5-antagonist antiretroviral drug class (eg, maraviroc2), coreceptor usage has become more clinically relevant because the efficacy of these drugs depends on the patient having R5 HIV.3,4 Many tests are available to screen for tropism, each with its own advantages and disadvantages.5 The Trofile coreceptor assay (Monogram Biosciences)6 and its more sensitive adaptation, ESTA,7 are most commonly used.
Genotypic screening methods for determining coreceptor usage have the potential to be faster, less expensive, and more easily standardized than current phenotypic methods.8 This approach is possible because viral tropism is reflected in the genetic sequence of the gp160 protein, with its third variable domain, or V3 loop, being particularly predictive of coreceptor usage.9 Bioinformatic algorithms use V3 sequence data to predict coreceptor phenotype10,11; however, standard population-based V3 sequencing may lack sensitivity for minority X4 HIV.12
We sought to improve detection of X4 HIV using a number of genotypic tropism methods and compare them to Trofile assay results. Standard population-based sequencing and “deep” sequencing were performed on triplicate amplifications of the HIV V3 region. Amplifications were made from both viral RNA in plasma and from integrated proviral DNA in peripheral blood mononuclear cells (PBMCs), where plasma viral load was undetectable. Tropism was inferred using bioinformatic algorithms, and the results from these various genotypic methods were compared with those of the Trofile assay. We then validated our approach in an independent dataset of screening samples from Pfizer's Maraviroc versus Optimized Therapy in Viremic Antiretroviral Treatment-Experienced (MOTIVATE) studies of maraviroc.
METHODS
Cohort Description and Patients
V3 loop sequence variation was assessed in samples from a cohort of antiretroviral-naive chronically infected individuals initiating antiretroviral therapy. The primary study group represents a subset (n = 63 patients) of the well-characterized HAART Observational Medical Evaluation and Research (“HOMER”) cohort.13 Individuals were included in the present study by convenience, based on the availability of a peripheral blood sample for polymerase chain reaction (PCR) amplification and a documented Trofile assay result.14 Ethical approval was granted by the Providence Health Care/University of British Columbia Research Ethics Board.
Extraction and Population-Based Sequencing
HIV RNA was extracted from previously frozen plasma samples, and HIV DNA was extracted from buffy coat samples, both using a NucliSENS easyMAG (bioMerieux). Both RNA sequencing methods (ie, population-based and “deep”) followed the same procedures up to and including first round PCR, but differed in later steps, such as using different second round PCR primers. The region encoding the HIV V3 loop was amplified independently in triplicate by nested reverse transcriptase-polymerase chain reactions set up simultaneously from extracts using a multichannel pipette; the additional effort is minimal compared with a single PCR. Triplicates were performed as a compromise between the potentially increased probability of amplifying X4 HIV and a procedure that is clinically feasible for a technician to perform on a single PCR plate. Sequencing was performed in the 5' and 3' directions on an ABI 3730 automated sequencer as previously described.15 All primers and thermal cycler protocols for all methods can be viewed online (see Supplemental Digital Content 1, https://links.lww.com/QAI/A36 ).
“Deep” sequencing on the Roche/454 Life Sciences “Genome Sequencer-FLX” (GS-FLX) is a sensitive sequencing technique able to detect low-frequency subpopulations of virus and generate thousands of sequences from a given sample.16,17 Second round PCR primers were designed with fusion primers to fuse to the emulsion PCR beads required by the 454 technique. Also included were 12 unique multiplex “barcode” sequence tags to enable the identification of samples after the sequencing was complete. After PCR amplification, the concentrations of the PCR products were quantified using a Quant-iT Picogreen dsDNA Assay Kit (Invitrogen) and a DTX 880 Multimode Detector (Beckman Coulter, Brea, CA). Triplicate PCRs were then combined in equal proportions (2 × 10¹² DNA amplicons from each triplicate sample), purified with Agencourt Ampure PCR Purification beads (Beckman Coulter, Brea, CA), and requantified. This DNA “library” was then diluted to a concentration of 2 × 10⁵ molecules per millilitre, and combined at a ratio of 0.6 molecules:1 DNA capture microbead. Emulsion PCR was performed, and the DNA and beads were washed, purified and prepared for pyrosequencing according to the manufacturer's instructions.
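The role of the multiplex barcode tags can be illustrated with a minimal sketch of read demultiplexing. This is not the pipeline used in the study: the barcode strings, sample names, and the `demultiplex` function are hypothetical, and the sketch assumes each read begins with an exact, error-free copy of its tag.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map for illustration only; the 12 tags
# actually used are described in the supplemental content, not here.
BARCODES = {
    "ACGAGTGCGT": "sample_01",
    "ACGCTCGACA": "sample_02",
}


def demultiplex(reads, barcodes=BARCODES):
    """Group reads by their leading barcode tag and strip the tag.

    Reads that do not start with any known tag go to "unassigned".
    Assumes exact tag matches (no sequencing errors in the tag).
    """
    by_sample = defaultdict(list)
    for read in reads:
        for tag, sample in barcodes.items():
            if read.startswith(tag):
                by_sample[sample].append(read[len(tag):])
                break
        else:
            # No tag matched this read.
            by_sample["unassigned"].append(read)
    return dict(by_sample)
```

A production pipeline would typically tolerate a small number of mismatches in the tag; exact matching is used here only to keep the idea visible.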
The DNA beads were then added onto the 454 pyrosequencing plate (divided into 4 regions) at a density of 250,000 beads per region, as quantified with a Z1 Coulter Particle Counter (Beckman Coulter). The sequence amplified on each bead was determined by pyrosequencing on the GS-FLX.16,18 This process generated ∼200 base pairs of data in each direction per amplicon,19 with a typical V3 loop consisting of 105 base pairs (35 amino acids). Truncated reads (defined as sequences missing ≥4 bases at the 5' or 3' end) were not included in the analysis. In total, 12 HOMER plasma samples underwent “deep” sequencing with the GS-FLX.
Proviral DNA
Proviral HIV DNA V3 sequences were assessed in a similar manner in 26 X4/DM and 14 R5 HOMER subjects after pVL had become undetectable as a result of highly active antiretroviral therapy (HAART).15 Sample material consisted of PBMC from the buffy coat fraction of centrifuged whole blood. Patient tropism had previously been determined using Trofile before initiating HAART. Nested PCR was performed for bulk sequencing on the ABI 3730. “Deep” sequencing used the same first-round primers as bulk sequencing, with different second-round primers. Proviral DNA from a total of 12 buffy coat samples from the HOMER cohort underwent “deep” sequencing on the GS-FLX.
Sequence Analysis and Coreceptor Determination by Bioinformatic Algorithms
After sequencing on the ABI 3730, data were analyzed using the custom software RE_Call19 with no manual intervention. RE_Call has been shown to have ∼99% concordance with human calls.19 Nucleotide mixtures were automatically called if the secondary peak height exceeded 12.5% of the dominant peak height. Sequences were aligned to HIV-1 subtype B reference strain HXB2 (Genbank Acc. No. K03455) using a modified nucleotide-amino acid alignment program (NAP) algorithm.20 HIV tropism was predicted from V3 genotype using position-specific scoring matrices (PSSM_X4/R5)10 and/or geno2pheno[coreceptor] (g2p)11 scoring. Nongenotypic factors such as CD4+ cell count were not included in the bioinformatic analysis. Results were compared with the Trofile data as a reference. Standard sequencing replicates with PSSM values below the predetermined cutoff of −6.96 were called r5, whereas those with scores greater than or equal to −6.96 were called x4.10 Note that lowercase letters were used for these classifications to indicate that tropism had been inferred from genotypic data. The g2p method11 used a 5% false-positive rate, with samples also categorized as r5 or x4. Where sequence ambiguity occurred due to the presence of nucleotide mixtures, the permutation with the highest PSSM score was used to assign the score for a given replicate, to increase the sensitivity for detection of x4 variants.12 A similar system was used for g2p. Where triplicate data differed, the replicate with the strongest X4 signal (eg, the maximum PSSM score) was assigned to the sample. Thus, r5 samples had all 3 replicates inferred as r5, and x4 samples had at least 1 inferred as x4. For “deep” sequencing, each V3 variant detected received a tropism classification using the same 2 algorithms. This allowed the proportion of x4 virus within the sample to be determined, and samples were classified according to this parameter.
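The calling rules described above (expand nucleotide mixtures, keep the highest-scoring permutation per replicate, then take the maximum across replicates and compare it with the −6.96 cutoff) can be sketched as follows. This is an illustrative sketch, not the RE_Call or PSSM implementation: the function names are hypothetical, and `score_fn` stands in for a real PSSM scorer, which is not reproduced here.

```python
from itertools import product

PSSM_CUTOFF = -6.96  # cutoff from the text: scores >= cutoff are called x4

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_mixtures(seq):
    """Yield every unambiguous permutation of a sequence with mixtures."""
    for combo in product(*(IUPAC[base] for base in seq.upper())):
        yield "".join(combo)


def replicate_score(seq, score_fn):
    """Score one replicate as the maximum over all mixture permutations.

    score_fn is a placeholder for a real PSSM scoring function.
    """
    return max(score_fn(p) for p in expand_mixtures(seq))


def call_sample(replicate_scores, cutoff=PSSM_CUTOFF):
    """Call a sample x4 if any replicate reaches the cutoff, else r5."""
    return "x4" if max(replicate_scores) >= cutoff else "r5"
```

For example, `call_sample([-9.5, -6.96, -11.0])` returns `"x4"` because one replicate reaches the cutoff, whereas `call_sample([-9.5, -7.0, -11.0])` returns `"r5"`. Note that the number of permutations grows multiplicatively with each mixture position, so heavily mixed sequences can be expensive to expand.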
Sensitivity was defined as the proportion of samples with CXCR4 usage reported by the original Trofile assay (X4 or DM) that were also predicted to be x4 by the genotypic method.
Independent Validation
The performance of these methods was assessed in a blinded independent dataset (n = 278) from Pfizer screening samples (MOTIVATE studies). The MOTIVATE screening samples were amplified in triplicate and sequenced on the ABI 3730.
RESULTS
Standard Sequencing of V3 to Infer Tropism
Standard population-based sequencing of triplicate amplifications of the V3 loop, in combination with PSSM tropism inference, gave approximately 81% concordance with Trofile. Of samples called R5 by Trofile, 31 of 34 (91%) were also identified as r5 by standard sequencing. Of those called Dual-/Mixed-Tropic (DM) or X4 by Trofile, 20 of 29 (69%) were inferred as x4 (Fig. 1A). Often, there was notable variation among the 3 independent amplifications performed for each sample. Indeed, 12 of 63 samples (19%) had at least 1 replicate indicate a different tropism than the others. Almost half of Trofile X4/DM samples that were called x4 by standard sequencing had at least 1 amplification that would have been classified as r5 if the triplicate approach had not been used (9 of 20 samples, 45%). It is unknown what effect replicate testing of clinical samples would have had on Trofile assay results. Receiver operating characteristic (ROC) curves were plotted using the maximum score of the 3 replicates, and using a “singleton” approach based on only the first of the 3 replicates. The area under the ROC curve was 0.874 for the triplicate approach versus 0.828 for the singleton approach. Overall, comparing the result (r5 or x4) by standard sequencing with PSSM, and using the Trofile result as a reference, the sensitivity for the HOMER samples was 69% and specificity was 91%. In comparison, keeping specificity constant, a singleton approach would have given 48% sensitivity, and a duplicate approach, 59% sensitivity, relative to Trofile.
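The sensitivity, specificity, and concordance reported above follow directly from the counts given in the text, treating Trofile as the reference. The short sketch below reproduces the arithmetic; the function name `diagnostic_metrics` is an illustrative choice, not part of the study's software.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, concordance) from a 2x2 table.

    tp: Trofile X4/DM samples inferred x4
    fn: Trofile X4/DM samples inferred r5
    tn: Trofile R5 samples inferred r5
    fp: Trofile R5 samples inferred x4
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    concordance = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, concordance


# Counts from the text: 20 of 29 Trofile X4/DM samples were inferred x4,
# and 31 of 34 Trofile R5 samples were inferred r5.
sens, spec, conc = diagnostic_metrics(tp=20, fn=9, tn=31, fp=3)
print(f"{sens:.0%} {spec:.0%} {conc:.0%}")  # prints "69% 91% 81%"
```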
FIGURE 1: A, PSSM scores from standard sequencing of independent triplicate PCRs of V3 amplified from plasma HIV RNA: the horizontal axis represents the possible PSSM scores, with scores to the left of the dashed vertical line (−6.96) indicating r5 virus, and scores to the right of −6.96 indicating x4 virus. Samples are arranged by their Trofile screening result, with the Trofile X4/DM samples in the upper region of the figure and the Trofile R5 samples in the lower region. Closed circles indicate the PSSM score of each of 3 replicate amplifications of the V3 loop for different samples. Horizontal lines span the range of the 3 scores to give an indication of the diversity within a respective sample. Where 3 circles are not visible, this is either because of a failed amplification or because the points overlap because of similar or identical PSSM scores. B, PSSM score from HIV proviral DNA by standard sequencing: possible PSSM scores on the X-axis. Scores left of the dashed vertical line (−6.96) indicate r5 virus; scores to the right indicate x4 virus. Samples are arranged by their Trofile status. Note that the Trofile result is based on plasma RNA and not proviral DNA. Closed circles indicate the PSSM score of each V3 amplification for different samples. Horizontal lines span the range of the 3 scores for each sample. Where 3 circles are not visible, this is either because of a failed amplification or because the circles overlap due to similar or identical PSSM scores.
Proviral DNA to Infer Tropism
PBMC samples were retrieved from patients who were currently on HAART (without CCR5-antagonist medication), had undetectable plasma viral loads at the time of sampling and for whom a pretherapy plasma sample and Trofile assay result were available. Of 46 samples initially attempted, 40 samples yielded successful amplifications, giving an 87% amplification rate. Proviral DNA samples were amplified in triplicate, bulk sequenced on the ABI 3730, and inferred as r5 or x4 by PSSM (Fig. 1B). For samples called DM by Trofile from plasma RNA, 20 of 26 (77%) had evidence of x4 HIV DNA. A total of 10 of 14 samples (71%) called R5 by Trofile had r5 sequences in their corresponding proviral DNA, giving sensitivities and specificities of 77% and 71%, respectively, for PSSM; or 77% and 93%, respectively, for g2p (data not shown). The mean proviral DNA PSSM score of each sample was also correlated with the pretreatment RNA PSSM scores (r² = 0.35).
“Deep” Sequencing of HIV RNA and DNA
A subset of patients with matching plasma RNA and proviral DNA samples (n = 12) were sequenced using the GS-FLX (Table 1). The RNA was extracted from plasma samples drawn before initiation of antiretroviral therapy, whereas the DNA was extracted from buffy coat samples drawn after patients achieved pVL <50 copies per milliliter, after a median of 36.5 months (interquartile range: 30.5-39) on therapy. These samples were assessed according to the percentage of x4 virus comprising their “deep” sequencing results. A total of 4 of the 12 patients (33%) had very similar proportions of x4 virus in their plasma RNA and proviral DNA (within ∼1% of each other). For the remaining 8 samples, the percent x4 in RNA and proviral DNA differed by a range of 6%-72%, with proviral DNA tending to harbour a higher percentage of x4 variants (median 46% x4 in DNA vs. 8% in RNA).
TABLE 1: “Deep” Sequencing of HIV RNA and DNA Compared With Standard Sequencing and Trofile
Overall, the “deep” sequencing percent x4 from pretreatment plasma RNA and postsuppression proviral DNA were well correlated (r² = 0.44) and also corresponded very well to the pretreatment Trofile results. Using RNA and DNA, respectively, 4 of 4 and 3 of 4 samples called R5 by Trofile had <2% x4 virus comprising their “deep” sequencing results, whereas 8 of 8 and 7 of 8 samples called DM by Trofile had >2% x4 virus comprising their “deep” sequencing results. Standard sequencing of RNA and DNA gave x4 calls in 11 of 12 samples (92%) that had ≥20% x4 by “deep” sequencing, and gave r5 calls in 9 of 12 (75%) samples with <20% x4, consistent with the typical sensitivity of standard sequencing in reliably detecting minority species. Indeed, the presence of low-level (<20%) x4 variants could explain 3 of 4 (75%) Trofile-DM samples, which were apparently misclassified as r5 by standard sequencing.
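The percent x4 figure used for “deep” sequencing can be sketched as a read-weighted proportion of variants whose PSSM score reaches the cutoff, followed by comparison with the 2% threshold. This is a minimal sketch under stated assumptions: the function names and the `(score, read_count)` input layout are hypothetical, and treating a sample at exactly 2% as x4 is an assumption, since the text reports only results below and above 2%.

```python
PSSM_CUTOFF = -6.96        # per-variant cutoff from the text
X4_PERCENT_CUTOFF = 2.0    # sample-level threshold discussed in the text


def percent_x4(variants, cutoff=PSSM_CUTOFF):
    """Return the percentage of reads belonging to x4-scored variants.

    variants: list of (pssm_score, read_count) pairs, one per V3 variant.
    """
    total = sum(count for _, count in variants)
    if total == 0:
        raise ValueError("no reads to classify")
    x4_reads = sum(count for score, count in variants if score >= cutoff)
    return 100.0 * x4_reads / total


def call_deep_sample(variants, percent_cutoff=X4_PERCENT_CUTOFF):
    """Call a sample x4 if its percent x4 reaches the threshold.

    Assumption: a sample at exactly the threshold is called x4.
    """
    return "x4" if percent_x4(variants) >= percent_cutoff else "r5"
```

With 970 reads from an r5-scored variant and 30 reads from an x4-scored variant, `percent_x4` gives 3.0 and the sample is called x4; with 990 and 10 reads it gives 1.0 and the sample is called r5.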
Independent Validation
The sensitivity and specificity of the current approach were ascertained on a blinded independent sample set (n = 278) from the Pfizer MOTIVATE trials, which tested maraviroc in treatment-experienced individuals. A previous attempt by our laboratory to determine tropism by standard sequencing (not in triplicate) with PSSM methods had only 24% sensitivity with 97% specificity when compared with Trofile.12 The independent validation of our current method with bioinformatic analysis using PSSM yielded substantially increased sensitivity (75%) with only a modest decline in specificity (83%). Compared with PSSM, g2p methods yielded a slightly worse sensitivity (61%) but with improved specificity (93%), though there was limited power to distinguish either algorithm as superior.
DISCUSSION
This analysis of the tropism of clinical samples using standard population-based sequencing and “deep” sequencing of the HIV V3 region shows higher sensitivity for detecting CXCR4-using virus in samples than previously achieved by our group.12 Of additional significance was the use of proviral DNA to infer viral tropism in treated patients with undetectable plasma viral loads. “Deep” sequencing also seemed to be a good predictor of Trofile results, with a cutoff of 2% x4 giving good concordance with the original Trofile. Relatively few studies have used proviral DNA21 or “deep” sequencing22 to infer tropism. Genotypic tropism testing from proviral DNA suggests the possibility of screening for those with suppressed pVL who may wish to switch to CCR5 antagonists for reasons such as tolerability, whereas the Trofile assay requires a pVL >1000 copies per milliliter.7 With the above outlined approach, most patients harbouring X4 virus can be quickly screened out.
The improved results of the current study may be attributable to a number of factors, especially triplicate amplification, better sequencing technology, and automatic base-calling. Independent triplicate amplifications may be able to amplify a greater proportion of minority species due to the inherently stochastic nature of PCR, as evidenced by the larger area under the ROC curve and better sensitivity compared with a “singleton” approach, and by the variability amongst the replicates, with ∼20% of samples yielding replicates with different inferred coreceptor usage. The sequencing hardware and chemistry have improved, with the ABI 3730 used for the current study, and either the ABI 3100 or 3700 used for the earlier study; this may have yielded higher quality sequence data. The GS-FLX also represents a further advance in sequencing technology. An additional advantage of the current method is the automated nature of the sequence analysis. Because analysis was performed by RE_Call, sequence data underwent no manual intervention such as “base-calling”, which made this method quick and efficient while bypassing the inherently inconsistent and labour-intensive process of manual sequence analysis by a technician.
Some limitations of the sample population should be noted. The first 63 available samples as organized by sample ID were arbitrarily chosen, resulting in a potentially unknown selection bias. Further, the clinical test set from British Columbia is composed of 97.5% clade-B virus, thus skewing the results in favor of methods trained primarily on clade-B.23,24 The PSSM algorithm used here may not be readily extendable to non-clade B sequences. It should also be noted that the original Trofile assay was used for these analyses but not the enhanced sensitivity Trofile assay, “ESTA”. Our results may have differed if these genotypic methods were compared with “ESTA”; for instance, some of the Trofile R5 samples may have yielded X4/DM results if tested by ESTA, which may have decreased specificity. The “true” sensitivity of genotypic tropism testing is confounded by the “gold standard” against which these tests are compared. Numerous studies have compared a variety of genotypic and phenotypic tests, each yielding varying sensitivities and specificities (eg,25-27). Concordance even between phenotypic tropism assays is not necessarily 100% (eg,26). Furthermore, depending on the tests used, genotypic sensitivity for X4 variants has ranged from as low as 10% for genotypic predictors, such as the 11/25 or charge rule,26 to ∼70% for support vector machines25 and g2p.26 Even the same algorithm used on different datasets can yield vastly different sensitivities.27 Because of the wide range of sensitivities reported from genotypic testing, our results should be taken in this context.
Most importantly, using the Trofile call as the reference may be problematic. Ultimately, the best indication against which results should be compared is the virological outcome of patients who receive CCR5-antagonist medication. Clinical outcome, and not other assays, is the best candidate for the “gold standard” of comparison.28
REFERENCES
1. Berger EA, Murphy PM, Farber JM. Chemokine receptors as HIV-1 coreceptors: roles in viral entry, tropism, and disease.
Annu Rev Immunol . 1999;17:657-700.
2. Dorr P, Westby M, Dobbs S, et al. Maraviroc (UK-427,857), a potent, orally bioavailable, and selective small-molecule inhibitor of chemokine receptor CCR5 with broad-spectrum anti-human immunodeficiency virus type 1 activity.
Antimicrob Agents Chemother . 2005;49:4721-4732.
3. Westby M, Lewis M, Whitcomb J, et al. Emergence of CXCR4-using human immunodeficiency virus type 1 (HIV-1) variants in a minority of HIV-1-infected patients following treatment with the CCR5 antagonist Maraviroc is from a pretreatment CXCR4-using virus reservoir.
J Virol . 2006;80:4909-4920.
4. Perno CF, Moyle G, Tsoukas C, et al. Overcoming resistance to existing therapies in HIV-infected patients: the role of new antiretroviral drugs.
J Med Virol . 2008;80:565-576.
5. Coakley E, Petropoulos CJ, Whitcomb JM. Assessing chemokine co-receptor usage in HIV.
Curr Opin Infect Dis . 2005;18:9-15.
6. Whitcomb JM, Huang W, Fransen S, et al. Development and characterization of a novel single-cycle recombinant-virus assay to determine human immunodeficiency virus type 1 coreceptor tropism.
Antimicrob Agents Chemother . 2007;51:566-575.
7. Reeves JD, Coakley E, Petropoulos CJ, et al. An enhanced-sensitivity Trofile™ assay.
J Viral Entry . 2009;3:94-102.
8. Lengauer T, Sander O, Sierra S, et al. Bioinformatics prediction of HIV coreceptor usage.
Nat Biotechnol . 2007;25:1407-1410.
9. Briggs D, Tuttle D, Sleasman J, et al. Envelope V3 amino acid sequence predicts HIV-1 phenotype (co-receptor usage and tropism for macrophages).
AIDS . 2000;14:2937-2939.
10. Jensen MA, Li FS, van't Wout AB, et al. Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of human immunodeficiency virus type 1 env V3 loop sequences.
J Virol . 2003;77:13376-13388.
11. Sing T, Low AJ, Beerenwinkel N, et al. Predicting HIV co-receptor usage based on genetic and clinical covariates.
Antivir Ther . 2007;12:1097-1106.
12. Low AJ, Dong W, Chan D, et al. Current V3 genotyping algorithms are inadequate for predicting X4 co-receptor usage in clinical isolates.
AIDS . 2007;21:F17-F24.
13. Hogg RS, Yip B, Chan KJ, et al. Rates of disease progression by baseline CD4 cell count and viral load after initiating triple-drug therapy.
JAMA . 2001;286:2568-2577.
14. Brumme ZL, Goodrich J, Mayer HB, et al. Molecular and clinical epidemiology of CXCR4-using HIV-1 in a large population of antiretroviral-naive individuals.
J Infect Dis . 2005;192:466-474.
15. Brumme ZL, Dong WWY, Yip B, et al. Clinical and immunological impact of HIV envelope V3 sequence variation after starting initial triple antiretroviral therapy.
AIDS . 2004;18:F1-F9.
16. Archer J, Braverman MS, Taillon BE, et al. Detection of low-frequency pretherapy chemokine (CXC motif) receptor 4 (CXCR4)-using HIV-1 with ultra-deep pyrosequencing.
AIDS . 2009;23:1209-1218.
17. Droege M, Hill B. The Genome Sequencer FLX™ System: longer reads, more applications, straight forward bioinformatics and more complete data sets.
J Biotechnol . 2008;136:3-10.
18. Bushman FD, Hoffman C, Ronen K, et al. Massively parallel pyrosequencing in HIV research.
AIDS . 2008;22:1411-1415.
19. Brooks JI, Woods C, Merks H, et al. Evaluation of a web-based automated sequence analysis tool to standardize HIV genotyping results. Presented at: 49th International Conference on Antimicrobial Agents and Chemotherapy; September 12-15, 2009; San Francisco, CA. Abstract.
20. Huang X, Zhang J. Methods for comparing a DNA sequence with a protein sequence.
Comput Appl Bio Sci . 1996;12:497-506.
21. Simmonds P, Zhang LQ, McOmish F, et al. Discontinuous sequence change of human immunodeficiency virus (HIV) type 1 env sequences in plasma viral and lymphocyte-associated proviral populations in vivo: implications for models of HIV pathogenesis.
J Virol . 1991;65:6266-6276.
22. Tsibris AMN, Korber B, Arnaout R, et al. Quantitative deep sequencing reveals dynamic HIV-1 escape and large population shifts during CCR5 antagonist therapy in vivo.
PLoS ONE . 2009;4:e5683.
23. Geretti AM, Harrison L, Green H, et al. Effect of HIV-1 subtype on virologic and immunologic response to starting highly active antiretroviral therapy.
Clin Infect Dis . 2009;48:1296-1305.
24. Geretti AM. HIV-1 subtypes: epidemiology and significance for HIV management.
Curr Opin Infect Dis . 2006;19:1-7.
25. Skrabal K, Low AJ, Dong W, et al. Determining human immunodeficiency virus coreceptor use in a clinical setting: degree of correlation between two phenotypic assays and a bioinformatic model.
J Clin Microbiol . 2007;45:279-284.
26. de Mendoza C, Van Baelen K, Poveda E, et al. Performance of a population-based HIV-1 tropism phenotypic assay and correlation with V3 genotypic prediction tools in recent HIV-1 seroconverters.
J Acquir Immune Defic Syndr . 2008;48:241-244.
27. Raymond S, Delobel P, Mavigner M, et al. Correlation between genotypic predictions based on V3 sequences and phenotypic determination of HIV-1 tropism.
AIDS . 2008;22:F11-F16.
28. Harrigan PR, McGovern R, Dong W, et al. Screening for HIV tropism using population-based V3 genotypic analysis: a retrospective virological outcome analysis using stored plasma screening samples from MOTIVATE-1. Presented at: 5th International AIDS Society Conference on HIV Pathogenesis, Treatment and Prevention; July 19-22, 2009; Cape Town, South Africa. Abstract.