Introduction
The rapid rate of evolutionary change of HIV has led to the successful application of a variety of phylogenetic approaches. Both within and among hosts, phylogenetic methods have reconstructed the HIV evolutionary history with great precision [1]. On an epidemiological scale, molecular phylogenetics have provided insights into the origin of HIV [2,3], its epidemic history [4-6], HIV migration, routes of infection [7] and HIV transmission among epidemiologically related patients [8,9]. At the intra-host level, insights into molecular evolutionary processes have become of particular medical interest. For example, phylogenetics have been applied to study HIV compartmentalization [10], drug resistance [11], disease progression [12] and to indicate reservoirs of dormant viruses [13]. A wealth of HIV sequence data has accumulated over time, making the immunodeficiency viruses the most data-rich group of organisms for evolutionary analyses [14]. Moreover, significant evolutionary change can be detected in molecular sequences sampled over time, through which HIV is now considered as a measurably evolving population [15]. This has spawned a lot of research effort including the unification of phylogenetics and population genetics [15,16].
Phylogenetic analysis cannot be reduced to merely reconstructing a tree topology; the phylogeny provides a powerful framework in which several hypotheses can be tested [17]. Among the various applications for HIV, forensic investigations have received special attention in the scientific literature. In such cases, the null hypothesis that viral sequences sampled from the presumed recipient(s) are not more closely related to the sequences from the alleged donor than to appropriate control sequences is tested using phylogenetic inference. Molecular investigations for forensic purposes were introduced with the well-known Florida dentist case in 1992 [9]. This study revealed that six patients became infected with HIV-1 while receiving care from an HIV-1-positive dentist [9]. These findings had major implications for an out-of-court settlement, and they raised considerable controversy and criticism upon publication [18-23]. Molecular evidence of HIV transmission was used for the first time in court in a Swedish rape case [24]. In Sweden, this case was soon followed by several other forensic investigations because, in that country, it is a criminal offence to deliberately transmit HIV or to expose someone to HIV infection without informing them about the risk (for a review, see [25]). Similar epidemiological investigations in various settings have also been conducted in other countries [e.g. 26,27], only a few of which were part of criminal investigations [28,29]. Recently, the first case was reported in which phylogenetic analysis was used as evidence in a United States criminal proceeding [30]. Almost all investigations of criminal cases with a clear a priori hypothesis involved a single donor-recipient transmission event. For forensic purposes, Leitner and Albert (2000) have outlined a procedure for reliable transmission chain reconstruction. This procedure includes double sampling of the patients under investigation, sampling a reasonable number of local controls based on epidemiological and subtype criteria, direct population sequencing of two gene regions and maximum-likelihood phylogenetic reconstruction [25].
Here, we report the molecular analysis of a possible HIV-1 transmission case, in which six females presumably became HIV infected subsequent to a sexual assault. In connection with professional activities in Europe, the suspect, who emigrated from Rwanda in 1992, was in contact with the victims for several periods between the beginning of 1993 and the end of 1995. For five of the victims, seroconversion could be determined between 1993 and 1995. At the time of the first sampling for the court investigation, the suspect was found to be HIV seropositive. We present here the molecular epidemiological investigation related to this case.
Materials and methods
Patient samples
At least two different blood or plasma samples were available for the suspect (S) and the six victims (VA, VB, VC, VD, VE, VF). Upon arrival, the patient samples were anonymously labelled and subsequently handled by two other researchers who independently amplified and sequenced different gene regions (see below). This strategy enabled us to exclude the possibility of laboratory error and to remove the potential for investigator bias. Local controls were collected from two local hospitals; approval from the ethical committee was received and the patients gave their informed consent. Control patients were selected based on the available epidemiological information from medical files of the suspect and the victims. The primary epidemiological criteria were the same geographical area (preferentially the same hospitals as attended by the victims), diagnosis of HIV infection between 1992 and 1996 (one year below or above the time range of the alleged transmission events), age-matching and risk group (heterosexual). In an attempt to include around 30 local controls, the patients were chosen to conform to at least one but preferentially all of these criteria. A heteroduplex mobility assay for several victim samples determined that these patients were all infected with HIV subtype A (F.B.-V., personal comm.). In addition, the subtype was confirmed using a preliminary analysis of partial gag sequences from the suspect and some of the victims (S.V.D., personal comm.). Therefore, as many local controls of subtype A as possible were also included, regardless of the primary epidemiological criteria. In addition to the collection of local controls, sequences that were most similar to those from the suspect and victims were retrieved from GenBank using BLAST [31].
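The control selection rule described above can be illustrated with a minimal sketch. The record fields (`same_area`, `diagnosis_year`, `age_matched`, `heterosexual_risk`, `subtype`) are hypothetical names introduced here for illustration only; the text does not describe how the patient files were encoded or how age-matching was defined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate control patient."""
    same_area: bool          # same geographical area as the victims
    diagnosis_year: int      # year of HIV diagnosis
    age_matched: bool        # age-matching, definition assumed to be external
    heterosexual_risk: bool  # risk group is heterosexual
    subtype: str             # HIV-1 subtype, e.g. "A" or "B"


def criteria_met(c: Candidate) -> int:
    """Count how many of the four primary epidemiological criteria hold."""
    return sum([
        c.same_area,
        1992 <= c.diagnosis_year <= 1996,
        c.age_matched,
        c.heterosexual_risk,
    ])


def eligible(c: Candidate) -> bool:
    """Subtype A controls are kept regardless; others need at least one criterion."""
    return c.subtype == "A" or criteria_met(c) >= 1


candidates = [
    Candidate(True, 1994, True, True, "B"),
    Candidate(False, 1990, False, False, "A"),
    Candidate(False, 1990, False, False, "B"),
]
# Prefer candidates that satisfy more criteria, as the text suggests.
selected = sorted((c for c in candidates if eligible(c)), key=criteria_met, reverse=True)
print(len(selected))
```

Sorting by the number of criteria met reflects the stated preference for controls that conform to all criteria, while still admitting those that meet only one.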
RNA/DNA extraction, polymerase chain reaction and DNA sequencing
Viral sequences from both plasma and cells were used for the molecular analyses. RNA was extracted from plasma using the Nuclisens isolation kit (Organon Teknika, Boxtel, The Netherlands). For genetic characterization of the pol gene, the extraction was performed according to the ViroSeq HIV-1 Genotyping System (Celera Diagnostics, Alameda, California, US). DNA was extracted from lymphocytes, whole blood or cell lysates, using the QIAamp DNA Blood Mini kit (Westburg, Leusden, The Netherlands). We performed polymerase chain reaction (PCR) and subsequent sequencing of two regions in the HIV-1 genome: the V2-V4 region of the env gene, and the protease (PRO) and partial reverse transcriptase (RT) of the pol gene.
For amplification of the V2-V4 region of the env gene, cDNA synthesis was performed using the GeneAmp RNA PCR kit (Applera, Nieuwerkerk a/d Ijssel, The Netherlands) according to the manufacturer's instructions. Amplification was performed using the primers PLA5917 [5′-ACA GAC CC(C/T) AAC CCA CAA GAA-3′] and AV311 [5′-CTA CTT TAT A(C/T)T TAT ATA ATT CAC TTC TCC-3′] for the first PCR and primers PLA6925 [5′-CTG CCA CAT (A/G)TT TA(C/T) AAT TTG-3′] and PLA6045 [5′-TGT AAA GTT AAC (C/T)CC TCT CTG-3′] for the nested PCR, yielding an 880-bp amplification fragment encoding amino acids 90-383 of gp120.
For amplification of the protease and partial reverse transcriptase, PCR was performed using the ViroSeq HIV-1 Genotyping System (Celera Diagnostics) according to the manufacturer's instructions. In case of failure of the commercial kit, an in-house procedure was used [32]. This resulted in a 1284-bp pol fragment covering the complete PRO gene (encoding AA 1 to AA 99) and part of the RT gene (encoding AA 1 to AA 329), a region generally used for resistance genotyping.
The PCR products were purified, prior to sequencing, using Microcon-100 microconcentrators (Celera Diagnostics) or the QIAquick PCR purification kit (Westburg) according to the manufacturer's instructions. Direct sequencing of the nested PCR products was performed on both DNA strands of all PCR fragments using the ViroSeq HIV-1 Genotyping System (Celera Diagnostics) or the ABI PRISM BigDye Terminator Ready Reaction Cycle Sequencing Kit (Applera), and the data were collected on an ABI PRISM 310 sequencer (Applera). The sequences were analysed using Sequence Analysis version 3.7 (Applera), and either ViroSeq HIV-1 Genotyping Software version 2.2 (Celera Diagnostics) or Factura version 1.2.0 and ABI PRISM Sequence Navigator version 1.0.1 (Applera).
Phylogenetic analysis
Sequences were aligned using CLUSTAL W [33] and manually edited according to their codon-reading frame in Se-Al (http://evolve.zoo.ox.ac.uk). Appropriate nucleotide substitution models were determined with Modeltest version 3.06 [34]. Different phylogenetic algorithms were used to examine the relationships between the suspect, the victims and control sequences. Maximum likelihood (ML) phylogenetic trees were reconstructed using three different heuristic searches: (a) in PAUP*(v4b10) using the nearest-neighbour-interchange, subtree-pruning-regrafting and tree-bisection-reconnection branch-swapping algorithms, starting from a neighbour-joining tree with optimized substitution model parameters [35]; (b) in Metapiga using the metapopulation genetic algorithm that implemented probability consensus pruning among four populations, each including four individuals [36,37]; (c) in PhyML using a hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously [38]. Bayesian inference was performed using Metropolis-coupled Markov-chain Monte Carlo (MCMCMC) sampling implemented in MrBayes (version 3.0) [39]. For both genes, four coupled chains were run for 25 × 10^6 generations (temp = 0.2); trees were sampled every 1000 generations. In addition, twelve coupled chains were run for 8 × 10^6 generations for pol and eight coupled chains were run for 8 × 10^6 generations for env. The burn-in was set at 10% of the sampled states. Distance-based trees were reconstructed using the minimum evolution objective in PAUP*(v4b10), implementing the same heuristic search procedure as the ML analysis. Statistical parsimony analyses were performed using the cladogram estimation procedure implemented in TCS (version 1.13) [40,41]. Depending on the tree-reconstruction method used, we assessed the reliability of the phylogenetic hypothesis using non-parametric or parametric bootstrapping.
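As a rough illustration of what the Bayesian sampling settings imply, the following sketch counts the trees discarded as burn-in and the trees retained for one run. It assumes the run length reads as 25 × 10^6 generations, ignores the initial state of the chain, and uses a helper name (`retained_samples`) that is introduced here and is not part of MrBayes.

```python
def retained_samples(generations: int, sample_every: int, burnin_fraction: float):
    """Return (samples discarded as burn-in, samples retained) for one MCMC run.

    Bookkeeping only: this does not run any MCMC and ignores the starting state.
    """
    sampled = generations // sample_every
    burnin = round(sampled * burnin_fraction)
    return burnin, sampled - burnin


# Four-chain run as read from the text: 25 x 10^6 generations (assumed reading),
# one tree sampled every 1000 generations, 10% burn-in.
burnin, kept = retained_samples(25_000_000, 1000, 0.10)
print(burnin, kept)
```

Under these assumptions, the posterior probabilities for the four-chain run would be summarized over the retained trees only.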
Results
Setup for the hypothesis testing
At least two different blood or plasma samples from the suspect and each victim were obtained and anonymously labelled. To test the hypothesis of HIV-1 transmission, control samples were selected from two local hospitals based on epidemiological and subtype criteria (see methods). The characteristics of this control group are represented in Table 1. The patients not complying with the temporal and geographical setting are immigrants, mainly originating from Rwanda. These patients were included because they represent controls that better match the suspect, who also emigrated from Rwanda in 1992. Based on subtyping by heteroduplex mobility analysis, local controls of subtype A were also selected regardless of the epidemiological criteria. Collectively, the criteria for inclusion yielded 39 local controls, which were also anonymously labelled.
We were able to determine a pol population sequence for all the control samples and for at least two samples of the suspect and victims, with the exception of victim VC (only one pol population sequence). In addition to local controls, we retrieved 45 sequences, most similar to the suspect and victim sequences according to BLAST [31], and 57 subtype reference sequences from GenBank. Two local controls were not included in further phylogenetic analyses since they were identified as recombinant in pol (Table 1). For the suspect and victim samples, at least one population sequence was determined in the partial env gene region. Because mainly database sequences were closely related to the pol sequences of the suspect and the victims (see below), and because of the technical difficulties associated with env population sequencing [42], we chose not to determine env sequences for the same number of local controls. To investigate the epidemiological relationship in this gene region, 78 HIV-1 subtype A sequences were retrieved from GenBank using a sequence similarity search.
Pol analysis
Figure 1 shows the evolutionary history reconstructed from the pol gene sequences using Bayesian inference. In this tree, the viruses carried by the suspect and the victims form a highly supported monophyletic cluster within subtype A (posterior probability = 1.0), suggesting epidemiological linkage. Different isolates from the same patient also cluster together within this transmission chain. The sequences most closely related to this cluster originate exclusively from Rwanda and Uganda. More generally, the closely related strains originate predominantly from Rwanda, Uganda, Kenya and Tanzania, neighbouring countries in East Africa. This is relevant information in this case, since the suspect also originates from Rwanda. Some of the local control sequences, selected because they were identified as subtype A (Table 1), fall into a subcluster that includes reference sequences representing the circulating recombinant form CRF01_AE. Different phylogenetic methods consistently inferred a monophyletic clustering of suspect and victim sequences distinct from local control and database sequences. The statistical support measures provided by these methods are summarized in Table 2. Both parametric and non-parametric bootstrapping provide significant support for the epidemiological relationship between suspect and victims (bootstrap support of 99-100%). The different measures in Table 2 all provide assessments of 'confidence' for the suspect-victim clade, and 1 - (bootstrap proportion) can be used as an estimate of the probability of type I error in a test of the a priori hypothesis [30]. Approximate posterior probabilities have sometimes been reported as overconfident, whereas non-parametric bootstrapping is usually considered a conservative approach [43-45].
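The relation between bootstrap support and the type I error estimate, 1 - (bootstrap proportion), can be sketched as follows. This is a toy illustration and not the procedure used by the phylogenetic programs above: each replicate tree is represented simply as a set of clades, each clade being a frozenset of tip labels, and the replicate data are invented for the example.

```python
def clade_support(replicate_trees, clade):
    """Fraction of replicate trees in which `clade` appears as a monophyletic group.

    Each tree is a set of clades; each clade is a frozenset of tip labels.
    """
    target = frozenset(clade)
    hits = sum(1 for tree in replicate_trees if target in tree)
    return hits / len(replicate_trees)


# Invented toy replicates: the case cluster is recovered in 99 of 100 trees.
case_cluster = {"S", "VA", "VB"}
tree_with_cluster = {frozenset(case_cluster), frozenset({"VA", "VB"})}
tree_without_cluster = {frozenset({"S", "C1"}), frozenset({"VA", "VB"})}
replicates = [tree_with_cluster] * 99 + [tree_without_cluster]

support = clade_support(replicates, case_cluster)
type_one_error_estimate = 1 - support
print(support, round(type_one_error_estimate, 2))
```

In practice the replicate trees come from re-analysing resampled (non-parametric) or simulated (parametric) alignments, which this sketch does not attempt to reproduce.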
To assess the potential impact of substitutions induced by drug selective pressure in pol, the same analyses were also performed after exclusion of 46 codon positions associated with antiretroviral resistance [46-48]. The suspect-victim cluster was still well supported by the remaining data (bootstrap support of 92-100%; Table 2).
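Excluding resistance-associated codons from a codon-aligned sequence can be sketched as below. The function name and the codon positions in the example are hypothetical; the 46 positions actually excluded are defined by the cited resistance guides [46-48] and are not listed in the text.

```python
def drop_codons(sequence: str, codons_to_drop) -> str:
    """Remove the given 1-based codon positions from an in-frame nucleotide sequence."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return "".join(
        codon
        for position, codon in enumerate(codons, start=1)
        if position not in codons_to_drop
    )


# Toy example with made-up positions: drop codons 2 and 4 of a four-codon sequence.
masked = drop_codons("ATGAAACCCGGG", {2, 4})
print(masked)
```

Applying the same set of positions to every sequence keeps the alignment columns consistent, so the trimmed alignment can be passed to the same tree-building analyses.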
In addition to 'standard' phylogenetic approaches, we also performed a statistical parsimony analysis using the intraspecific cladogram estimation procedure [40]. This procedure results in a network representation for population-level genealogical information with connections that have a pre-specified cumulative probability of being true. The application of the cladogram estimation procedure to the HIV-1 pol data resulted in the probability of parsimonious connections being supported (P ≥ 0.95) for variant sequences that differ by thirteen or fewer nucleotide substitutions. This statistical criterion allowed for only three networks that interconnect more than two sequences (Fig. 2). All sequences from the suspect and the victims are connected into a single parsimony network. No local controls or database sequences could be connected to this network with a cumulative probability above 0.95.
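The idea that sequences join a network only when they fall within a connection limit can be illustrated with a deliberately simplified sketch. This is not the TCS algorithm: TCS derives the limit from the 95% parsimony criterion and infers intermediate haplotypes, whereas the code below merely groups aligned sequences into components linked by pairs that differ at no more than `limit` sites. The toy sequences and the limit of 1 are invented for the example; the text reports a limit of thirteen substitutions for the pol data.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def networks(seqs, limit):
    """Group sequence names into components linked by pairs within `limit` differences."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the component representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= limit:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


toy = {"S": "AAAAAAAA", "VA": "AAAAAAAT", "VB": "AAAAAATT", "C1": "GGGGGGGG"}
result = networks(toy, limit=1)
print(sorted(sorted(group) for group in result))
```

In this toy case S and VB differ at two sites but still share a network because each is within the limit of VA, while the control C1 remains unconnected.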
Env analysis
A phylogenetic tree for the env sequences reconstructed using Bayesian inference is depicted in Fig. 3. Although the env analysis does not include the entire local control set, the results confirm several findings obtained for pol: (a) the sequences from the suspect and the victims make up a distinct cluster in the subtype A phylogeny (posterior probability = 1.0); (b) different sequences obtained for the same patient cluster together; (c) different phylogenetic algorithms provide high statistical support for the 'suspect-victim cluster' (bootstrap support of 95-100%, Table 2); (d) the majority of strains most closely related to the 'suspect-victim' cluster originate from Uganda, Rwanda and Kenya.
Discussion
In this study, we present the molecular analysis related to a criminal case of multiple HIV transmissions. Reconstruction of transmission histories in criminal investigations has been performed in the past and future cases are anticipated [25,49]. The high rate of evolution of HIV, which has turned successful treatment into a challenging endeavour, makes it possible to investigate epidemiological linkage between HIV-infected patients using phylogenetic inference. Based on analyses in the partial pol gene and the partial env gene, we tested whether the viruses sampled from six victims were more closely related to the virus found in the suspect than to any control.
Molecular analyses of HIV transmission are critically dependent on the set of local control sequences. These samples need to represent the virus circulating in the background. In this respect, the analysis in the pol region can be considered as a more appropriate hypothesis test than the analysis in the env region. However, several factors can complicate appropriate sampling of controls. In many European countries, immigration has brought new subtypes into the circulating HIV population [e.g. 50-53]. In our case, the genetic background changed dramatically, with a rise in non-B subtypes from 0% in 1983 to 57% in 2001 [54]. In fact, the suspect in the HIV transmission case was an emigrant from Rwanda. Although it was not known whether the suspect contracted HIV in his country of origin, this calls into question the suitability of local controls based on geographical criteria. However, our local control set also included sequences obtained from immigrants, of which the majority originated from Rwanda. The migration aspect also highlights the importance of including database controls and therefore, we also consider the env analysis as very meaningful. Interestingly, in both genome regions, database sequences and local control sequences sampled from immigrants are most closely related to the suspect-victim cluster. Moreover, these closely related strains mainly originate from neighbouring countries in East Africa, which is also the origin of the suspect.
In accordance with the recommendations for transmission chain reconstruction [25], we also selected local control sequences based on their subtype and we included database sequences based on a similarity search. From a statistical viewpoint, it might appear counterintuitive to select controls using genetic criteria in order to test a hypothesis based on genetic data. However, this strategy reflects the effort to reduce the type I error, which occurs when a 'true' null hypothesis is rejected (more specifically, when suspect and victims would wrongly cluster together in a tree). This agrees with the line of thought in the criminal court system, where a type I error is considered more serious than a type II error. Although this strategy will be at the expense of the type II error, simulation studies have shown that using a bootstrap technique with a relatively stringent significance level (≥ 80%) for sequences of reasonable length (≥ 500 nt) will result in a relatively low probability of incorrectly clustering controls within a transmission chain [49].
Evolution is mainly a stochastic process and by sampling particular genome regions, we hope to capture enough information to adequately recover evolutionary patterns. The appropriate marker in the HIV genome has been the subject of mainly theoretical discussions. For example, the use of the C2-V3 env region has been criticized because of possible convergent evolution, while the pol region might lack sufficient variability [21]. The analysis of a known HIV transmission chain has pointed out that we should mainly consider the information content of the genetic marker, which is a function of both sequence length and sequence variability [8,25,55]. Although the pol fragment is a conserved gene region in the HIV genome, we have obtained relatively long sequences to 'correct' for the amount of information. The pol fragment is a convenient marker since it is used to monitor drug resistance. Clinical laboratory procedures have been optimized to obtain such sequence data, even for different HIV-1 group M subtypes, and local databases are usually maintained. Moreover, a recent comprehensive database investigation has shown that for HIV-1 the pol gene, despite its conservation, contains sufficient information to allow phylogenetic reconstruction of transmission chains [48]. On the other hand, obtaining population sequences of considerable length for the complete HIV-1 subtype spectrum in the more variable env gene, especially with viral RNA as input material, is associated with several technical problems [42]; hence the difference in local control sampling between our pol and env analyses.
Although we have used various methods of phylogenetic reconstruction, a full discussion of phylogenetic inference is beyond the scope of this paper, especially since different methods agreed on the cluster of interest (see [56] for a comprehensive review of phylogenetic methods). In addition, analyses of a known transmission chain indicated that the accuracy of the reconstructed tree topology was more dependent on the amount of genetic information than the phylogenetic reconstruction method [8].
Independently of the results of phylogenetic analysis, there are limitations to the conclusions that can be drawn when testing the hypothesis of multiple transmissions. On the one hand, several complications could arise when patient and victim sequences do not all fall into a well-supported monophyletic cluster. Bootstrap analysis might fail to reveal significant associations when there is no clear a priori hypothesis for a well-defined set of patients [23]. In such cases, the question of transmission should be asked separately for each patient [23]. The local control group might also include the subject who infected the index case or an additional unidentified recipient, situations that can only be resolved by using additional information. On the other hand, when the relevant sequences do constitute a well-supported cluster, as in our case, the conclusions should be limited to the observation that the viral strains in the victims are more closely related to the virus in the suspect than to any control [55]. The possibility of another subject having infected both suspect and victims cannot be excluded and no inference can be made on the direction of transmission. In conclusion, we were able to reject the null hypothesis, but unfortunately, this cannot be easily translated into a statement on the probability of transmission.
Acknowledgements
Sponsorship: This work was supported by the Flemish Fonds voor Wetenschappelijk Onderzoek (FWO G.0288.01); P.L. was supported by the Flemish Institute for Promotion and Innovation through Science and Technology in Flanders (IWT-Vlaanderen).
References
1. Rambaut A, Posada D, Crandall KA, Holmes EC. The causes and consequences of HIV evolution. Nat Rev Genet 2004; 5:52-61.
2. Korber B, Muldoon M, Theiler J, Gao F, Gupta R, Lapedes A, et al. Timing the ancestor of the HIV-1 pandemic strains. Science 2000; 288:1789-1796.
3. Sharp PM, Bailes E, Robertson DL, Gao F, Hahn BH. Origins and evolution of AIDS viruses. Biol Bull 1999; 196:338-342.
4. Holmes EC, Nee S, Rambaut A, Garnett GP, Harvey PH. Revealing the history of infectious disease epidemics through phylogenetic trees. Philos Trans R Soc Lond B Biol Sci 1995; 349:33-40.
5. Lemey P, Pybus OG, Wang B, Saksena NK, Salemi M, Vandamme AM. Tracing the origin and history of the HIV-2 epidemic. Proc Natl Acad Sci USA 2003; 100:6588-6592.
6. Yusim K, Peeters M, Pybus OG, Bhattacharya T, Delaporte E, Mulanga C, et al. Using human immunodeficiency virus type 1 sequences to infer historical features of the acquired immune deficiency syndrome epidemic and human immunodeficiency virus evolution. Philos Trans R Soc Lond B Biol Sci 2001; 356:855-866.
7. Op de Coul EL, Prins M, Cornelissen M, van der Schoot A, Boufassa F, Brettle RP, et al. Using phylogenetic analysis to trace HIV-1 migration among western European injecting drug users seroconverting from 1984 to 1997. AIDS 2001; 15:257-266.
8. Leitner T, Escanilla D, Franzen C, Uhlen M, Albert J. Accurate reconstruction of a known HIV-1 transmission history by phylogenetic tree analysis. Proc Natl Acad Sci USA 1996; 93:10864-10869.
9. Ou CY, Ciesielski CA, Myers G, Bandea CI, Luo CC, Korber BT, et al. Molecular epidemiology of HIV transmission in a dental practice. Science 1992; 256:1165-1171.
10. Zhu T, Wang N, Carr A, Nam DS, Moor-Jankowski R, Cooper DA, et al. Genetic characterization of human immunodeficiency virus type 1 in blood and genital secretions: evidence for viral compartmentalization and selection during sexual transmission. J Virol 1996; 70:3098-3107.
11. Crandall KA, Kelsey CR, Imamichi H, Lane HC, Salzman NP. Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Mol Biol Evol 1999; 16:372-382.
12. Shankarappa R, Margolick JB, Gange SJ, Rodrigo AG, Upchurch D, Farzadegan H, et al. Consistent viral evolutionary changes associated with the progression of human immunodeficiency virus type 1 infection. J Virol 1999; 73:10489-10502.
13. Nickle DC, Jensen MA, Shriner D, Brodie SJ, Frenkel LM, Mittler JE, et al. Evolutionary indicators of human immunodeficiency virus type 1 reservoirs and compartments. J Virol 2003; 77:5540-5546.
14. Leigh Brown A. Methods of evolutionary analysis of viral sequences. In: Morse SS, editor. The Evolutionary Biology of Viruses. New York: Raven Press; 1994. pp. 75-84.
15. Drummond AJ, Pybus OG, Rambaut A, Forsberg R, Rodrigo AG. Measurably evolving populations. Trends Ecol Evol 2003; 18:481-488.
16. Drummond AJ, Nicholls GK, Rodrigo AG, Solomon W. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 2002; 161:1307-1320.
17. Posada D, Crandall KA, Hillis DM. Phylogenetics of HIV. In: Rodrigo AG, Learn GH, editors. Computational and Evolutionary Analysis of HIV Molecular Sequences. Dordrecht, The Netherlands: Kluwer Academic Publishers; 2000. pp. 121-160.
18. Abele LG, DeBry RW. Florida dentist case: research affiliation and ethics. Science 1992; 255:903.
19. DeBry RW, Abele LG, Weiss SH, Hill MD, Bouzas M, Lorenzo E, et al. Dental HIV transmission? Nature 1993; 361:691.
20. Smith TF, Waterman MS. The continuing case of the Florida dentist. Science 1992; 256:1155-1156.
21. Holmes EC, Brown AJ, Simmonds P. Sequence data as evidence. Nature 1993; 364:766.
22. Crandall KA. Intraspecific phylogenetics: support for dental transmission of human immunodeficiency virus. J Virol 1995; 69:2351-2356.
23. Hillis DM, Huelsenbeck JP. Support for dental HIV transmission. Nature 1994; 369:24-25.
24. Albert J, Wahlberg J, Leitner T, Escanilla D, Uhlen M. Analysis of a rape case by direct sequencing of the human immunodeficiency virus type 1 pol and gag genes. J Virol 1994; 68:5918-5924.
25. Leitner T, Albert J. Reconstruction of HIV-1 transmission chains for forensic purposes. AIDS Reviews 2000; 2:241-251.
26. Yirrell DL, Robertson P, Goldberg DJ, McMenamin J, Cameron S, Leigh Brown AJ. Molecular investigation into outbreak of HIV in a Scottish prison. BMJ 1997; 314:1446-1450.
27. Goujon CP, Schneider VM, Grofti J, Montigny J, Jeantils V, Astagneau P, et al. Phylogenetic analyses indicate an atypical nurse-to-patient transmission of human immunodeficiency virus type 1. J Virol 2000; 74:2525-2532.
28. Machuca R, Jorgensen LB, Theilade P, Nielsen C. Molecular investigation of transmission of human immunodeficiency virus type 1 in a criminal case. Clin Diagn Lab Immunol 2001; 8:884-890.
29. Birch CJ, McCaw RF, Bulach DM, Revill PA, Carter JT, Tomnay J, et al. Molecular analysis of human immunodeficiency virus strains associated with a case of criminal transmission of the virus. J Infect Dis 2000; 182:941-944.
30. Metzker ML, Mindell DP, Liu XM, Ptak RG, Gibbs RA, Hillis DM. Molecular evidence of HIV-1 transmission in a criminal case. Proc Natl Acad Sci USA 2002; 99:14292-14297.
31. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 1997; 25:3389-3402.
32. Vandamme AM, Witvrouw M, Pannecouque C, Balzarini J, Van Laethem K, Schmit JC, et al. Evaluating clinical isolates for their phenotypic and genotypic resistance against anti-HIV drugs. In: Kinchington D, Schinazi RF, editors. Antiviral Methods and Protocols. Totowa, New Jersey: Humana Press, Inc; 2000. pp. 223-258.
33. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl Acids Res 1994; 22:4673-4680.
34. Posada D, Crandall KA. MODELTEST: testing the model of DNA substitution. Bioinformatics 1998; 14:817-818.
35. Swofford DL. PAUP* 4.0 - Phylogenetic Analysis Using Parsimony (*and Other Methods). Sunderland, Massachusetts: Sinauer Associates; 1998.
36. Lemmon AR, Milinkovitch MC. The metapopulation genetic algorithm: An efficient solution for the problem of large phylogeny estimation. Proc Natl Acad Sci USA 2002; 99:10516-10521.
37. Lemmon AR, Milinkovitch MC. MetaPIGA (Phylogeny Inference using the MetaGA) version 1.0.2b. Distributed by the authors; 2002.
38. Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003; 52:696-704.
39. Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 2001; 17:754-755.
40. Templeton AR, Crandall KA, Sing CF. A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 1992; 132:619-633.
41. Clement M, Posada D, Crandall KA. TCS: a computer program to estimate gene genealogies. Mol Ecol 2000; 9:1657-1659.
42. Van Laethem K, Schrooten Y, Lemey P, Van Wijngaerden E, De Wit S, Van Ranst M, et al. A genotypic resistance assay for the detection of drug resistance in the human immunodeficiency virus type 1 envelope gene. J Virol Methods 2005; 123:25-34.
43. Erixon P, Svennblad B, Britton T, Oxelman B. Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Syst Biol 2003; 52:665-673.
44. Douady CJ, Delsuc F, Boucher Y, Doolittle WF, Douzery EJ. Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol Biol Evol 2003; 20:248-254.
45. Suzuki Y, Glazko GV, Nei M. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc Natl Acad Sci USA 2002; 99:16138-16143.
46. Shafer RW, Dupnik K, Winters MA, Eschleman SH. A guide to HIV-1 reverse transcriptase and protease sequencing for drug resistance studies. In: Sodroski J, editor. HIV Sequence Compendium. Los Alamos: Los Alamos National Laboratory; 2000. pp. 1-51.
47. Johnson VA, Brun-Vezinet F, Clotet B, Conway B, D'Aquila RT, Demeter LM, et al. Drug resistance mutations in HIV-1. Top HIV Med 2003; 11:215-221.
48. Hue S, Clewley JP, Cane PA, Pillay D. HIV-1 pol gene variation is sufficient for reconstruction of transmissions in the era of antiretroviral therapy. AIDS 2004; 18:719-728.
49. Krushkal J, Wen-Hsiung L. Use of phylogenetic inference to test an HIV transmission hypothesis. In: Crandall KA, editor. The Evolution of HIV. Baltimore: Johns Hopkins University Press; 1999. pp. 208-232.
50. Salminen M, Nykanen A, Brummer-Korvenkontio H, Kantanen ML, Liitsola K, Leinikki P. Molecular epidemiology of HIV-1 based on phylogenetic analysis of in vivo gag p7/p9 direct sequences. Virology 1993; 195:185-194.
51. Snoeck J, Van Dooren S, Van Laethem K, Derdelinckx I, Van Wijngaerden E, De Clercq E, et al. Prevalence and origin of HIV-1 group M subtypes among patients attending a Belgian hospital in. Virus Res 2002; 85:95-107.
52. Parry JV, Murphy G, Barlow KL, Lewis K, Rogers PA, Belda FJ, et al. National surveillance of HIV-1 subtypes for England and Wales: design, methods, and initial findings. J Acquir Immune Defic Syndr 2001; 26:381-388.
53. Simon F, Loussert-Ajaka I, Damond F, Saragosti S, Barin F, Brun-Vezinet F. HIV type 1 diversity in northern Paris, France. AIDS Res Hum Retroviruses 1996; 12:1427-1433.
54. Snoeck J, Van Laethem K, Hermans P, Van Wijngaerden E, Derdelinckx I, Schrooten Y, et al. Rising prevalence of HIV-1 non-B subtypes in Belgium: 1983-2001. J Acquir Immune Defic Syndr 2004; 35:279-285.
55. Leitner T, Fitch W. The phylogenetics of known transmission histories. In: Crandall KA, editor. The Evolution of HIV. Baltimore, Maryland: Johns Hopkins University Press; 1999. pp. 315-345.
56. Swofford D, Olsen GJ, Waddell PJ, Hillis DM. Phylogenetic inference. In: Hillis DM, Moritz C, Mable BK, editors. Molecular Systematics. Sunderland, Massachusetts: Sinauer Assoc; 1996. pp. 407-514.
© 2005 Lippincott Williams & Wilkins, Inc.