- Research article
- Open Access
Inferring nonneutral evolution from contrasting patterns of polymorphisms and divergences in different protein coding regions of enterovirus 71 circulating in Taiwan during 1998-2003
© Wang et al; licensee BioMed Central Ltd. 2010
- Received: 25 August 2010
- Accepted: 25 September 2010
- Published: 25 September 2010
Enterovirus (EV) 71 is one of the common causative agents for hand, foot, and, mouth disease (HFMD). In recent years, the virus caused several outbreaks with high numbers of deaths and severe neurological complications. Despite the importance of these epidemics, several aspects of the evolutionary and epidemiological dynamics, including viral nucleotide variations within and between different outbreaks, rates of change in immune-related structural regions vs. non-structural regions, and forces driving the evolution of EV71, are still not clear.
We sequenced four genomic segments, i.e., the 5' untranslated region (UTR), VP1, 2A, and 3C, of 395 EV71 viral strains collected from 1998 to 2003 in Taiwan. The phylogenies derived from different genomic segments revealed different relationships, indicating frequent sequence recombinations as previously noted. In addition to simple recombinations, exchanges of the P1 domain between different species/genotypes of human enterovirus species (HEV)-A were repeatedly observed. Contrasting patterns of polymorphisms and divergences were found between structural (VP1) and non-structural segments (2A and 3C), i.e., the former was less polymorphic within an outbreak but more divergent between different HEV-A species than the latter two. Our computer simulation demonstrated a significant excess of amino acid replacements in the VP1 region implying its possible role in adaptive evolution. Between different epidemic seasons, we observed high viral diversity in the epidemic peaks followed by severe reductions in diversity. Viruses sampled in successive epidemic seasons were not sister to each other, indicating that the annual outbreaks of EV71 were due to genetically distinct lineages.
Based on observations of accelerated amino acid changes and frequent exchanges of the P1 domain, we propose that positive selection and subsequent frequent domain shuffling are two important mechanisms for generating new genotypes of HEV-A. Our viral dynamics analysis suggested that the importation of EV71 from surrounding areas likely contributes to local EV71 outbreaks.
- Markov Chain Monte Carlo
- Protein Code Region
- Amino Acid Replacement
- Synonymous Site
- High Posterior Density
Since its initial isolation in California in 1969 [1, 2], enterovirus (EV) 71 was identified to have caused outbreaks of severe neurological disease in Australia, Europe, Asia, and the Americas . Several epidemic outbreaks with high mortality rates have occurred in Bulgaria in 1975 with 44 deaths , in Hungary in 1978 with 47 deaths , and in Malaysia in 1997 with at least 31 deaths . In a 1998 epidemic, Taiwan had 405 severe EV71 cases with 78 deaths. In 2000 to 2002, there were dozens of fatal EV71 cases each year in Taiwan [7, 8]. A stage-based management scheme was developed to reduce the number of case fatalities in Taiwan , but most survivors of brainstem encephalitis plus cardiopulmonary failure might have neurologic sequelae and impaired cognition . Clinically, EV71 is capable of causing paralytic disease indistinguishable from poliomyelitis due to the poliovirus. After the poliovirus was nearly eradicated by vaccination, EV71 is now considered one of the most important enteroviruses. Its genome consists of a positive single-stranded RNA molecule of approximately 7400 nucleotides long. The RNA is translated to give a polyprotein which is then proteolytically processed to yield the mature structural P1 domain (encoding the capsid proteins VP4, VP3, VP2, and VP1) and non-structural P2 (non-structural proteins) and P3 domains (non-structural proteins and viral replicase).
There are two major EV71 genotypes (B and C) co-circulating worldwide (the EV71 prototype strain BrCr, isolated in 1969, is the only known example of genotype A). Genotypes B and C are subdivided into sub-genotypes B1~B5 and C1~C5, respectively [11–14]. Despite many studies having focused on the molecular epidemiology of EV71 [11, 15–19], several aspects of the evolutionary and epidemiological dynamics are still not clear. In particular, most studies of molecular epidemiology focused only on a single region, particularly VP1, without exploring the dynamics at the genomic scale. There are no rigorous measurements of viral variations within an outbreak and among different epidemic seasons. In addition, although the evolutionary rate of VP1 was estimated , little is known as to whether there are discrepancies in immune-related structural vs. non-structural regions and in protein-coding vs. non-coding segments of the EV71 genome. Finally, the processes through which new viral genotypes are generated and evolve are largely unknown.
To help resolve these issues, we examined the evolutionary dynamics of EV71 by analyzing 395 viral isolates sampled in 1998 to 2003 in Taiwan. Four genomic regions, including one capsid protein (VP1), two proteases which cleave the viral polypeptide (2A in the P2 domain and 3C in the P3 domain), and the 5'untranslated region (UTR), were sequenced. Phylogenetic relationships of viruses isolated from different outbreaks were analyzed. We investigated the nucleotide diversity and divergence for different genomic segments, which revealed contrasting patterns for different protein coding regions. Based on the above results, we propose a possible scenario of how new EV71 genotypes emerge and evolve. Finally, we estimated the evolutionary rate and population dynamics of EV71.
List of viral strains based on the VP1 region collected from 1998 to 2003
Phylogeny and recombination of EV71
In the VP1 region, EV71-B4 recovered from different years closely agreed with the phylogenetic positions. Lineage 4 was at the base of the B4 clade followed by lineages 3, 2, and 1. A monophyletic origin of genotype C, including lineages 5, 6, and 7 was strongly supported by the NJ method but with only a 50% bootstrap value in the ML method (Figure 2a). These three lineages together with EV71-A and -B4 formed a monophyletic group with a high bootstrap value (99%). CA16 and its related strain, CA16-like, formed a sister group to EV71. In the 2A region (Figure 2b), all above mentioned lineages are still formed but with low bootstrap support. CA16 did not form a sister group to lineages 8 and 9 as shown in Figure 2a. In the 3C region, lineages 5, 6, and 7 were not clustered together as in Figure 2a and 2b. Instead, lineages 6 and 7 were a sister group to CA8 with 99% bootstrap support in both the ML and NJ methods and 73% probability for BI analysis (Figure 2c). Furthermore, CA16 was not a sister group to lineages 8 and 9. The former was clustered with CA4, and the latter pair was clustered at the bottom of the tree with CA2, CA3, CA6, CA10, CA12, and EV71-A (with 99% bootstrap support in both the ML and NJ trees and 73% probability for BI analysis). In the NJ tree constructed using the 5'UTR, lineages 6 and 7 together were a sister group to CA8 (Figure 2d).
Nucleotide diversity and divergence
Polymorphisms in coding and non-coding regions of enterovirus (EV) 71 derived from viruses collected in different years.
p value (vs. VP1)
p value (vs. UTR)
p value (vs. UTR)A
p value (vs. VP1)
Divergences in coding and non-coding regions between different genotypes and species of human enterovirus (HEV)-A
EV71-B4 vs. EV71-C2
EV71-C2 vs. HEV-A
EV71-B4 vs. HEV-A
EV71-B4 vs. EV71-C2
EV71-C2 vs. HEV-A
EV71-B4 vs. HEV-A
EV71-B4 vs. EV71-C2
EV71-C2 vs. HEV-A
EV71-B4 vs. HEV-A
Tests for positive selection
Since Ks was saturated, the Ka/Ks ratio should be interpreted with care. We tested whether different protein coding regions had different selection regimes by HyPhy (see "Methods"). The results of likelihood ratio test (LRT) revealed no sign for codon under positive selection. In addition, compared to VP1, 2A and 3C were under stronger purifying selection (p < 10-3 for VP1 vs. 2A, p < 0.05 for VP1 vs. 3C, and p = 0.91 for 2A vs. 3C). Therefore, higher Ka/Ks ratio in VP1 cannot be solely imputed to the saturation on synonymous sites, which suggests that different evolutionary forces may have dominated different regions of the genome.
Different approaches were applied to test the hypothesis of positive selection in the VP1 region. We first performed site model tests implemented in PAML to search for evidence of positive selection. The results of LRTs for M1a vs. M2a and M7 vs. M8 were both insignificant, suggesting that the evidence of positive selection working on a particular codon was limited. Using model M2a, we estimated that 98% of the codons were subject to negative selection with an average Ka/Ks ratio of 0.01, 2% had evolved under near neutrality with Ka/Ks ≈ 1, and there was no site with Ka/Ks > 1. We also used branch-site model A to test for positive selection on individual sites along specific lineages (foreground branches). Different lineages were sequentially assigned as foreground branches to test against the null model with Ka/Ks = 1. This test also failed to detect a signature of positive selection on specific branches. PAML also implemented a Suzuki and Gojobori-style  method to test the effects of selection at individual sites with changes according to ancestral states reconstructed by the likelihood method. The results suggested that 98% of the sites were under negative selection, and that only 2% had Ka/Ks values of > 1, but they were all insignificant. Similar results were noted in the analysis of the poliovirus surface protein . Therefore, within enterovirus, evidence of positive selection at individual sites was limited.
Because synonymous substitutions were saturated (Ks > 1; Table 3), the standard Ka/Ks analysis might not be a suitable statistical test for positive selection. We, therefore, performed computer simulations. In each of 10,000 simulations, the number of nucleotide changes was drawn from a Poisson distribution until Ks was equal to the average Ks between EV71 and HEV-A (see "Methods"). Ratios of nonsynonymous (N) to synonymous (S) mutations were chosen according to the observed N/S ratios in the polymorphisms, which were 0.10, 0.24, and 0.21 for VP1, 2A, and 3C, respectively (Table 2). The objective of this test is to detect natural selection in the sequences. If the selection intensity is constant over the evolutionary time scale, the ratio of N to S fixations should be equal to the ratio of N to S polymorphisms. The pattern will differ, however, when other forms of selection act on nonsynonymous mutations. Positive selection will increase the divergence relative to polymorphisms at selected sites, whereas negative selection is expected to result in the opposite pattern.
Reduced levels of polymorphism and divergence in the 5'UTR resemble general patterns of protein evolution and indicate that the 5'UTR was either under selection or was subjected to a lower mutation rate than were synonymous sites. A framework, proposed by McDonald and Kreitman (MK), of comparing levels of polymorphism within and divergence between different viruses can distinguish neutrality from selection in the sequences . If reduced levels of polymorphism and divergence in the 5'UTR can be explained by a lower mutation rate, the ratio of polymorphisms to divergences should be similar to that for synonymous sites. Although, the MK test was originally developed to detect selection at classes of different sites that have the same evolutionary history, it can also be generalized to consider other classes of selected sites from different regions, including non-coding regions , even when they are completely unlinked . We then applied the MK test to the 5'UTR of EV71-C2 vs. CA8 and EV71-C2 vs. -C1 using polymorphisms and divergence in the 3C region as a reference, because CA8 was strongly supported as the outgroup of EV71-C1 and -C2 in these two regions (Figure 2c, d) and because Ks values between CA8 and EV71-C2 at 0.73 and between EV71-C1 and -C2 at 0.42 were not saturated.
Polymorphism and divergence in the 5' untranslated region (UTR) and 3C regions of enterovirus (EV) 71-C
CA8 vs. EV71-C2
EV71-C1 vs. -C2
Population demographics and the rate of change
Evolutionary dynamics, represented by the VP1 region, were estimated using an established Bayesian Markov chain Monte Carlo (MCMC) approach that incorporates the month of viral sampling. In the absence of natural selection, the genetic diversity obtained reflects the change in the effective number of infections over time . Nevertheless, a possible role of selection was demonstrated for VP1 of EV71 in a previous section; the plot was interpreted as a measure of the relative genetic diversity. Changing patterns of genetic diversity clearly showed temporal dynamics of EV71 isolated from 1998 to 2003 (Figure 1). Two peaks, from left to right, respectively represent the outbreak from late 1999 to 2000 (lineages 2 and 3) and the epidemic seasons of 2001 to 2003 (lineage 1). Genetic diversities were largely reduced at the end of each epidemic season demonstrating genetic bottlenecks. Because EV71-C2 (lineage 7) was mainly recovered in 1998 which yielded a large uncertainty in the diversity estimation, we did not include lineage 7 in Figure 1.
The time to the most recent ancestor (TMRCA) of different segments of enterovirus 71 derived from this study.
Beginning of the outbreak
Lineage 2 and 3
Rates of change in different genomic regions of enterovirus (EV) 71-B4 derived from this study
Evolutionary rate (changes/site/year)
By linear regression
By Bayesian Markov chain Monte Carlo
Third codon position
2.2 × 10-2
4.6 × 10-3
1.48 × 10-2
(1.43 × 10-2~1.54 × 10-2)
5.7 × 10-3
(4.6 × 10-3~6.7 × 10-3)
3.5 × 10-2
6.8 × 10-3
1.66 × 10-2
(1.54 × 10-2~1.74 × 10-2)
6.7 × 10-3
5.2 × 10-3~8.5 × 10-3
3.2 × 10-2
7.1 × 10-3
1.51 × 10-2
(1.41 × 10-2~1.61 × 10-2)
6.3 × 10-3
4.9 × 10-3~7.7 × 10-3
5.3 × 10-3
5.5 × 10-3
4.1 × 10-3~7.0 × 10-3
Signature of the positive selection of EV71
VP1 was previously demonstrated to have rapidly evolved . The question is whether positive selection needs to be invoked to account for the high rate of amino acid substitutions or whether alternative explanations of past population demographics, relaxation of selective constraints, and elevation of the mutation rate are sufficient. It was argued that accelerated amino acid substitutions can occur if some nonsynonymous mutations are slightly deleterious, and the current effective population size is larger than the long-term effective population size ; this is because slightly deleterious mutations that are currently not segregated in the population may have been fixed in the past. A historical reduction in population size will also have generated an excess of amino acid substitutions in divergence . For viruses producing only acute infections, such as EV71 and influenza, the population size (judged by the number of transmissions) exponentially increases during an outbreak and then experiences a dramatic reduction after the epidemic. The population dynamics shown in Figure 1 clearly demonstrate that EV71 experienced dramatic population size changes between the different epidemic seasons. Nevertheless, while positive selection acts on specific target regions, the influence of population demographics affects all regions of the genome. Smaller Ka and Ka/Ks values found in non-structural regions (2A and 3C) than in the structural region (VP1) (Table 3) demonstrated that accelerated amino acid substitutions in the latter are not a genome-wide phenomenon and, thus, are unlikely to solely be the consequence of population size changes. If relaxation of selective constraints is responsible for the rapid amino acid replacement between different viruses, levels of θa and θa/θs in VP1 should have increased accordingly. This is not expected under positive selection because mutations would have short retention times in the population and hence would contribute little to polymorphisms. Significantly lower θa and θa/θs values in VP1 than in 2A and 3C (Table 2) indicate no sign of relaxation in the former. Finally, comparable levels of θs among the three protein coding regions clearly illustrate no significant differences in their underlying mutation rates. Consequently, the rapid evolution of VP1 cannot be reconciled with past population demographics, relaxation of selection, or differences in mutation rates. The contrasting patterns of polymorphism and divergence in VP1, thus, should best be explained by the hypothesis that positive Darwinian selection is responsible for the high rate of amino acid replacements.
Among the three outer surface proteins, VP1, VP2, and VP3, which comprise the outer surface of the viral capsid, serotype-specific immune epitopes are predominantly located in VP1 . Immunization using a recombinant VP1 protein of EV71 was shown to confer protection against lethal EV71 infection in newborn mice, indicating that VP1 contains important antigenic sites . In addition, VP1 plays an important role in cell attachment . It is, therefore, temping to speculate that VP1 is under strong selection by host immune systems which would result in accelerated amino acid changes between different species of HEV-A.
While our simulation results suggest positive selection in the VP1 region, codon-based LRTs failed to detect such a signal. This inconsistency was probably due to the following reasons. First, simulations demonstrated that LRTs are not robust to frequent recombinations  which is very common in enteroviruses (Figure 3). Nevertheless, recombination does not create contrasting patterns of polymorphisms and divergences in different genomic regions. Second, as shown in Tables 2 and 3, nucleotide sequences were highly diverged between different species and genotypes of HEV-A, but were very similar within each genotype. It was proposed that highly biased trees might harbor too many changes along the long branches and too few changes along the short branches, leading to a lack of information and low power of the detection method . Tee et al. (2010) applied globally sampled VP1 sequences and identified three putative sites under positive selection , suggesting that a more comprehensive sampling may be necessary to identify site under selection.
Couple lines of evidence also suggest that the 5'UTR may be under positive selection. First, after singletons were removed, the results of the MK test in Table 4 suggested an excess of nucleotide substitutions in divergence, implying the action of selection. Second, a higher proportion of singletons in the 5'UTR than in synonymous sites of protein coding regions (Table 2) implied that the effect of selection has kept deleterious mutations at low frequencies. The 5'UTR of EV71 contains six putative stem-loop structures . Stem-loop I is essential for viral RNA synthesis, while stem-loop II-VI comprise the internal ribosome entry site (IRES) necessary for translation initiation [41, 42]. The maintenance of these structure elements is critical for viral viability. Mutations which destroy base pairing within the stem are therefore deleterious  and will be eliminated or kept in low frequency. On the other hand, compensatory mutations which restore base pairing might have to be fixed at once in order to maintain proper secondary structure. In short, the 5'UTR plays important roles for viral RNA synthesis and translation initiation, and the reduced levels of polymorphism and divergence may be a result of selection.
Mechanism generating new genotypes
It was proposed that recombination contributed to the emergence of various species/genotypes of enterovirus . Instead of random recombinations, however, we observed a consistent phenomenon of acquiring the P1 domain from different genotypes or species of HEV-A. It is unreasonable to expect that chance alone would generate such a repeating pattern (Figure 3). Frequent sequence shuffling, i.e., acquiring a piece of a nucleotide sequence from different viruses, may have had important impacts on the evolution of enteroviruses. As shown by the signature of positive selection in the VP1 region, it is possible that acquiring the P1 domain from donors which carry rapid and presumably adaptive changes would cause these recipient viral strains to gain immediate advantage over others. Consequently, based on the current molecular genetic analysis and previous phylogenetic studies [23, 43], we hypothesized that positive selection and frequent domain shuffling are two important mechanisms for generating new types of enteroviruses.
It must be pointed out that the results in Figure 3 do not themselves provide evidence that these genotypes were generated by recombination between particular viral strains. For example, in the analysis of EV71-C1+C2, when we used either EV71-B or EV71-A, separately, or combined them as one reference group, the results were identical. But when EV71-A and -B were both presented as different reference groups, neither of them showed high relatedness with EV71-C1+C2 in the P1 domain. That is because both EV71-A and -B are equal distances from EV71-C1+C2 in that region (Figure 2a). Therefore, it is possible that EV71-C1+C2 was generated by ancestral sequences of CA8 and EV71 or viral strains very close to them.
Origin of local outbreaks
Our conclusion of viral importation is not consistent with previous study which suggested that EV71 outbreaks caused by the same subgenotype in different countries may be the result of different lineages and were less likely to have been caused by transmission between countries . This conclusion was drawn from the analysis of VP1 sequences at global scale, nevertheless. It is possible that different lineages are indeed maintained in a large area with occasional movement of viral lineages between continents, while within each area, e.g. east Asia, transmissions between borders may be still common .
Evolutionary rate estimation
Our estimated rates of change for synonymous sites of 2A and 3C were close to that of the poliovirus (3.96 × 10-2/site/year) , but higher than estimations by Hanada et al. (1.00 × 10-2), and Brown et al. (1.35 × 10-2) of enterovirus. This discrepancy is reconcilable by distinguishing the mutation rate, which is related to the tempo of change at the population level, from the substitution rate, which reflects long-term evolutionary change. When mutations are selectively neutral, or nearly so, the rate at which they are generated (the mutation rate) should have a simple relation with the rate of fixation (the substitution rate) . Deviation from this neutral expectation can reveal fundamental aspects of evolutionary and biological processes [32, 45]. Under negative selection, mildly deleterious mutations can linger in populations, contributing to polymorphisms, but have little chance of fixation . Therefore, the rate of evolutionary change based on population-level estimates would be larger than those based on long-term follow-up studies, as the former includes mutations doomed by elimination. In our simulations, after singletons were removed, the difference in N/S ratios between polymorphisms and divergences became insignificant (region 2A, Figure 4b), indicating the existence of low-frequency deleterious mutations in polymorphisms. The rates of change for the poliovirus  and EV71 in the current study were derived in a short period of time and are expected to be close to population-level estimates. It is then not unreasonable that these values are larger than those of previous long-term studies [11, 32].
The above scenario can also explain the seemingly counterintuitive observation that the rate of change for the VP1 capsid protein was slower than those of 2A and 3C which have only internal functions (Table 6). It is generally expected that segments expressed on the surface of a virion will have higher rates of evolutionary change than those that only have internal functions, because the former have roles in immune evasion. We showed that VP1 has experienced strong positive selection which increased its divergence and decreased polymorphisms. As a result, the rate of change derived from the polymorphic data should be slower for VP1, because strong selection makes the population less polymorphic (Table 2).
Lack of recombination in the current viral isolates
Only one recombination event identified in 401 viral isolates seems inconsistent with the general notion that enteroviruses frequently recombine. Nevertheless, except for the year 2000, there was only one dominant lineage in each epidemic season. The chance of recombination between different lineages was, thus, reduced. In addition, levels of nucleotide variations within lineages were < 1%. Most methods for detecting recombinations perform poorly at this level of divergence . It is, then, not surprising that only one recombined viral strain was identified. It is worth pointing out that even though there might be a small chance of unidentified recombinants, our conclusions might not or only minimally be affected for a couple of reasons. First, recombination within a lineage does not change the numbers of synonymous and nonsynonymous mutations or the N/S ratio in a circulating viral population, and, thus, will not influence the observed nucleotide diversities and simulation results. Second, in the analysis of the TMRCA and mutation rate, BEAST assumes no recombination and accounts for deviations from this in the data with multiple mutations. As a result, it may overestimate genetic diversities and mutation rates (Table 5) with the existence of recombinations . While the effect of recombinations on TMRCA estimations is still being debated [49–51], TMRCAs from different regions were in generally agreement with each other. Thus, unless different genomic regions have similar patterns of recombination, the existence of unidentified recombinations, if any, should not largely affect our conclusions.
Contrasting patterns of polymorphisms and divergences between different protein coding regions suggest that different evolutionary forces have been dominant in different regions of the EV71 genome. The results of a computer simulation demonstrated a significant excess of amino acid replacements in the VP1 region of local viral isolates which implies a possible role of adaptive evolution. Among different species of HEV-A, exchange of the P1 domain was frequently observed. We propose that positive selection followed by domain shuffling contributed to the emergence of new HEV-A genotypes. Consistent patterns of an increase in viral diversity in the epidemic peak followed by severe diversity reduction were observed in different epidemic seasons. In addition, the viruses sampled in local successive epidemic seasons were not sister to each other, indicating that annual outbreaks of EV71 were due to genetically distinct lineages. It is likely that the importation of EV71 from surrounding areas contributed to local outbreaks.
Viral isolation, reverse-transcription polymerase chain reaction (RT-PCR), and sequencing
Throat swabs, rectal swabs, or cerebrospinal fluid (CSF) were obtained from patients who had clinical manifestations of HFMD or herpangina, viral exanthema, viral meningitis, encephalitis, polio-like syndrome, or cardiopulmonary failure with CNS involvement at Chang Gung Children's Hospital, Taoyuan, Taiwan. Viral isolation was performed by inoculating human embryonic fibroblast (MRC-5), LLC-MK2, HEp-2, and RD cell cultures with the above specimens. Once the enteroviral cytopathic involvement exceeded 50% of the cell monolayer, cells were scraped off and subjected to indirect fluorescent antibody staining with a panenteroviral antibody (catalog no. 3360, Chemicon International, Temecula, CA) to confirm the enterovirus. These isolates were subsequently identified as EV71 by an immunofluorescence assay if they had simultaneously positive immunofluorescence reactions to both EV71-specific antibodies (catalog nos. 3323 and 3324, Chemicon International). EV71 isolates were found in 395 EV71 patients from 1998 to 2003. Six cases were identified as EV71 by the immunofluorescence assay which turned out to be the coxsackievirus A16-like virus based on their VP1 sequences (see "Results"). Therefore, 401 isolates were collected and processed for this study. The demographic data, clinical severity, the year of isolation, and the site of the specimens are listed in Additional file 1, Table S1.
EV71 isolates were stored at -80°C and re-cultured in rhadomyosarcome (RD) cells. Viral RNA was extracted from RD cells using the QIAamp Viral RNA Mini Kit (Qiagen, Valencia, CA). A one-step RT-PCR was performed using the Titan One Tube RT-PCR System (Roche Diagnostics, Indianapolis, IN). Additional file 4, Table S2, lists the primers and PCR conditions for different genomic regions of EV71, including the 5'UTR, VP1, 2A, and 3C. After amplification of the viral RNA, the PCR products were purified using a High Pure PCR Product Purification Kit (Roche Diagnostics, Indianapolis, IN). Cycle sequencing was performed using an ABI Prism BigDye® Terminator cycle sequencing kit and ABI Prism 3730 DNA sequencer (Applied Biosystems, Foster City, CA).
Sequences were aligned using ClustalW  and edited manually. Watterson's estimator of nucleotide diversity (θ) was calculated using the custom C++ software based on the libsequence package . For protein coding regions, mutations were classified as either synonymous or nonsynonymous, and from these we calculated the nucleotide diversity for synonymous (θs) and nonsynonymous (θa) sites. Numbers of synonymous substitutions per synonymous site (Ks) and numbers of nonsynonymous substitutions per nonsynonymous site (Ka) were calculated according to the modified Nei and Gojobori method [54, 55]. This method has two advantages. First, when sequence divergence is high, the number of inapplicable cases is much lower than that with other methods [56–58]. Second, its calculations take the transitional/transversional (TS/TV) mutation bias into account. Because of a high viral mutation rate, nucleotide substitutions in synonymous sites were saturated (Ks > 1, see "Results"). In addition, mutations in EV71 were highly biased toward transitions. In the Ka and Ks calculations, TS/TV ratios of different genomic segments were determined based on the results of model selection (see below).
Recombination and phylogenetic reconstruction
Three methods, including the method of Sawyer  implemented in GENECONV 1.81a http://www.math.wustl.edu/~sawyer, the homoplasy test  implemented in START , and GARD (Genetic Algorithm Recombination Detection)  available online http://datamonkey.org were applied to detect possible recombination events between and within EV71 genotypes. The first is suitable for 5%~20% nucleotide differences which was the range of divergence between different genotypes of EV71, and the second is particularly useful for < 5% differences, which was the level of divergence within genotypes. The numbers of permutations were 10,000 for GENECONV and 1000 for the homoplasy test. The significance level used was 5%. To visualize the recombination break points, the bootscan program implemented in SIMPLOT  was applied. We used a sliding window size of 200 nucleotides with a step size of 20 nucleotides. TS/TV ratios were estimated from the sequences. NJ trees were estimated using F84 distances, and bootstrap support values were obtained from 500 replicates. Bootstrap values were then plotted along the position of the genome. A recombination event was suggested when high levels of phylogenetic relatedness (> 70% bootstrap values) between the query group and more than one reference group in different genomic regions were observed. Bootscan was further applied to detect possible recombinations between different species of HEV-A. All settings were the same, except that a sliding window of 400 nucleotides was used.
The nucleotide substitution model for phylogenetic reconstruction was determined by the Akaike information criterion (AIC)  using Modeltest 3.06  and is listed in the online Additional file 5. The neighbor-joining (NJ) tree was constructed using PAUP* vers. 4.0b10. The maximum-likelihood (ML) analysis was conducted using PhyML 3.0 online  with the starting tree derived from the NJ method, and the NNI topology search option was used. Bayesian inference (BI) was performed using MrBayes vers. 3.1.2 . Analyses were initiated with random starting trees, and Metropolis-coupled Markov chain Monte Carlo (MCMC) analyses were run for 4 × 106 generations and sampled every 100 generations. The first 4000 trees were excluded while the remaining 36000 trees were retained to compute a 50% majority-rule consensus tree. Human enterovirus (HEV)-A, including coxsackievirus A2 (CA2), CA3, CA4, CA5, CA6, CA7, CA8, CA10, CA12, CA14, and CA16, and the prototype EV71 BrCr-CA-70 (EV71- A) were also included in the tree construction with enterovirus 90, AY773285 and AB192877, as the outgroups. Bootstrap replications for nodal support were 500 for the ML and 1000 for the NJ analyses. To further trace the origin of the viruses, we included and aligned EV71 sequences from surrounding areas in the 1990 s to 2003 [11, 13, 15, 17, 18]. Phylogenetic relationships of these sequences were constructed using the NJ method with 1000 bootstrap replications.
To test whether nucleotide substitutions in different genomic regions followed the neutral mutation hypothesis, i.e., variations within populations and divergences between species were both mainly due to neutral or nearly neutral mutations, we carried out the following computer simulations. (1) Using the consensus sequence of EV71-B4 or -C2 as a starting point, the position of change in the sequence was randomly chosen. (2) For different protein coding regions, the change was then determined to be either synonymous (S) or nonsynonymous (N) according to the N/S ratio of the polymorphism. Stop codon changes were not counted. (3) The change was then assigned as a transition or transversion based on the results of the model selection (see "Additional file 5"). (4) The process was repeated until the Ks value equaled the average Ks value between EV71 and HEV-A. (5) The simulation was repeated 10,000 times. The results were tallied for each replicate, and distributions were used to determine the p value for the observed Ka and Ka/Ks values.
Codon-based maximum likelihood ratio test for positive selection
To detect the signal of positive selection, two approaches were applied. Natural selection can be detected by comparing Ks with Ka. Ka > Ks indicates positive selection, whereas Ka < Ks indicates negative selection. Several methods were developed to test for positive selection that acts on given branches (branch method), on a subset of sites (site method), or at individual sites along specific branches (branch-site method) [66–68]. To identify the putative amino acid residues under positive selection and to estimate the strength of the selection, two pairs of likelihood-based models implemented in the PAML  package was used. These models allow the Ka/Ks ratio to vary among sites. The first pair include M1a (nearly neutral) and M2a (positive selection) , while the second pair includes M7 (beta) and M8 (beta&ω) . We also used branch-site test 2 to detect positive selection of the lineages of interest . To test whether different protein coding regions are under similar selection regimes, we used dNdSDistributionComparison.bf implemented in HyPhy 2.0 . This program divides codons into four categories, two negatively selected (Ka < Ks), one neutral (Ka = Ks), and one positively selected (Ka > Ks), and tests whether different genes have similar (1) strengths of positive selection, (2) proportions of sites under positive selection, and (3) distributions for all sites.
Estimates of population dynamics and evolutionary rates
The evolutionary dynamics of EV71 were estimated using an established Bayesian Markov chain Monte Carlo (MCMC) approach [72, 73] that incorporates the exact month of viral sampling. This approach estimates the rate of nucleotide changes and the dynamics of population genetic diversity through time . In addition, the time to the most recent common ancestor (TMRCA) of viruses isolated from each epidemic season was also estimated. Because EV71-C2 was mainly recovered from the 1998 epidemic season, the mutation rate and population dynamic estimated were primarily based on EV71-B4 collected from 1998 to 2003.
The analyses were performed using the SRD06  model of nucleotide substitution by BEAST v1.4.8. Both strict and relaxed (uncorrelated lognormal) molecular clock model  were used and the results are essentially the same (Bayes factor = 0.18). For population demographic, three models, including constant size, exponential growth, and Bayesian skyline plot (BSP) , were tested and the BSP performed significantly better than the other two (Bayes factor > 20). In cases where sequences were sampled in the same month with higher than 99% nucleotide identity, four of the sequences were randomly selected. This resulted in a dataset of 145 EV71-B4 and 22 EV71-C2 sequences (listed in the "Additional file 5"). For each genetic segment, we used two independent runs with 2 × 107 MCMC steps, of which the first 10% was discarded as burn-in. The results were compared to confirm that both were sampling the same distribution and then were combined. The log files were checked using TRACER, and the effective sample size for each parameter was found to exceed 200. Since the time span in this study was less than 6 years, in order to test whether the sequences contained a sufficient signal to estimate evolutionary rates and divergence time, two approaches were applied. First, we included 108 and 48 dated B4 and C2 sequences, respectively, from other areas between the 1990 s and 2003 [11, 13, 15, 17, 18] for the TMRCA estimation (listed in the "Additional file 5"). Second, we randomized the sampling dates of our sequences and repeated the BEAST analysis . In the second analysis only Taiwanese sequences were used. The 95% HPDs of real and randomized data were compared to test whether the distributions overlapped.
The evolutionary rate was also estimated for EV71-B4 by a linear regression of the genetic distance from the oldest isolate versus the time of isolation. For each protein coding region, two separate analyses were conducted: one for all nucleotide positions (representative of both synonymous and nonsynonymous changes) and a second one for only synonymous changes. The former was calculated based on the best model suggested by Modeltest and the latter was calculated according to the modified Nei and Gojobori method.
We thank C.-I. Wu of University of Chicago and P.-J. Chen and D.-S. Chen of National Taiwan University for comments and technical support. This work was supported by the National Research Program for Genomic Medicine from the National Science Council, Taiwan (NSC99-2321-B-002-025, NSC98-2314-B-002-008-MY2, NSC96-3112-B-002-015, and NSC95-3112-B-002-025) to LYC; and National Health Research Institute (NHRI), Taiwan (NHRI-EX99-9934BI) to HYW. No conflict of interest exists for any author. The Institutional Review Board of National Taiwan University Hospital approved this clinical study, of which the Unique Protocol identification for this study was 9311700446.
- Brown BA, Pallansch MA: Complete nucleotide sequence of enterovirus 71 is distinct from poliovirus. Virus Res. 1995, 39: 195-205. 10.1016/0168-1702(95)00087-9.View ArticlePubMedGoogle Scholar
- Schmidt NJ, Lennette EH, Ho HH: An apparently new enterovirus isolated from patients with disease of the central nervous system. J Infect Dis. 1974, 129: 304-309.View ArticlePubMedGoogle Scholar
- Palacios G, Oberste MS: Enteroviruses as agents of emerging infectious diseases. J Neurovirol. 2005, 11: 424-433. 10.1080/13550280591002531.View ArticlePubMedGoogle Scholar
- Shindarov LM, Chumakov MP, Voroshilova MK, Bojinov S, Vasilenko SM, Iordanov I, Kirov ID, Kamenov E, Leshchinskaya EV, Mitov G, et al: Epidemiological, clinical, and pathomorphological characteristics of epidemic poliomyelitis-like disease caused by enterovirus 71. J Hyg Epidemiol Microbiol Immunol. 1979, 23: 284-295.PubMedGoogle Scholar
- Nagy G, Takatsy S, Kukan E, Mihaly I, Domok I: Virological diagnosis of enterovirus type 71 infections: experiences gained during an epidemic of acute CNS diseases in Hungary in 1978. Arch Virol. 1982, 71: 217-227. 10.1007/BF01314873.View ArticlePubMedGoogle Scholar
- Chan LG, Parashar UD, Lye MS, Ong FG, Zaki SR, Alexander JP, Ho KK, Han LL, Pallansch MA, Suleiman AB, et al: Deaths of children during an outbreak of hand, foot, and mouth disease in Sarawak, Malaysia: clinical and pathological characteristics of the disease. Clin Infect Dis. 2000, 31: 678-683. 10.1086/314032.View ArticlePubMedGoogle Scholar
- Ho M, Chen ER, Hsu KH, Twu SJ, Chen KT, Tsai SF, Wang JR, Shih SR: An epidemic of enterovirus 71 infection in Taiwan. Taiwan Enterovirus Epidemic Working Group. N Engl J Med. 1999, 341: 929-935. 10.1056/NEJM199909233411301.View ArticlePubMedGoogle Scholar
- Lin TY, Twu SJ, Ho MS, Chang LY, Lee CY: Enterovirus 71 outbreaks, Taiwan: occurrence and recognition. Emerg Infect Dis. 2003, 9: 291-293.PubMed CentralView ArticlePubMedGoogle Scholar
- Lin TY, Hsia SH, Huang YC, Wu CT, Chang LY: Proinflammatory cytokine reactions in enterovirus 71 infections of the central nervous system. Clin Infect Dis. 2003, 36: 269-274. 10.1086/345905.View ArticlePubMedGoogle Scholar
- Chang LY, Huang LM, Gau SS, Wu YY, Hsia SH, Fan TY, Lin KL, Huang YC, Lu CY, Lin TY: Neurodevelopment and cognition in children after enterovirus 71 infection. N Engl J Med. 2007, 356: 1226-1234. 10.1056/NEJMoa065954.View ArticlePubMedGoogle Scholar
- Brown BA, Oberste MS, Alexander JP, Kennett ML, Pallansch MA: Molecular epidemiology and evolution of enterovirus 71 strains isolated from 1970 to 1998. J Virol. 1999, 73: 9969-9975.PubMed CentralPubMedGoogle Scholar
- Cardosa MJ, Perera D, Brown BA, Cheon D, Chan HM, Chan KP, Cho H, McMinn P: Molecular epidemiology of human enterovirus 71 strains and recent outbreaks in the Asia-Pacific region: comparative analysis of the VP1 and VP4 genes. Emerg Infect Dis. 2003, 9: 461-468.View ArticlePubMedGoogle Scholar
- Mizuta K, Abiko C, Murata T, Matsuzaki Y, Itagaki T, Sanjoh K, Sakamoto M, Hongo S, Murayama S, Hayasaka K: Frequent importation of enterovirus 71 from surrounding countries into the local community of Yamagata, Japan, between 1998 and 2003. J Clin Microbiol. 2005, 43: 6171-6175. 10.1128/JCM.43.12.6171-6175.2005.PubMed CentralView ArticlePubMedGoogle Scholar
- Huang YP, Lin TL, Kuo CY, Lin MW, Yao CY, Liao HW, Hsu LC, Yang CF, Yang JY, Chen PJ, Wu HS: The circulation of subgenogroups B5 and C5 of enterovirus 71 in Taiwan from 2006 to 2007. Virus Res. 2008, 137: 206-212. 10.1016/j.virusres.2008.07.015.View ArticlePubMedGoogle Scholar
- McMinn P, Lindsay K, Perera D, Chan HM, Chan KP, Cardosa MJ: Phylogenetic analysis of enterovirus 71 strains isolated during linked epidemics in Malaysia, Singapore, and Western Australia. J Virol. 2001, 75: 7732-7738. 10.1128/JVI.75.16.7732-7738.2001.PubMed CentralView ArticlePubMedGoogle Scholar
- Bible JM, Pantelidis P, Chan PK, Tong CY: Genetic evolution of enterovirus 71: epidemiological and pathological implications. Rev Med Virol. 2007, 17: 371-379. 10.1002/rmv.538.View ArticlePubMedGoogle Scholar
- Herrero LJ, Lee CS, Hurrelbrink RJ, Chua BH, Chua KB, McMinn PC: Molecular epidemiology of enterovirus 71 in peninsular Malaysia, 1997-2000. Arch Virol. 2003, 148: 1369-1385. 10.1007/s00705-003-0100-2.View ArticlePubMedGoogle Scholar
- Sanders SA, Herrero LJ, McPhie K, Chow SS, Craig ME, Dwyer DE, Rawlinson W, McMinn PC: Molecular epidemiology of enterovirus 71 over two decades in an Australian urban community. Arch Virol. 2006, 151: 1003-1013. 10.1007/s00705-005-0684-9.View ArticlePubMedGoogle Scholar
- Lin KH, Hwang KP, Ke GM, Wang CF, Ke LY, Hsu YT, Tung YC, Chu PY, Chen BH, Chen HL, et al: Evolution of EV71 genogroup in Taiwan from 1998 to 2005: an emerging of subgenogroup C4 of EV71. J Med Virol. 2006, 78: 254-262. 10.1002/jmv.20534.View ArticlePubMedGoogle Scholar
- Maynard Smith J, Smith NH: Detecting recombination from gene trees. Mol Biol Evol. 1998, 15: 590-599.View ArticlePubMedGoogle Scholar
- Sawyer S: Statistical tests for detecting gene conversion. Mol Biol Evol. 1989, 6: 526-538.PubMedGoogle Scholar
- Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SD: Automated phylogenetic detection of recombination using a genetic algorithm. Mol Biol Evol. 2006, 23: 1891-1901. 10.1093/molbev/msl051.View ArticlePubMedGoogle Scholar
- Simmonds P, Welch J: Frequency and dynamics of recombination within different species of human enteroviruses. J Virol. 2006, 80: 483-493. 10.1128/JVI.80.1.483-493.2006.PubMed CentralView ArticlePubMedGoogle Scholar
- Suzuki Y, Gojobori T: A method for detecting positive selection at single amino acid sites. Mol Biol Evol. 1999, 16: 1315-1328.View ArticlePubMedGoogle Scholar
- Suzuki Y: Negative selection on neutralization epitopes of poliovirus surface proteins: implications for prediction of candidate epitopes for immunization. Gene. 2004, 328: 127-133. 10.1016/j.gene.2003.11.020.View ArticlePubMedGoogle Scholar
- Andolfatto P: Adaptive evolution of non-coding DNA in Drosophila. Nature. 2005, 437: 1149-1152. 10.1038/nature04107.View ArticlePubMedGoogle Scholar
- McDonald JH, Kreitman M: Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991, 351: 652-654. 10.1038/351652a0.View ArticlePubMedGoogle Scholar
- Sawyer SA, Hartl DL: Population genetics of polymorphism and divergence. Genetics. 1992, 132: 1161-1176.PubMed CentralPubMedGoogle Scholar
- Fay JC, Wyckoff GJ, Wu CI: Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature. 2002, 415: 1024-1026. 10.1038/4151024a.View ArticlePubMedGoogle Scholar
- Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC: The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008, 453: 615-619. 10.1038/nature06945.PubMed CentralView ArticlePubMedGoogle Scholar
- Gavrilin GV, Cherkasova EA, Lipskaya GY, Kew OM, Agol VI: Evolution of circulating wild poliovirus and of vaccine-derived poliovirus in an immunodeficient patient: a unifying model. J Virol. 2000, 74: 7381-7390. 10.1128/JVI.74.16.7381-7390.2000.PubMed CentralView ArticlePubMedGoogle Scholar
- Hanada K, Suzuki Y, Gojobori T: A large variation in the rates of synonymous substitution for RNA viruses and its relationship to a diversity of viral infection and transmission modes. Mol Biol Evol. 2004, 21: 1074-1080. 10.1093/molbev/msh109.View ArticlePubMedGoogle Scholar
- Eyre-Walker A: Changing effective population size and the McDonald-Kreitman test. Genetics. 2002, 162: 2017-2024.PubMed CentralPubMedGoogle Scholar
- Melnick JL: Enteroviruses: polioviruses, coxsackieviruses, echoviruses, and newer enteroviruses. Virology. Edited by: Field BN, Knipe DM, Howley MP. 1995, New York: Raven Press, 655-712. 3Google Scholar
- Wu CN, Lin YC, Fann C, Liao NS, Shih SR, Ho MS: Protection against lethal enterovirus 71 infection in newborn mice by passive immunization with subunit VP1 vaccines and inactivated virus. Vaccine. 2001, 20: 895-904. 10.1016/S0264-410X(01)00385-1.View ArticlePubMedGoogle Scholar
- Chang L-Y, Shih S-R, Huang L-M, Lin T-Y: Enterovirus 71 Encephalitis. New and Evolving Infections of the 21st Century. Edited by: Fong IW, Alibeck K. 2006, New York: Springer, 295-235.Google Scholar
- Anisimova M, Nielsen R, Yang Z: Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003, 164: 1229-1236.PubMed CentralPubMedGoogle Scholar
- Anisimova M, Bielawski JP, Yang Z: Accuracy and power of bayes prediction of amino acid sites under positive selection. Mol Biol Evol. 2002, 19: 950-958.View ArticlePubMedGoogle Scholar
- Tee KK, Lam TT, Chan YF, Bible JM, Kamarulzaman A, Tong CY, Takebe Y, Pybus OG: Evolutionary genetics of human enterovirus 71: origin, population dynamics, natural selection, and seasonal periodicity of the VP1 gene. J Virol. 2010, 84: 3339-3350. 10.1128/JVI.01019-09.PubMed CentralView ArticlePubMedGoogle Scholar
- Lin JY, Li ML, Shih SR: Far upstream element binding protein 2 interacts with enterovirus 71 internal ribosomal entry site and negatively regulates viral translation. Nucleic Acids Res. 2009, 37: 47-59. 10.1093/nar/gkn901.PubMed CentralView ArticlePubMedGoogle Scholar
- Bek EJ, McMinn P: Recent advances in research on human enterovirus 71. Future Virol. 2010, 5: 453-468. 10.2217/fvl.10.22.View ArticleGoogle Scholar
- Bedard KM, Semler BL: Regulation of picornavirus gene expression. Microbes Infect. 2004, 6: 702-713. 10.1016/j.micinf.2004.03.001.View ArticlePubMedGoogle Scholar
- Chan Y-F, Sazaly A: Phylogenetic evidence for inter-typic recombination in the emergence of human enterovirus 71 subgenotypes. BMC Microbiol. 2006, 6: 74-10.1186/1471-2180-6-74.View ArticleGoogle Scholar
- Kimura M: The neutral theory of molecular evolution. 1983, New York: Cambridge University PressView ArticleGoogle Scholar
- Duffy S, Shackelton LA, Holmes EC: Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet. 2008, 9: 267-276. 10.1038/nrg2323.View ArticlePubMedGoogle Scholar
- Pybus OG, Rambaut A, Belshaw R, Freckleton RP, Drummond AJ, Holmes EC: Phylogenetic evidence for deleterious mutation load in RNA viruses and its contribution to viral evolution. Mol Biol Evol. 2007, 24: 845-852. 10.1093/molbev/msm001.View ArticlePubMedGoogle Scholar
- Posada D, Crandall KA: Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci USA. 2001, 98: 13757-13762. 10.1073/pnas.241370698.PubMed CentralView ArticlePubMedGoogle Scholar
- Achaz G, Palmer S, Kearney M, Maldarelli F, Mellors JW, Coffin JM, Wakeley J: A robust measure of HIV-1 population turnover within chronically infected individuals. Mol Biol Evol. 2004, 21: 1902-1912. 10.1093/molbev/msh196.View ArticlePubMedGoogle Scholar
- Worobey M: A novel approach to detecting and measuring recombination: new insights into evolution in viruses, bacteria, and mitochondria. Mol Biol Evol. 2001, 18: 1425-1434.View ArticlePubMedGoogle Scholar
- Schierup MH, Hein J: Consequences of recombination on traditional phylogenetic analysis. Genetics. 2000, 156: 879-891.PubMed CentralPubMedGoogle Scholar
- Worobey M, Holmes EC: Homologous recombination in GB virus C/hepatitis G virus. Mol Biol Evol. 2001, 18: 254-261.View ArticlePubMedGoogle Scholar
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.PubMed CentralView ArticlePubMedGoogle Scholar
- Thornton K: Libsequence: a C++ class library for evolutionary genetic analysis. Bioinformatics. 2003, 19: 2325-2327. 10.1093/bioinformatics/btg316.View ArticlePubMedGoogle Scholar
- Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986, 3: 418-426.PubMedGoogle Scholar
- Zhang J, Rosenberg HF, Nei M: Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci USA. 1998, 95: 3708-3713. 10.1073/pnas.95.7.3708.PubMed CentralView ArticlePubMedGoogle Scholar
- Li WH: Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J Mol Evol. 1993, 36: 96-99. 10.1007/BF02407308.View ArticlePubMedGoogle Scholar
- Comeron JM: A method for estimating the numbers of synonymous and nonsynonymous substitutions per site. J Mol Evol. 1995, 41: 1152-1159. 10.1007/BF00173196.View ArticlePubMedGoogle Scholar
- Ina Y: New methods for estimating the numbers of synonymous and nonsynonymous substitutions. J Mol Evol. 1995, 40: 190-226. 10.1007/BF00167113.View ArticlePubMedGoogle Scholar
- Jolley KA, Feil EJ, Chan MS, Maiden MC: Sequence type analysis and recombinational tests (START). Bioinformatics. 2001, 17: 1230-1231. 10.1093/bioinformatics/17.12.1230.View ArticlePubMedGoogle Scholar
- Kosakovsky Pond SL, Frost SD: Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics. 2005, 21: 2531-2533. 10.1093/bioinformatics/bti320.View ArticleGoogle Scholar
- Lole KS, Bollinger RC, Paranjape RS, Gadkari D, Kulkarni SS, Novak NG, Ingersoll R, Sheppard HW, Ray SC: Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J Virol. 1999, 73: 152-160.PubMed CentralPubMedGoogle Scholar
- Posada D, Buckley TR: Model selection and model averaging in phylogenetics: advantages of Akaike Information Criterion and Bayesian approaches over likelihood ratio tests. Syst Biol. 2004, 53: 793-808. 10.1080/10635150490522304.View ArticlePubMedGoogle Scholar
- Posada D, Crandall KA: MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998, 14: 817-818. 10.1093/bioinformatics/14.9.817.View ArticlePubMedGoogle Scholar
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52: 696-704. 10.1080/10635150390235520.View ArticlePubMedGoogle Scholar
- Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19: 1572-1574. 10.1093/bioinformatics/btg180.View ArticlePubMedGoogle Scholar
- Nielsen R, Yang Z: Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics. 1998, 148: 929-936.PubMed CentralPubMedGoogle Scholar
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24: 1586-1591. 10.1093/molbev/msm088.View ArticlePubMedGoogle Scholar
- Zhang J, Nielsen R, Yang Z: Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005, 22: 2472-2479. 10.1093/molbev/msi237.View ArticlePubMedGoogle Scholar
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24: 1586-1591. 10.1093/molbev/msm088.View ArticlePubMedGoogle Scholar
- Yang Z, Wong WS, Nielsen R: Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005, 22: 1107-1118. 10.1093/molbev/msi097.View ArticlePubMedGoogle Scholar
- Yang Z, Nielsen R, Goldman N, Pedersen AM: Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000, 155: 431-449.PubMed CentralPubMedGoogle Scholar
- Drummond AJ, Nicholls GK, Rodrigo AG, Solomon W: Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics. 2002, 161: 1307-1320.PubMed CentralPubMedGoogle Scholar
- Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007, 7: 214-10.1186/1471-2148-7-214.PubMed CentralView ArticlePubMedGoogle Scholar
- Drummond AJ, Rambaut A, Shapiro B, Pybus OG: Bayesian coalescent inference of past population dynamics from molecular sequences. Mol Biol Evol. 2005, 22: 1185-1192. 10.1093/molbev/msi103.View ArticlePubMedGoogle Scholar
- Shapiro B, Rambaut A, Drummond AJ: Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences. Mol Biol Evol. 2006, 23: 7-9. 10.1093/molbev/msj021.View ArticlePubMedGoogle Scholar
- Drummond AJ, Ho SY, Phillips MJ, Rambaut A: Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006, 4: e88-10.1371/journal.pbio.0040088.PubMed CentralView ArticlePubMedGoogle Scholar
- Ramsden C, Holmes EC, Charleston MA: Hantavirus evolution in relation to its rodent and insectivore hosts: no evidence for codivergence. Mol Biol Evol. 2009, 26: 143-153. 10.1093/molbev/msn234.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.