Varying influences of selection and demography in host-adapted populations of the tick-transmitted bacterium, Anaplasma phagocytophilum
© Aardema and von Loewenich; licensee BioMed Central. 2015
Received: 11 December 2014
Accepted: 18 March 2015
Published: 31 March 2015
The host range of a pathogenic bacterial strain likely influences its effective population size, which in turn affects the efficacy of selection. Transmission between competent hosts may occur more frequently for host generalists than for specialists. This could allow higher bacterial population densities to persist within an ecological community and increase the efficacy of selection in these populations. Conversely, specialist strains may be better adapted to their hosts and consequently achieve greater within-host population densities, with corresponding increases in selection efficacy. To assess these different hypotheses, we examined the effective population sizes of three strains of the bacterium Anaplasma phagocytophilum and categorized the varying roles of selection and demography on patterns of genetic diversity and divergence in these populations. A. phagocytophilum is a tick-transmitted, obligately intracellular pathogen. Strains of A. phagocytophilum display varying degrees of host specialization, making this a good species for exploring questions regarding host range, effective population size and selection efficacy.
We found that a roe deer specialist harbored the most genetic diversity of the three A. phagocytophilum strains and correspondingly had the largest effective population size. Another strain that is ecologically specialized on rodents and insectivores had the smallest effective population size. However, these mammalian hosts are distantly related evolutionarily. The third strain, a host generalist, was intermediate in its effective population size between the other two strains. Evolutionary constraint on non-synonymous sites was pervasive in all three strains, although some slightly deleterious mutations may also be segregating in these populations. We additionally found evidence of genome-wide selective sweeps in the generalist strain, whereas signals of repeated bottlenecks were detected in the strain with the smallest effective population size.
A. phagocytophilum is a diverse bacterial species that differs among distinct strains in its effective population size, as well as how genetic diversity and divergence have been influenced by selection and demographic changes. In this species, host specialization may facilitate increased population growth and allow more opportunities for selection to act. These results provide insights into how host range has influenced evolutionary patterns of strain divergence in an emerging zoonotic bacterium.
Obligately intracellular bacteria typically have smaller effective population sizes than free-living relatives due to constraints placed on them by the cellular space needed for growth, the number of cells capable of being infected and the availability of competent hosts. Among pathogenic bacteria, variation in host range may also influence effective population sizes as the diversity of competent host species available for infection influences both transmission dynamics and disease prevalence in the environment [2,3]. Population densities, connectivity and immune responses can also vary among the different species a pathogen is capable of infecting [4-6]. This may further impact effective population sizes and the levels of genetic diversity observed between strains.
Variation in host range may also influence intra-specific strain divergence. Small population sizes and limited transmission between hosts can result in strong genetic bottlenecks, which reduce diversity and create the potential for genetic divergence to arise between strains through drift [7-9]. Adaptation to different hosts could also be an important contributor to strain divergence. In addition to producing divergence at the target of selection, such adaptive evolution is often accompanied by a selective sweep, which may create additional genetic divergence between strains. The relative roles of stochastic evolutionary processes and directed, adaptive evolution have not been well categorized in pathogenic bacterial populations, but both are known to make important contributions to evolution in free-living bacteria and other clonal organisms [10-12].
In this study we compare two distinct hypotheses related to host range in A. phagocytophilum. The first hypothesis is that generalist strains will maintain larger effective population sizes than specialist strains. As generalists should have a higher density of hosts to colonize within an ecological community, transmission potential between hosts should be greater, leading to increased population sizes. Alternatively, specialists may achieve higher within-host population densities due to increased adaptation to their hosts. This could lead to overall higher effective population sizes relative to generalists that are more poorly adapted to any particular host species. In conjunction with these predicted differences between host specialists and generalists, we also postulated that the relative importance of selection and drift in producing evolutionary divergence between strains differs in relation to effective population size.
To test our hypotheses we examined genetic data from 265 individual A. phagocytophilum samples obtained from 17 European mammal species and Ixodes ricinus ticks. These samples cluster into the three distinct strains described above. With these data we examined the amount of standing genetic diversity harbored in each strain and estimated their effective population sizes. We also explored the contributions of selection and drift to the production of divergence between these populations.
Genetic diversity and effective population sizes
Following Huhn and colleagues, we will refer to the generalist strain of A. phagocytophilum as ‘cluster 1’ and the roe deer specialist strain as ‘cluster 2’ (Figure 1). We will refer to the population that infects voles and shrews and is transmitted by a distinct tick vector as the ‘cluster 3’ strain. Using population data from partial sequences of seven housekeeping genes, we estimated two measures of genetic diversity for each strain. The first measure was π, which is the mean pairwise genetic difference between samples [18,19]. The second was θW, which is a measure of genetic diversity based on the number of segregating mutations in a sample.
Table 1. The mean number of segregating sites and estimates of synonymous and non-synonymous genetic diversity (π and θW) for each cluster across the seven genetic regions. [Table body not recovered.]
To estimate effective population size we utilized the formula Ne = θ/(2μ), where Ne is the effective population size, θ is a measure of per locus diversity and μ is the per locus mutation rate [20,21]. A mutation rate is not currently known for A. phagocytophilum. Therefore, to estimate effective population sizes we utilized an average of previously reported mutation rates for other bacterial species (~0.003 per genome). The A. phagocytophilum genome is approximately 1.4 Mb in length. Therefore we estimated the per locus mutation rate as ~2 × 10⁻⁹, with the assumption that it is the same for all three strains. We calculated effective population size using both synonymous measures of genetic diversity (θW and π). The effective population size of the cluster 1 strain was estimated to be between 3.28 × 10⁶ and 4.90 × 10⁶, the cluster 2 strain to be between 8.85 × 10⁶ and 9.48 × 10⁶, and the cluster 3 strain to be between 1.15 × 10⁶ and 1.40 × 10⁶.
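The arithmetic behind these estimates can be sketched in a few lines of Python. This is only an illustration of the formula Ne = θ/(2μ): the function name is hypothetical, and the diversity value passed in at the end is an example input rather than one of the per-strain estimates used for the reported ranges.

```python
def effective_population_size(theta: float, mu: float) -> float:
    """Return Ne = theta / (2 * mu), as in the formula given in the text."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return theta / (2.0 * mu)


# Mutation rate derived as described in the text: an average per-genome
# rate divided by the approximate genome length in base pairs.
GENOME_MUTATION_RATE = 0.003   # per genome (value quoted in the text)
GENOME_LENGTH_BP = 1.4e6       # approximately 1.4 Mb
mu_derived = GENOME_MUTATION_RATE / GENOME_LENGTH_BP  # about 2.1e-9

# Example input only (assumption): a synonymous diversity of 0.0130
# combined with the rounded rate of 2e-9 used in the text.
example_ne = effective_population_size(0.0130, 2e-9)
print(f"derived mu = {mu_derived:.2e}, example Ne = {example_ne:.3g}")
```

Because Ne scales linearly with θ and inversely with μ, any error in the assumed mutation rate shifts all three strain estimates by the same factor and leaves their ranking unchanged.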
It is possible that variation in our sampling efforts could affect estimates of genetic diversity. However, we saw the overall highest mean diversity levels in the cluster 2 strain, which had the smallest sample size (n = 18). Nonetheless, we wanted to determine if the much larger sample size of cluster 1 could have influenced our estimates of genetic diversity in this strain. To test this we randomly drew 20 samples from the full data set with replacement and calculated the four diversity statistics for this subset of the data. We did this 200 times to generate bootstrapped confidence intervals of our estimates. Based on this analysis, the mean synonymous π for cluster 1 was 0.0130 (±0.0075), mean synonymous θW was 0.0163 (±0.0067), mean non-synonymous π was 0.0008 (±0.0005), and mean non-synonymous θW was 0.0013 (±0.0008). None of these results differed significantly from the same diversity estimate based on the full dataset (data not shown).
We quantified the extent of inter-locus linkage disequilibrium (LD) to assess how influential linkage may be on patterns of diversity among these strains. To do this we calculated a variant of the index of association called rD [24,28], using all genetic regions for each cluster. This statistic measures whether two individuals being similar at one locus makes them more likely to be similar at another locus. It ranges from 0 to 1, and a value significantly different from 0 indicates that recombination has been rare and loci may be in LD. For the full datasets, rD was significantly different from 0 in clusters 1 and 3 (Cluster 1: n = 227, rD = 0.126, p < 0.001; Cluster 3: n = 20, rD = 0.256, p = 0.001). Cluster 2 did not have an rD value statistically different from 0 (n = 18, rD = 0.078, p = 0.528). These results suggest that the genetic regions used in our analyses from clusters 1 and 3 are not independent and that the different influences of selection and demography may be obscured. However, as Maynard Smith and colleagues pointed out, within a bacterial population it is common for one or a few genotypes to occasionally arise and rapidly become widespread. Depending on the speed at which this occurs, there may not be sufficient time for recombination to break up linkage groups, and population samples may be composed of representatives from a small number of clones. This may be especially true when sampling efforts are uneven across geographic regions or hosts. To mitigate this problem, it was suggested that identical samples be collapsed into a single representative sample. When we reduced the dataset for clusters 1 and 3 to only unique samples, we found that rD was no longer significantly different from 0 for cluster 3 (n = 10, rD = 0.176, p = 0.170). However, cluster 1 still showed evidence of LD (n = 10, rD = 0.021, p < 0.001).
Table 2. Average estimates for tests of selection and additional neutrality tests based on the observed number of segregating sites. Columns recoverable from the original layout include dN/dS, Fu & Li’s D and Fay & Wu’s H. [Table body not recovered.]
We used the McDonald-Kreitman test (MK test) to further look for signatures of selection in each locus within the three clusters. Specifically, we compared the observed levels of polymorphism and divergence at synonymous and non-synonymous sites to look for deviations from neutral expectations in any loci. Such deviations could be the result of adaptive processes, or they may indicate the presence of slightly deleterious mutations segregating in the population. Only three genes among any of the strains exhibited significant deviations from neutrality (Additional file 1: Table S2; Cluster 1 (atpA): Fisher’s exact test, p < 0.001; Cluster 2 (pheS): Fisher’s exact test, p = 0.026; Cluster 3 (fumC): Fisher’s exact test, p = 0.008). To determine the direction of these deviations we used a variant of the neutrality index called the direction of selection test (DoS), which corrects for potential biases when the amount of data is small [32,33]. A positive DoS suggests that positive selection has acted on a region, whereas a negative DoS indicates that slightly deleterious alleles may be segregating. One gene from cluster 2 had both a positive DoS value and significantly deviated from neutrality based on the MK test (pheS, DoS = 0.23). This indicates that positive selection has likely acted to produce divergence in this gene. The other loci that had significant deviations from neutrality had negative DoS values (Cluster 1 (atpA) DoS = -0.54; Cluster 3 (fumC) DoS = -0.68). Negative DoS values indicate an excess of non-synonymous polymorphism, which can occur when slightly deleterious mutations are circulating in a population.
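As a rough illustration of how an MK table is evaluated, the sketch below computes a two-tailed Fisher's exact p-value and the DoS statistic, DoS = Dn/(Dn + Ds) − Pn/(Pn + Ps), from four counts. The function names and the counts are hypothetical; they are not the values in Additional file 1: Table S2, and the analysis in this study was run in R rather than with this code.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def direction_of_selection(dn: int, ds: int, pn: int, ps: int) -> float:
    """DoS: positive values suggest adaptive divergence, negative values
    suggest an excess of non-synonymous polymorphism."""
    if dn + ds == 0 or pn + ps == 0:
        raise ValueError("need at least one fixed difference and one polymorphism")
    return dn / (dn + ds) - pn / (pn + ps)


# Hypothetical counts for one locus (illustration only):
# fixed differences (Dn, Ds) and polymorphisms (Pn, Ps).
Dn, Ds, Pn, Ps = 2, 18, 8, 12
p_value = fisher_exact_two_tailed(Dn, Ds, Pn, Ps)
dos = direction_of_selection(Dn, Ds, Pn, Ps)
print(f"MK test p = {p_value:.3f}, DoS = {dos:.2f}")
```

Reading the two numbers together mirrors the logic in the text: the Fisher's exact test indicates whether a locus departs from neutrality, and the sign of DoS indicates in which direction.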
While we found evidence for both positive selection and segregating deleterious alleles, purifying selection appears to be the primary selective force acting in all three strains. However, differences in average segregating site frequency between strains suggest that selection has not been the only factor influencing genetic diversity levels. To investigate the potential effects of demographic changes within these populations, we examined three complementary population statistics that compare observed segregating site frequencies to expectations under neutrality. These were Tajima’s D, Fu & Li’s D and Fay & Wu’s H. Combining multiple statistics in this fashion gives a clearer picture of the processes acting on a population than any one test alone [37,38].
Tajima’s D compares the number of segregating sites in the population sample to genetic diversity (π). Under neutrality these two numbers should be very similar and D will be approximately 0. When D is greater than 0 it indicates that there is a high level of intermediate frequency polymorphism relative to neutral expectations. Conversely, if D is less than 0 it indicates an excess of low frequency polymorphism relative to neutral expectations. The average value of D was negative for both clusters 1 and 2, but positive for cluster 3 (Table 2). However, these deviations from 0 were not significant for any strain (data not shown). For both clusters 1 and 2, two of the seven loci were significantly negative, suggesting that purifying selection may have acted on these genetic regions (Additional file 1: Table S3).
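For readers who want to see how the statistic is assembled, the following is a minimal sketch of the standard Tajima's D calculation from a sample size, a number of segregating sites and a mean number of pairwise differences. The function name and the example inputs are illustrative assumptions; the values reported in this study were produced with DnaSP, not with this code.

```python
from math import sqrt


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size n, segregating sites S and the mean
    number of pairwise differences (per locus, not per site)."""
    if n < 4:
        raise ValueError("the variance term is degenerate for n < 4")
    if seg_sites < 1:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two diversity estimators, scaled by its
    # expected standard deviation under the neutral model.
    numerator = mean_pairwise_diff - seg_sites / a1
    return numerator / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Illustrative inputs (assumption): 10 sequences, 16 segregating sites.
print(tajimas_d(10, 16, 3.888889))   # pairwise diversity below S/a1: negative D
print(tajimas_d(10, 16, 8.0))        # pairwise diversity above S/a1: positive D
```

A negative value arises when the pairwise diversity falls below the value implied by the number of segregating sites, which is the excess of low-frequency polymorphism described in the text.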
Fu & Li’s D is similar to Tajima’s D except that it specifically compares the number of mutations observed in just one population member (‘singletons’) to the expected number under neutrality. This makes the test more sensitive to selective sweeps, which are predicted to be a powerful force in bacterial evolution . Fu & Li’s D can also be useful for detecting bottlenecks . The average value of Fu & Li’s D was not significant for any of the three strains (data not shown). Fewer individual loci had significant deviations from the neutral expectation than were observed for Tajima’s D as well, and only one locus in cluster 1 was significant for Fu & Li’s D, but not Tajima’s D (Additional file 1: Table S3).
The third test, Fay & Wu’s H, compares the number of high-frequency derived mutations to those at intermediate frequency. This test was designed specifically to detect a selective sweep, as linked sites should rise in frequency around the target of positive selection, increasing derived allele frequencies. H is less sensitive to population expansion than the other two tests . Both clusters 1 and 3 had an average negative H value significantly different from 0, indicating an excess of high frequency, derived alleles. For cluster 1, four of the seven loci had a significantly negative H value (Additional file 1: Table S3). Two loci out of seven were significantly negative for cluster 2 and two out of six were significant for cluster 3. Overall, a large proportion of H values were negative throughout the three strains, indicating a prevalence of high frequency, derived segregating alleles.
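The contrast that H draws between intermediate- and high-frequency derived alleles can be made concrete with a short sketch. It computes the unnormalized H = θπ − θH from the derived-allele count at each segregating site; the function name and the counts are hypothetical, and it assumes derived states have already been polarized with an outgroup. Software such as DnaSP may also report a normalized form, which this sketch does not attempt to reproduce.

```python
def fay_wu_h(n: int, derived_counts: list[int]) -> float:
    """Unnormalized Fay & Wu's H for n sequences.

    derived_counts holds, for each segregating site, the number of
    sequences carrying the derived allele (between 1 and n - 1).
    """
    if n < 2:
        raise ValueError("need at least two sequences")
    if any(i < 1 or i > n - 1 for i in derived_counts):
        raise ValueError("derived counts must lie between 1 and n - 1")
    denom = n * (n - 1)
    # theta_pi weights sites by heterozygosity i * (n - i).
    theta_pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    # theta_H weights sites by the squared derived-allele count.
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return theta_pi - theta_h


# Hypothetical spectra for a sample of 10 sequences (illustration only).
mostly_high_frequency = [9, 9, 8]   # derived alleles near fixation
mostly_low_frequency = [1, 1, 2]    # derived alleles rare
print(fay_wu_h(10, mostly_high_frequency))  # negative
print(fay_wu_h(10, mostly_low_frequency))   # positive
```

Sites where the derived allele is close to fixation contribute heavily to θH and little to θπ, so a sweep that drags linked derived alleles to high frequency pushes H below zero.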
A. phagocytophilum in Europe circulates in multiple, discrete enzootic cycles and consequently distinct populations of the bacterium have been identified [13-15]. One strain infects a wide array of mammalian hosts including humans, livestock and other domestic animals. In contrast, a second strain specializes on roe deer. A third strain infects rodents and insectivores, and differs in the tick vector that facilitates transmission between hosts. Among host-dependent bacteria such as A. phagocytophilum, transmission opportunities between competent hosts may occur less frequently for host specialists. This may act to limit their potential for population growth. A lower density of prospective hosts in the community may also limit effective population sizes. Conversely, specialization may facilitate adaptation to competent hosts and allow greater within-host population densities. This may support larger effective population sizes in host specialists. In accordance with this second hypothesis, the roe deer specialist (cluster 2) had the largest estimated effective population size of the three strains. Specific adaptations for colonizing roe deer may allow this strain to reach higher within-host population densities compared to generalist strains. Additionally, roe deer represent a very large host pool that likely increases A. phagocytophilum rates of encounter and allows high densities of this specialized strain to be maintained within ecological communities [4,39]. Higher A. phagocytophilum prevalence rates in roe deer compared to other hosts suggest that infection may be chronic in these animals or that frequent reinfection may occur. In either case, higher effective population sizes would be achieved.
The cluster 1 strain, which is a host generalist, had a smaller estimated effective population size than the roe deer specialist. It is likely that this strain of A. phagocytophilum is not as well adapted to any particular host. Therefore, it achieves lower within-host population densities than the specialist strain. Lower densities could reduce the rate of transmission between hosts. While increasing the number of species a pathogen can utilize produces more infected individuals throughout an ecological community, for any particular species the proportion of individuals infected will be smaller . This results in less frequent transmission events and fewer opportunities for adaptive evolution to occur within a population.
The cluster 3 strain had a much smaller effective population size than either the cluster 1 or 2 strains. This strain’s primary hosts are insectivores and rodents, which have the highest population densities of any mammal in Europe . However, the cluster 3 strain is predominately transmitted by the nest-living tick, I. trianguliceps, resulting in a distinct zoonotic cycle with minimal overlap in either host or vector with the other A. phagocytophilum strains [13,41]. Furthermore, evidence suggests that both voles and shrews may be able to clear A. phagocytophilum infection as indicated by its absence in winter months when tick vectors are dormant [13,42]. By contrast, roe deer are found to harbor infections year round . It is therefore probable that vector population dynamics play a larger role in limiting the effective population size of this strain than in other strains . Challenges related to host adaptation may also be a factor as insectivores and rodents are highly diverged evolutionarily .
Both demographic events and selection have acted to produce the effective population sizes of these strains. In all three strains, minor alleles at non-synonymous sites were on average segregating at a lower frequency than synonymous minor alleles. This observation suggests that purifying selection has been a strong force acting on non-synonymous variation in these populations [34,43-45]. A predominance of purifying selection in housekeeping genes is typical for pathogenic bacterial species [46,47]. However, we also found evidence for differences in demographic and selection history between the three populations in this study.
The cluster 1 strain harbored a large number of low-frequency variants in each of the genetic regions analyzed. An overall excess of low frequency alleles may be an indication that this population has expanded in size [34,48-50]. It could also indicate that selective sweeps have occurred in this strain, followed by new mutations entering the population. We additionally found an excess of high-frequency derived alleles as indicated by negative Fay and Wu’s H values for all genes. A negative H typically occurs only in instances of a selective sweep and is a primary way to distinguish sweeps from population expansion. Finally, the cluster 1 strain had a significantly high level of LD between loci, even when identical clones were excluded from the analysis. Selective sweeps are predicted to increase LD, whereas population expansion decreases LD [49,51]. Based on these results, we conclude that selective sweeps in this strain have been important contributors to genetic divergence between this and other A. phagocytophilum populations. These sweeps would have allowed neutral and mildly deleterious alleles to rise in frequency and fix in the genome through genetic hitchhiking.
Only the frequency of non-synonymous mutations differed from the neutral expectation in the cluster 2 strain. We also found that cluster 2 had the most variable diversity levels between genetic regions. Finally, there was no evidence of LD between loci in this strain. Together, these results indicate that neither demographic changes, nor genome-wide selection events, have likely affected patterns of diversity in this strain. Rather, it appears that selection acting locally in the genome has had the greatest influence on strain-level genetic diversity. Several of the loci in this study appear to have been influenced by local selective sweeps as evidenced by negative values for all three demographic statistics. Another locus had the opposite trend, with positive values for all three neutrality tests, indicating that most segregating sites were at intermediate frequencies. This can occur if balancing selection is acting on a region or else there is unrecognized population structure. Interestingly, this was the only locus that was both significant for the MK test and that had a positive DoS. These observations strengthen the hypothesis that balancing selection has acted on this locus or in a region closely linked to it. Ultimately, it appears that a relatively large effective population size and frequent recombination have allowed selection to operate locally in the genome of this strain without affecting genetic diversity more broadly. It also means that stochastic changes in population size are unlikely to have had a major influence on the establishment of divergence between this and other strains. Rather, mutation and selection are likely to be the primary drivers of divergence in this A. phagocytophilum population.
The cluster 3 strain had a higher than expected average allele frequency for synonymous segregating sites, suggesting that there is a deficiency of low-frequency segregating alleles in this population. Such a deficiency can arise due to genetic bottlenecks when low-frequency alleles are disproportionately lost as the size of the population decreases. A high proportion of positive Tajima’s D and Fu & Li’s D values among cluster 3 loci also supports the conclusion that this population has fewer low frequency segregating alleles than expected, and that it likely experienced one or more bottlenecks. Finally, we see extensive variance in diversity levels between loci in this strain. Increased variance between genetic regions is expected after a population reduction. Hidden population structure and balancing selection can also cause genetic patterns similar to a bottleneck. However, both of these factors should increase overall genetic diversity, whereas the cluster 3 strain was found to have the lowest amount of genetic diversity among the three strains. If one or more bottlenecks have occurred in this population, it is likely that many segregating alleles were fixed by genetic drift. This may have produced extensive divergence between this and other A. phagocytophilum strains. Of the three strains examined in this study, the cluster 3 strain is the most divergent (Figure 1, Table 2). Recent bottlenecks may also have contributed to the smaller effective population size we observed in this strain.
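Table 3. Pairwise divergence estimates for the three clusters based on the concatenated dataset of all seven genetic regions. Columns recoverable from the original layout include Dxy and Da; rows are the comparisons 1 vs 2, 1 vs 3 and 2 vs 3. [Table body not recovered.]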
In addition to host range, other factors have undoubtedly influenced genetic diversity and divergence in these populations as well. For example, in A. phagocytophilum regular recombination of p44 surface genes and functional pseudogenes allows populations within a vertebrate host to evade immune responses [27,55]. The p44 expression cassettes and pseudogenes can be found throughout the genome, although there are two regions where the majority of these sites cluster (Figure 3). Therefore, selection from host immune defenses could influence both host adaptation and recombination frequency, which could potentially affect patterns of genetic diversity throughout the genomes of these populations. Other host-specific characteristics could also play a role in limiting the effective population sizes of these strains, as could the population dynamics of transmission vectors. Vector biology may be particularly important in limiting the effective population size of the cluster 3 strain. Unrecognized population structure in these strains may also have contributed to observed patterns of genetic diversity . Additional work will be required to determine what factors have been the most important in influencing genetic diversity and divergence in A. phagocytophilum.
Our analyses reveal that evolutionary processes acting on host-adapted A. phagocytophilum strains have been influenced by their effective population size, which in turn has likely been impacted by the ecology and population densities of competent hosts. It remains to be determined what factors contributed to the initial production of host range differences between these strains, but both vector and host population dynamics have likely played important roles . Specialization alone has not restricted population growth in A. phagocytophilum, but rather may have facilitated relative increases in effective population size. Frequent homologous recombination in some strains, possibly in conjunction with evolving responses to immune defense, has likely reduced the impact of genetic linkage between genome regions and has allowed adaptive processes to occur in these bacteria without impacting genome-wide genetic diversity. However, in other cases bottlenecks have likely reduced genetic diversity and may have restricted adaptation rates. Such population reductions may also have allowed for drift to contribute to divergence between strains.
Pathogens with a broad host range have the greatest probability of being transmitted to humans . This appears to have been the case for A. phagocytophilum where it is the generalist strain that is found to infect people in Europe . Overall, better knowledge of how the life history characteristics of natural hosts influence bacterial population dynamics will provide insights into the maintenance of genetic diversity in emerging zoonotic bacteria. Understanding this diversity will be important for predicting the potential of such bacteria to emerge as prospective zoonotic agents as they evolve in response to ever changing host population dynamics.
For this study, we utilized partial sequences of seven A. phagocytophilum genetic regions totaling 2,877 base pairs. These sequences were isolated from 17 different host mammals and I. ricinus ticks (Additional file 1: Tables S4, S5). Using maximum-likelihood phylogenetic analysis, Huhn and colleagues showed that these samples could be clustered into one of three genetically distinct groups. These likely represent unique populations with independent transmission cycles. We followed these same cluster classifications for the samples in this analysis. From the original dataset, we removed all but one set of sequences in cases where there were multiple temporal samples from a single host. We also removed sequences that were isolated in the United States and all sequences from any sample harboring polymorphic regions in any of the seven loci, as this indicates the host may have been infected with multiple clones. This reduced dataset left us with 227 samples in cluster 1, 18 samples in cluster 2, and 20 samples in cluster 3. To further assess the extent of divergence among these three populations, we calculated three pairwise measures of genetic divergence between the clusters: the fixation index (Fst, [59,60]), the average number of nucleotide substitutions per site between each cluster (Dxy) and the net number of nucleotide differences per site between each cluster (Da). These were calculated using concatenated datasets across all sites with the program DnaSP. Our results confirmed previous findings that these three strains are highly diverged from one another (Table 3).
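The two distance-based divergence measures can be illustrated with a small sketch. It assumes aligned, gap-free sequences of equal length and uses the usual definitions: Dxy is the mean per-site difference over all between-cluster pairs, and Da subtracts the mean of the two within-cluster diversities. The function names and toy sequences are hypothetical, and no multiple-hit correction is applied, so the output need not match what DnaSP reports.

```python
from itertools import combinations, product


def per_site_difference(a: str, b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def within_diversity(pop: list[str]) -> float:
    """Mean per-site difference over all pairs within one cluster."""
    pairs = list(combinations(pop, 2))
    return sum(per_site_difference(a, b) for a, b in pairs) / len(pairs)


def dxy(pop_x: list[str], pop_y: list[str]) -> float:
    """Mean per-site difference over all between-cluster pairs."""
    pairs = list(product(pop_x, pop_y))
    return sum(per_site_difference(a, b) for a, b in pairs) / len(pairs)


def da(pop_x: list[str], pop_y: list[str]) -> float:
    """Net divergence: Dxy minus the average within-cluster diversity."""
    return dxy(pop_x, pop_y) - (within_diversity(pop_x) + within_diversity(pop_y)) / 2


# Toy alignments (illustration only, not study data).
cluster_a = ["AAAA", "AAAT"]
cluster_b = ["TTAA", "TTAT"]
print(dxy(cluster_a, cluster_b), da(cluster_a, cluster_b))
```

Da is the more conservative of the two because it removes the portion of between-cluster difference that is already present as polymorphism within each cluster.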
For each strain, we calculated two measures of genetic diversity, the average pairwise nucleotide diversity per site (π, [18,19]) and Watterson’s θ (θW), which is based on the number of segregating sites. For both measurements, synonymous and non-synonymous diversity was calculated separately using the program Polymorphorama. For π, calculations were based on the number of mutations when more than two alleles were segregating at a site. To assess if any cluster was statistically different from another for any diversity measure and site class, we used a paired t-test as implemented in the program R.
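A simplified view of the two diversity estimators is sketched below for a set of aligned sequences. Unlike the analysis described here, it does not separate synonymous from non-synonymous sites and it counts a site with three or more alleles as a single segregating site; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def segregating_sites(seqs: list[str]) -> int:
    """Number of alignment columns with more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def nucleotide_diversity(seqs: list[str]) -> float:
    """pi: mean number of pairwise differences, per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs) / length


def wattersons_theta(seqs: list[str]) -> float:
    """theta_W: segregating sites scaled by the harmonic number a1, per site."""
    n, length = len(seqs), len(seqs[0])
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / length


# Toy alignment (illustration only, not study data).
alignment = ["AAAA", "AAAT", "AATT", "AAAA"]
print(segregating_sites(alignment))
print(nucleotide_diversity(alignment))
print(wattersons_theta(alignment))
```

The two estimators respond differently to allele frequencies: θW counts every segregating site equally, whereas π gives more weight to sites at intermediate frequency, which is why comparing them is informative about selection and demography.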
Additionally, because of its much larger sample size, we examined whether the cluster 1 samples would exhibit similar levels of genetic diversity to the full data set when a smaller set of samples was examined. To do this, we randomly selected 20 of the samples from the full dataset (with replacement) and again calculated the same diversity statistics using Polymorphorama . We repeated this 200 times to determine confidence intervals.
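The resampling procedure can be sketched generically as follows. The function draws subsamples of 20 with replacement, 200 times, and summarizes any statistic supplied by the caller. The function name, the seed argument and the toy statistic are assumptions made for illustration, and reporting the spread as a standard deviation is also an assumption, since the text does not state how the ± values were derived.

```python
import random
from statistics import mean, stdev


def bootstrap_subsample(samples, statistic, subsample_size=20, replicates=200, seed=None):
    """Resample with replacement and summarize a statistic.

    Returns (mean, standard deviation) of the statistic across replicates.
    """
    rng = random.Random(seed)
    estimates = [
        statistic(rng.choices(samples, k=subsample_size))
        for _ in range(replicates)
    ]
    return mean(estimates), stdev(estimates)


def fraction_distinct(subsample):
    """Toy statistic (hypothetical): share of distinct haplotypes in a draw."""
    return len(set(subsample)) / len(subsample)


# Hypothetical haplotype labels standing in for the larger cluster.
haplotypes = ["h1"] * 150 + ["h2"] * 50 + ["h3"] * 27
avg, spread = bootstrap_subsample(haplotypes, fraction_distinct, seed=1)
print(f"mean = {avg:.3f}, sd = {spread:.3f}")
```

In practice the statistic passed in would be one of the four diversity measures, computed on the sequences belonging to each resampled set.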
To examine inter-locus recombination, for each strain we calculated rD as implemented in the program MultiLocus (ver. 1.2.2, ). Statistical significance was determined by comparing 1,000 randomized datasets with a null hypothesis of complete linkage equilibrium between loci (rD = 0). rD was calculated for all three clusters using all samples. rD was also calculated for clusters 1 and 3 using reduced datasets where all but one representative of identical clones was removed.
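To make the statistic and its randomization test more concrete, here is a minimal sketch for haploid multilocus profiles. It uses the commonly described standardization, in which the excess variance of pairwise distances is divided by twice the sum of the square roots of the products of per-locus variances, and it shuffles alleles within each locus to build the null distribution. The function names, the one-sided p-value definition and the toy profiles are assumptions; MultiLocus may differ in details.

```python
import random
from itertools import combinations
from math import sqrt


def _variance(values):
    """Population variance (the same denominator is used throughout)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def standardized_ia(profiles):
    """Standardized index of association for haploid allele profiles.

    profiles: list of tuples, one allele label per locus per isolate.
    """
    n_loci = len(profiles[0])
    pairs = list(combinations(profiles, 2))
    # Per-locus distance for every pair of isolates: 1 if alleles differ.
    per_locus = [[int(a[j] != b[j]) for a, b in pairs] for j in range(n_loci)]
    totals = [sum(column) for column in zip(*per_locus)]
    locus_vars = [_variance(d) for d in per_locus]
    v_obs, v_exp = _variance(totals), sum(locus_vars)
    denom = 2 * sum(sqrt(vj * vk) for vj, vk in combinations(locus_vars, 2))
    if denom == 0:
        raise ValueError("at least two loci must be polymorphic")
    return (v_obs - v_exp) / denom


def permutation_p_value(profiles, replicates=1000, seed=None):
    """Share of randomized datasets with a value at least as large as observed."""
    rng = random.Random(seed)
    observed = standardized_ia(profiles)
    columns = [list(col) for col in zip(*profiles)]
    hits = 0
    for _ in range(replicates):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if standardized_ia(list(zip(*shuffled))) >= observed - 1e-12:
            hits += 1
    return hits / replicates


# Toy profiles with two perfectly associated loci (illustration only).
linked = [("a", "a"), ("a", "a"), ("b", "b"), ("b", "b")]
print(standardized_ia(linked))
```

Shuffling within loci keeps allele frequencies at each locus fixed while breaking associations between loci, which is the null hypothesis of linkage equilibrium described in the text.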
For each population we determined the average frequency of segregating alleles for both synonymous and non-synonymous sites. Allele frequencies were determined using the program Polymorphorama . The expected neutral mean frequency for segregating alleles was calculated based on the sample size and number of observed segregating sites .
For all tests of selection, non-synonymous sites (selected class) were compared to synonymous sites (neutral class). We first compared the number of non-synonymous changes per non-synonymous site to synonymous changes per synonymous site (dN/dS) for each locus. Ratios greater than one suggest that positive selection has acted to generate divergence between populations. Ratios less than one suggest that purifying selection has been the more common selective force, eliminating disadvantageous amino acid substitutions as they arose, but allowing for synonymous changes between populations to fix. To count the number of synonymous and non-synonymous sites as well as divergences, we used the program Polymorphorama. We also performed the McDonald-Kreitman test in each cluster for each locus to examine evidence of positive selection. For each locus in each strain the population data were compared to an outgroup sequence. For clusters 1 and 2 we used a consensus sequence from the cluster 3 data, and for cluster 3 we used a consensus sequence from the cluster 1 data. Statistical significance was determined using a two-tailed Fisher’s exact test as implemented in R. A variant of the neutrality index, the direction of selection test (DoS), was used to determine the direction of deviation from neutrality in each locus [32,33].
We used DnaSP to calculate each of our demographic statistics using all sites at each locus. These statistics were Tajima’s D, Fu & Li’s D and Fay & Wu’s H. Fu & Li’s D and Fay & Wu’s H require an outgroup to distinguish ancestral from derived alleles. For clusters 1 and 2, we used a consensus sequence from the cluster 3 data to polarize segregating sites. For cluster 3 we used a consensus sequence from the cluster 1 data. Statistical significance was determined for all demographic estimates by simulating 10,000 replicates of the standard neutral model, conditioned on the number of segregating sites and assuming no recombination. For Tajima’s D and Fay & Wu’s H these simulations were carried out in the program ms. For Fu & Li’s D, simulations were carried out in DnaSP.
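As an illustration of the first of these statistics, the sketch below computes Tajima's D from an alignment using the constants defined by Tajima (1989). It is not the DnaSP implementation: it assumes equal-length sequences without gaps or ambiguity codes, and it does not reproduce the coalescent simulations used to assess significance. The function name `tajimas_d` is hypothetical.

```python
import itertools
import math


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(sequences)
    length = len(sequences[0])

    # S: number of alignment columns with more than one nucleotide
    seg_sites = sum(1 for pos in range(length)
                    if len({seq[pos] for seq in sequences}) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of pairwise differences
    pairs = list(itertools.combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation
    theta_w = seg_sites / a1
    return (pi - theta_w) / math.sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy alignments (illustration only)
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])     # singletons only
d_common = tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"])   # intermediate frequencies
```

An excess of rare variants gives negative D, as expected after a selective sweep or population expansion. An excess of intermediate-frequency variants gives positive D, as can follow a recent bottleneck or balancing selection.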
The data set supporting the results of this article is available from the Anaplasma phagocytophilum MLST database (http://pubmlst.org/aphagocytophilum/) and from GenBank (accession numbers KF242733 through KF245413; see Additional file 1: Table S4 for more information on individual samples).
Bridgett vonHoldt and two anonymous reviewers provided helpful comments on earlier versions of this work. This research was supported in part by a grant to MLA through the Health Grand Challenge, Center for Health and Wellbeing, Princeton University.
- Toft C, Andersson SGE. Evolutionary microbial genomics: insights into bacterial host adaptation. Nat Rev Genet. 2010;11:465–75.
- Roche B, Dobson AP, Guégan J-F, Rohani P. Linking community and disease ecology: the impact of biodiversity on pathogen transmission. Phil Trans R Soc B. 2012;367:2807–13.
- Leggett HC, Buckling A, Long GH, Boots M. Generalism and the evolution of parasite virulence. Trends Ecol Evol. 2013;28:592–6.
- Stuen S, Granquist EG, Silaghi C. Anaplasma phagocytophilum - a widespread multi-host pathogen with highly adaptive strategies. Front Cell Infect Microbiol. 2013;3:1–33.
- Lajeunesse MJ, Forbes MR. Host range and local parasite adaptation. Proc R Soc Lond B. 2002;269:703–10.
- Pugliese A. The role of host population heterogeneity in the evolution of virulence. J Biol Dyn. 2011;5:104–19.
- Woolfit M, Bromham L. Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol Biol Evol. 2003;20:1545–55.
- Leffler EM, Bullaughey K, Matute DR, Meyer WK, Ségurel L, Venkat A, et al. Revisiting an old riddle: what determines genetic diversity levels within species? PLoS Biol. 2012;10:e1001388.
- Rego ROM, Bestor A, Štefka J, Rosa PA. Population bottlenecks during the infectious cycle of the Lyme disease spirochete Borrelia burgdorferi. PLoS One. 2014;9:e101009.
- Shapiro BJ, Friedman J, Cordero OX, Preheim SP, Timberlake SC, Szabó G, et al. Population genomics of early events in the ecological differentiation of bacteria. Science. 2012;336:48–51.
- Lenski RE, Travisano M. Dynamics of adaptation and diversification: a 10,000-generation experiment with bacterial populations. Proc Natl Acad Sci U S A. 1994;91:6808–14.
- Lang GI, Rice DP, Hickman MJ, Sodergren E, Weinstock GM, Botstein D, et al. Pervasive genetic hitchhiking and clonal interference in forty evolving yeast populations. Nature. 2013;500:571–6.
- Bown KJ, Lambin X, Ogden NH, Begon M, Telford G, Woldehiwet Z, et al. Delineating Anaplasma phagocytophilum ecotypes in coexisting, discrete enzootic cycles. Emerg Infect Dis. 2009;15:1948–54.
- Huhn C, Winter C, Wolfsperger T, Wüppenhorst N, Strašek Smrdel K, Skuballa J, et al. Analysis of the population structure of Anaplasma phagocytophilum using multilocus sequence typing. PLoS One. 2014;9:e93725.
- Van Der Giessen J, Takken W, Van Wieren SE, Takumi K, Sprong H. Circulation of four Anaplasma phagocytophilum ecotypes in Europe. Parasit Vectors. 2014;7:365.
- Meredith RW, Janečka JE, Gatesy J, Ryder OA, Fisher CA, Teeling EC, et al. Impacts of the Cretaceous terrestrial revolution and KPg extinction on mammal diversity. Science. 2011;334:521–4.
- Ishiguro H, Ichihara Y, Namikawa T, Nagatsu T, Kurosawa Y. Nucleotide sequence of Suncus murinus immunoglobulin μ gene and comparison with mouse and human μ genes. FEBS Lett. 1989;247:317–22.
- Nei M, Li WH. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci. 1979;76:5269–73.
- Nei M. Molecular evolutionary genetics. New York: Columbia Univ. Press; 1987.
- Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol. 1975;7:256–76.
- Kimura M, Crow JF. The number of alleles that can be maintained in a finite population. Genetics. 1964;49:725–38.
- Drake JW, Charlesworth B, Charlesworth D, Crow JF. Rates of spontaneous mutation. Genetics. 1998;148:1667–86.
- Dunning Hotopp JC, Lin M, Madupu R, Crabtree J, Angiuoli SV, Eisen J, et al. Comparative genomics of emerging human ehrlichiosis agents. PLoS Genet. 2006;2:e21.
- Maynard Smith J, Smith NH, O’Rourke M, Spratt BG. How clonal are bacteria? Proc Natl Acad Sci U S A. 1993;90:4384–8.
- Feil EJ, Holmes EC, Bessen DE, Chan M-S, Day NPJ, Enright MC, et al. Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc Natl Acad Sci. 2001;98:182–7.
- Vos M, Didelot X. A comparison of homologous recombination rates in bacteria and archaea. ISME J. 2009;3:199–208.
- Rikihisa Y. Anaplasma phagocytophilum and Ehrlichia chaffeensis: subversive manipulators of host cells. Nat Rev Microbiol. 2010;8:328–39.
- Agapow P-M, Burt A. Indices of multilocus linkage disequilibrium. Mol Ecol Notes. 2001;1:101–2.
- Miyata T, Yasunaga T. Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its applications. J Mol Evol. 1980;16:23–36.
- McDonald JH, Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991;351:652–4.
- Ohta T. Amino acid substitution at the Adh locus of Drosophila is facilitated by small population size. Proc Natl Acad Sci U S A. 1993;90:4549–51.
- Rand DM, Kann A. Polymorphisms in mitochondrial DNA: contrasts among genes from Drosophila, mice, and humans. Mol Biol Evol. 1996;13:735–48.
- Stoletzki N, Eyre-Walker A. Estimation of the neutrality index. Mol Biol Evol. 2011;28:63–70.
- Tajima F. Statistical method for testing the neutral mutation hypothesis of DNA polymorphism. Genetics. 1989;123:585–95.
- Fu Y-X, Li WH. Statistical tests of neutrality of mutations. Genetics. 1993;133:693–709.
- Fay JC, Wu CI. Hitchhiking under positive Darwinian selection. Genetics. 2000;155:1405–13.
- Depaulis F, Mousset S, Veuille M. Power of neutrality tests to detect bottlenecks and hitchhiking. J Mol Evol. 2003;57:S190–200.
- Ramírez-Soriano A, Ramos-Onsins SE, Rozas J, Calafell F, Navarro A. Statistical power analysis of neutrality tests under demographic expansions, contractions and bottlenecks with recombination. Genetics. 2008;179:555–67.
- Burbaitė L, Csányi S. Roe deer population and harvest changes in Europe. Est J Ecol. 2009;58:169–80.
- Krebs CJ. Population fluctuations in rodents. Chicago: University of Chicago Press; 2013.
- Blaňarová L, Stanko M, Carpi G, Miklisová D, Víchová B, Mošanský L, et al. Distinct Anaplasma phagocytophilum genotypes associated with Ixodes trianguliceps ticks and rodents in Central Europe. Tick Tick-Borne Dis. 2014;5:928–38.
- Bown KJ, Begon M, Bennett M, Woldehiwet Z, Ogden NH. Seasonal dynamics of Anaplasma phagocytophilum in a rodent-tick (Ixodes trianguliceps) system, United Kingdom. Emerg Infect Dis. 2003;9:63–70.
- Akashi H. Within- and between-species DNA sequence variation and the ‘footprint’ of natural selection. Gene. 1999;238:39–51.
- Andolfatto P. Adaptive evolution of non-coding DNA in Drosophila. Nature. 2005;437:1149–52.
- Haddrill PR, Bachtrog D, Andolfatto P. Positive and negative selection on noncoding DNA in Drosophila simulans. Mol Biol Evol. 2008;25:1825–34.
- Dingle KE, Colles FM, Wareing DRA, Ure R, Fox AJ, Bolton FE, et al. Multilocus sequence typing system for Campylobacter jejuni. J Clin Microbiol. 2001;39:14–23.
- Feil EJ, Cooper JE, Grundmann H, Robinson DA, Enright MC, Berendt T, et al. How clonal is Staphylococcus aureus? J Bacteriol. 2003;185:3307–16.
- Slatkin M, Hudson RR. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics. 1991;129:555–62.
- Braverman JM, Hudson RR, Kaplan NL, Langley CH, Stephan W. The hitchhiking effect on the site frequency spectrum of DNA polymorphism. Genetics. 1995;140:783–96.
- Fu Y-X. Statistical tests of neutrality against population growth, hitchhiking and background selection. Genetics. 1997;147:915–25.
- Przeworski M. The signature of positive selection at randomly chosen loci. Genetics. 2002;160:1179–89.
- Carson HL. Increased genetic variance after a population bottleneck. Trends Ecol Evol. 1990;5:228–30.
- Batut B, Knibbe C, Marais G, Daubin V. Reductive genome evolution at both ends of the bacterial population size spectrum. Nat Rev Microbiol. 2014;12:841–50.
- Ohta T. The nearly neutral theory of molecular evolution. Annu Rev Ecol Syst. 1992;23:263–86.
- Rejmanek D, Foley P, Barbet A, Foley J. Evolution of antigen variation in the tick-borne pathogen Anaplasma phagocytophilum. Mol Biol Evol. 2012;29:391–400.
- Feil EJ. Linkage, selection and the clonal complex. In: Robinson DA, Falush D, Feil EJ, editors. Bacterial population genetics in infectious disease. Hoboken: John Wiley & Sons Inc; 2010. p. 19–35.
- Schmid-Hempel P. Evolutionary Parasitology. Oxford: Oxford Univ Press; 2011.
- Taylor LH, Latham SM, Woolhouse MEJ. Risk factors for human disease emergence. Phil Trans R Soc B. 2001;356:983–9.
- Wright S. The genetical structure of populations. Ann Eugenics. 1951;15:323–54.
- Lynch M, Crease TJ. The analysis of population survey data on DNA sequence variation. Mol Biol Evol. 1990;7:377–94.
- Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.
- R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014. http://www.r-project.org.
- Fisher RA. On the interpretation of χ2 from contingency tables, and the calculation of P. J R Stat Soc. 1922;85:87–94.
- Hudson RR. Generating samples under a Wright-Fisher neutral model. Bioinformatics. 2002;18:337–8.
- Foley JE, Nieto NC, Barbet A, Foley P. Antigen diversity in the parasitic bacterium Anaplasma phagocytophilum arises from selectively-represented, spatially clustered functional pseudogenes. PLoS One. 2009;4:e8265.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.