Reconstructing the evolutionary history of the radiation of the land snail genus Xerocrassa on Crete based on mitochondrial sequences and AFLP markers
© Sauer and Hausdorf; licensee BioMed Central Ltd. 2010
Received: 5 May 2010
Accepted: 4 October 2010
Published: 4 October 2010
A non-adaptive radiation triggered by sexual selection resulted in ten endemic land snail species of the genus Xerocrassa on Crete. Only five of these species and a more widespread species are monophyletic in a mitochondrial gene tree. The reconstruction of the evolutionary history of such closely related species can be complicated by incomplete lineage sorting, introgression or inadequate taxonomy. To distinguish between the reasons for the nonmonophyly of several species in the mitochondrial gene tree we analysed nuclear AFLP markers.
Whereas six of the eleven morphologically delimited Xerocrassa species from Crete are monophyletic in the mitochondrial gene tree, nine of these species are monophyletic in the tree based on AFLP markers. Only two morphologically delimited species could not be distinguished with the multilocus data and might have diverged very recently or might represent extreme forms of a single species. The nonmonophyly of X. rhithymna with respect to X. kydonia is probably the result of incomplete lineage sorting, because there is no evidence for admixture in the AFLP data and the mitochondrial haplotype groups of these species coalesce deeply. The same is true for the main haplotype groups of X. mesostena. The nonmonophyly of X. franciscoi might be the result of mitochondrial introgression, because the coalescences of the haplotypes of this species with some X. mesostena haplotypes are shallow and there is admixture with neighbouring X. mesostena.
The most likely causes for the nonmonophyly of species in the mitochondrial gene tree of the Xerocrassa radiation on Crete could be inferred using AFLP data by a combination of several criteria, namely the depth of the coalescences in the gene tree, the geographical distribution of shared genetic markers, and concordance with results of admixture analyses of nuclear multilocus markers. The strongly subdivided population structure increases the effective population size of land snail species and, thus, the likelihood of a long persistence of ancestral polymorphisms. Our study suggests that ancestral polymorphisms are a frequent cause for nonmonophyly of species with a strongly subdivided population structure in gene trees.
The delimitation of closely related species and the reconstruction of their evolutionary history can be complicated by shared ancestral polymorphisms, introgression or inadequate taxonomy . These problems may result in discrepancies between gene trees and a species classification based on other data, which may became apparent if the markers are sequenced from several individuals of each species. The probability that ancestral polymorphisms are shared between species increases with decreased time between speciation events [2–6]. The likelihood that isolation mechanisms are incomplete and introgression happens is also higher if species originated in a short time span and, hence, were at least initially genetically similar [1, 7, 8]. Inadequate taxonomy, the third cause for discrepancies between gene trees and a species classification, is more likely, if several species originated in a short period of time and are morphologically similar. Thus, the delimitation of species and the reconstruction of their relationships are especially challenging in species radiations . Species that are nonmonophyletic in gene trees have been found in diverse groups, e.g., in radiations of East African cichlids [10–12], Darwin's finches , Moorean tree snails , Hawaiian swordtail crickets , North American lycaenid butterflies , ambystomatid salamanders , Argentinean liolaemid lizards , North American crotaphytid lizards , and barley .
Mitochondrial gene tree
According to chi-square tests the base composition at the first and second codon positions of the used COI sequences are not heterogeneous (p = 1.000), but there is significant heterogeneity at the third codon positions (p = 0.016). The results of the matched-pairs tests of symmetry are compatible with these results. According to the matched-pairs tests of symmetry 37.5% of the pairwise comparisons of the nucleotides at the third codon positions indicate significant (p < 0.050) heterogeneity, whereas only 4.0% of the pairwise comparisons of the nucleotides at the first codon positions and 0.0% of the pairwise comparisons of the nucleotides at the second codon positions indicate significant heterogeneity.
To reduce the compositional heterogeneity at the third codon positions we recoded the nucleotides at the third codon positions as purines and pyrimidines. This RY-recoding resulted in a loss of information so that the resulting tree was poorly supported. The haplotype group of X. mesostena that is sister to all other haplotypes of endemic species except X. subvariegata and X. grabusana (Figure 2) became nested in X. siderensis haplotypes from eastern Crete. This haplotype group of X. mesostena is distributed in a region southwest of the Psiloritis Mountains and will be called the 'Psiloritis haplotype group' in the following.
The analysis with the nonstationary model implemented in nhPhyML-Discrete requires a starting tree. We used the maximum likelihood tree obtained with the unmodified dataset as well as the maximum likelihood tree obtained with the RY-recoding of the third codon positions as starting trees. The Psiloritis haplotype group of X. mesostena has the same basal position as in the maximum likelihood tree obtained with the unmodified dataset and the stationary model (Figure 2A) in both resulting trees. According to the approximately unbiased test, the tree obtained using the maximum likelihood tree calculated based on the unmodified dataset as starting tree (Figure 2B) was significantly (p < 0.001) better than the tree obtained using the other starting tree.
Both reconstructions of the mitochondrial gene tree (Figure 2) show that the haplotypes of the morphologically delimited species X. cretica, X. subvariegata, X. grabusana, X. lasithiensis, X. heraklea and X. kydonia form monophyletic groups, but the haplotypes of the other five morphologically delimited species do not. The haplotypes of the widespread X. mesostena are paraphyletic with respect to all other endemic Xerocrassa species with the exception of X. subvariegata and X. grabusana and form two deeply separated main groups. Whereas the majority of the X. mesostena individuals have haplotypes that form a terminal bush-like group, most individuals living in a region southwest of the Psiloritis Mountains have mitochondrial haplotypes that form a strongly supported early branch in the mitochondrial gene tree, the 'Psiloritis haplotype group'. The haplotypes of X. rhithymna are paraphyletic with respect to X. kydonia. The haplotypes of X. franciscoi are nested in the major X. mesostena group. In the tree calculated with the stationary model (Figure 2A) most haplotypes of X. amphiconus form a clade that is nested in some haplotypes of X. siderensis, which together form the sister group of the X. heraklea/rhithymna/kydonia clade. The second group of X. siderensis haplotypes forms the sister clade of the major X. mesostena group. In contrast, the two haplotype clades of X. amphiconus and X. siderensis form a single clade, which is the sister group of the X. heraklea/rhithymna/kydonia clade, in the tree calculated with the nonstationary model (Figure 2B). In both reconstructions, one haplotype of X. amphiconus is nested in the major X. mesostena group.
Tree and network based on AFLP data
X. mesostena as defined by morphological characters forms a separate group in the tree (Figure 3) and the neighbor-net (Figure 4) based on AFLP markers. However, this group is divided into distinct subgroups by deep splits. These subgroups are geographically more or less separated and might be considered cryptic species. However, they are neither congruent with mitochondrial haplotype groups nor are they correlated with morphological differences. For example, one of the AFLP based subgroups is restricted to the region southwest of the Psiloritis Mountains where also the Psiloritis haplotype group occurs. Most of the X. mesostena individuals that are characterized by Psiloritis haplotypes are concentrated in the geographically corresponding AFLP cluster (Figs. 3 and 4). However, there are also individuals with other haplotypes in this AFLP cluster and some individuals with Psiloritis haplotypes belong to other AFLP clusters. This demonstrates that there is gene flow between the metapopulations corresponding to the AFLP clusters.
Species delimitation based on Gaussian clustering of AFLP data
Species delimitation based on STRUCTURE analyses
The mean estimates of the posterior probabilities of the data had a maximum at K = 9, if the model with admixture (Figure 6C) was used. However, the highest likelihood has been obtained in a run with K = 13. ΔK showed again a maximum at K = 2 (Figure 6D). In the run with K = 9 that had the highest likelihood the clusters correspond to X. cretica, X. subvariegata + X. grabusana, X. kydonia + X. rhithymna + X. heraklea + X. lasithiensis, X. franciscoi, X. amphiconus + X. siderensis and four clusters including parts of the X. mesostena complex.
Species delimitation based on a STRUCTURAMA analysis
STRUCTURAMA calculated the posterior probability that the dataset includes 3 clusters as 1.0. In the mean partition these clusters correspond to X. cretica, X. subvariegata + X. grabusana, and a complex including the other eight endemic Cretan Xerocrassa species.
Admixture analyses for inferring introgression
An admixture analysis with the AFLP data of X. rhithymna and X. kydonia with K = 2 revealed that all X. rhithymna individuals including the specimen with the haplotype that is sister to the X. kydonia haplotypes had an inferred ancestry of 99.5-99.8% in their own cluster. That means that there is no evidence for introgression of X. kydonia alleles into X. rhithymna individuals. The inferred ancestry of five X. kydonia individuals was also between 99.3-99.7% in their own cluster, but one individual had an inferred ancestry of only 86.5% in its own cluster and 13.5% in the X. rhithymna cluster.
An admixture analyses with the AFLP data of X. franciscoi individuals and the geographically neighbouring ten individuals of X. mesostena with K = 2 showed that the X. franciscoi individuals had an inferred ancestry of 93.6-99.9% in their own cluster. Nine of the ten X. mesostena individuals had an inferred ancestry of 99.6-99.9% in the X. mesostena cluster, but the one from Ano Kapetaniana 3.5 km towards Agios Ioannis, only a few hundred meters from the boundary of the distribution area of X. franciscoi, had an inferred ancestry of only 86.1% in the X. mesostena cluster and 13.9% in the X. franciscoi cluster. This indicates that there might be some introgression between X. mesostena and X. franciscoi.
Species delimitation in a radiation
There are strong discrepancies between the mitochondrial gene tree of the radiation of the land snail genus Xerocrassa on Crete based on COI sequences (Figure 2) and the species classification based on morphological characters of the shell and the genitalia . In the mitochondrial gene tree only six of the eleven morphologically defined Xerocrassa species living on Crete are monophyletic. A topology test showed that the lack of monophyly of the morphologically delimited species is not the result of a poor resolution of the mitochondrial gene tree . We could also exclude the possibility that the lack of monophyly of the morphologically delimited species in the mitochondrial gene tree is an artefact resulting from systematic errors in tree reconstruction due to compositional bias by using a nonstationary model for the maximum likelihood analyses (Figure 2B).
We generated a multilocus dataset using AFLP markers to investigate whether the discrepancies between the morphologically based species classification and the mitochondrial gene tree are artefacts resulting from inadequate taxonomy or whether they can be explained by evolutionary processes. Nine of the eleven morphologically delimited Cretan Xerocrassa species form separate groups in the tree (Figure 3) and the neighbor-net (Figure 4) based on 1476 AFLP markers. Thus, the AFLP data corroborate the morphological species classification with the exception of the separation of X. amphiconus and X. siderensis, which together form a separate clade, but are intermingled within this clade. This species pair is also exceptional with regard to morphology and distribution. Whereas most other Xerocrassa species can be distinguished by characters of the genitalia, the genitalia of X. amphiconus and X. siderensis show no differences. These two species differ only in shell characters. Whereas the other endemic species have largely allopatric distribution areas, the ranges of X. amphiconus and X. siderensis broadly overlap. Nevertheless, they usually do not occur together. They tend to prefer different altitudinal zones , though there are several populations of each species occurring in the zone preferred by the other species. Only few individuals show intermediate shell characters indicating possible hybridization. The lack of clear genetic differentiation of the two species might indicate that the species diverged only very recently. At present, we cannot rule out the possibility that the two forms actually represent only extreme morphs of a single species.
We also applied three methods for delimiting provisional species based on dominant multilocus markers without any a priori knowledge. We compared the performance of Gaussian clustering [25, 26] with two Bayesian approaches that are based on the assumption of approximate Hardy-Weinberg equilibrium within clusters, namely STRUCTURE [27, 28] and STRUCTURAMA , in delimiting species of the Cretan Xerocrassa radiation.
The classification of the Cretan Xerocrassa specimens produced using Gaussian clustering based on the AFLP data is most similar to the morphological classification, but differs from it in combining three morphologically differentiated species pairs and in splitting the two most widespread species into two, respectively three groups. The major disadvantage of STRUCTURE was its inability to determine the number of species that can be distinguished. Compared to STRUCTURE, the advantage of STRUCTURAMA is that it directly estimates the number of clusters into which a sample can be divided. However, compared with Gaussian clustering the classification success of STRUCTURAMA was much lower. Only three provisional species were delimited. Thus, this approach failed to distinguish eight taxa that can be distinguished morphologically and form separate groups in the tree (Figure 3) and the neighbor-net (Figure 4) based on the AFLP data.
The problems in determining the appropriate number of clusters with STRUCTURE and the low resolution obtained with STRUCTURAMA confirmed former studies suggesting that with many dominant markers strategies coupling ordination and cluster analyses like Gaussian clustering become more efficient in species delimitation than Bayesian approaches that are based on the assumption of Hardy-Weinberg equilibrium within clusters [26, 30]. However, none of the methods used recognized all species that can be distinguished by morphological characters and that form separate groups in the tree (Figure 3) and the neighbor-net (Figure 4) based on the AFLP data. This failure is probably the result of the low number of specimens sampled of the regionally more restricted species. With few sampled individuals separate clusters cannot be recognized and specimens of the underrepresented species are included in clusters of the most similar species, if no noise component is used in the analysis. The introduction of a noise component to include outliers in Gaussian clustering is meaningful, because it indicates problems in the data, in this case insufficient sampling of the regionally restricted species, that might remain unrecognized otherwise.
The problem in recognizing underrepresented species has important consequences for DNA-based biodiversity surveys. Because there are usually many rare species and just a few common species , many rare species will remain undiscovered, if all species are sampled randomly. Therefore, investing in a morphological prescreening to raise the representation of rare species in DNA-based surveys might increase the effectiveness of such surveys considerably.
The genotypic clusters determined with Gaussian clustering and the two Bayesian approaches correspond to species in some cases (e.g., X. heraklea and X. lasithiensis in the solution found with Gaussian clustering or X. cretica in the solution found with STRUCTURAMA), but they may also correspond to other groups like geographically isolated meta-populations (e.g., geographical subgroups of X. mesostena in the solution found with Gaussian clustering) or groups of species (e.g., X. kydonia + X. rhithymna in the solution found with Gaussian clustering or X. subvariegata + X. grabusana in the solution found with STRUCTURAMA). Thus, the determination of the status of such groups should be corroborated by a comparison with a classification based on other evidence. We compare the partitions obtained with the multilocus markers with the morphological classification that is mainly based on differences in the genitalia that may directly be involved in reproductive isolation.
Causes of the nonmonophyly of species in the mitochondrial gene tree
The nonmonophyly of X. mesostena, X. franciscoi and X. rhithymna in the mitochondrial gene tree (Figure 2) are not likely explained by inadequate taxonomy, because they form separate groups in the tree (Figure 3) and the neighbor-net (Figure 4) based on AFLP markers. We analyzed the potential mechanisms resulting in the nonmonophyly of these species in the mitochondrial gene tree by applying several criteria that have been proposed to discriminate between incomplete lineage sorting and introgression. These criteria are the depth of the coalescences in the mitochondrial tree [1, 18–20, 32, 33], the geographical occurrence of the individuals of species that are nonmonophyletic in the gene tree [1, 16, 18–20, 34], and concordance with results of admixture analyses of nuclear multilocus markers [16, 35].
The most remarkable pattern is seen in X. mesostena, the most widespread of the endemic species. Whereas individuals living in a region southwest of the Psiloritis Mountains have mitochondrial haplotypes that form a strongly supported early branch in the COI gene tree (Figure 2), the individuals from other regions of the island have haplotypes that form a terminal bush-like group. In the tree (Figure 3) and the neighbor-net (Figure 4) based on AFLP markers individuals with Psiloritis haplotypes are intermingled with individuals with other haplotypes indicating that there is gene flow between populations with different mitochondrial haplotype groups. There is no indication that one of the two haplotype groups found in X. mesostena was introduced into that species by introgression, because these haplotype groups were not found in other species, with the exception of X. franciscoi and one X. amphiconus specimen that have haplotypes that are nested in the major X. mesostena haplotype group. Thus, the Psiloritis haplotype group represents probably an ancestral polymorphism that was conserved in X. mesostena. It can be considered a paradigm for one of the causes of the high mitochondrial sequence diversity within land snail species discussed by Thomaz et al. , namely long-term persistence of ancient polymorphisms resulting from the strongly subdivided population structure of land snails. The population structure of land snail species often consisting of many more or less isolated populations [37–40] that sometimes can reach high densities in favourable patches of habitat in conducive to the persistence of ancestral polymorphisms, because such a population structure increases the effective population size . A more detailed analysis of the phylogeographic structure within X. mesostena is in preparation.
The second case of nonmonophyly in the mitochondrial gene tree (Figure 2) concerns X. rhithymna. In the COI gene tree X. rhithymna is paraphyletic with respect to X. kydonia. The two species are sister species according to the AFLP tree. The ranges of the two species are separated by more than 40 km . Thus, dispersal of X. rhithymna individuals to the range of X. kydonia and hybridization between the two species are rare events at most. This is also confirmed by an admixture analysis of the AFLP data that did not provide evidence for admixture, neither for the X. rhithymna individual with the haplotype that is sister to the X. kydonia haplotypes nor for any other X. rhithymna individual. A small amount of admixture in one X. kydonia individual is not necessarily the result of introgression, but might be due to shared ancestral polymorphisms. Thus, it is more likely that the nonmonophyly of X. rhithymna in the mitochondrial gene tree is the result of incomplete lineage sorting and not caused by an introgression of a X. kydonia haplotype into X. rhithymna. This is further supported by the deep coalescence of the X. rhithymna haplotype which is more closely related to the X. kydonia haplotypes than to the other X. rhithymna haplotypes.
The last species that is monophyletic in the AFLP tree (Figure 3), but not in the mitochondrial gene tree (Figure 2), is X. franciscoi. The small range of this endemic species adjoins directly on the range of X. mesostena. Its COI haplotypes are nested within the main COI haplotype group of X. mesostena and are separated from X. mesostena haplotypes only by shallow distances. This might indicate introgression. Actually, the STRUCTURE analysis of the AFLP data shows admixture between X. franciscoi and a representative of the neighbouring X. mesostena population in agreement with the observation of a narrow hybrid zone between the two species . However, the nonmonophyly of the mitochondrial haplotypes of X. franciscoi, the shallow distances between them and X. mesostena haplotypes and the admixture of X. franciscoi and neighbouring X. mesostena can also be explained by a recent origin of X. franciscoi from X. mesostena by peripatric speciation. It is difficult to determine the relative roles of incomplete lineage sorting and hybridization in generating nonmonophyly of recently separated species in gene trees.
The mitochondrial gene tree hints to the phylogenetic origin of X. franciscoi from neighbouring populations of the X. mesostena complex, whereas the AFLP tree reflects the genetic cohesion of the individuals of each of the species caused by intraspecific gene flow and recombination that, on the other hand, obscured the details of the relations between the species. Thus, both marker types supply complementary information with regard to the phylogenetic history of the Xerocrassa radiation on Crete, similar to the situation concerning the radiation of the cichlid genus Tropheus in Lake Tanganyika .
The discrimination of incomplete lineage sorting of ancestral polymorphisms and introgression as causes of nonmonophyly of species in gene trees is difficult [1, 16, 18–20, 32–35]. Our study showed that the most likely cause of nonmonophyly can be inferred at least in some cases by a combination of several criteria, namely the depth of the coalescences in the gene tree, the geographical distribution of shared genetic markers, and concordance with results of admixture analyses of nuclear multilocus markers. However, all these criteria have limitations. Randomly distributed genetic markers shared with allopatric species with limited dispersal abilities might indicate incomplete lineage sorting. However, the expectation that genetic markers that are concentrated geographically near species boundaries indicate introgression (e.g., [1, 19]) is not necessarily true, because such a pattern might also be derived from a pre-existing cline in the stem species. Likewise, introgression of mitochondrial DNA cannot be completely excluded, even if there is no evidence for admixture of multilocus markers, because maternally inherited DNA like mitochondrial DNA may introgress much more rapidly through prezygotic barriers than biparentally inherited DNA [7, 8]. Such shortcomings of individual criteria are ameliorated by using several criteria in an integrative approach. Gene trees of several additional genes would provide more definitive evidence for discriminating between incomplete lineage sorting and hybridization.
Most cases of nonmonophyly of species in the mitochondrial gene tree of the Xerocrassa radiation on Crete are not caused by inappropriate morphological taxonomy, but are the result of evolutionary processes. By using nuclear multilocus data and a combination of several criteria, namely the depth of the coalescences in the gene tree, the geographical distribution of shared genetic markers, and concordance with results of admixture analyses of nuclear multilocus markers we could infer that some of these cases are probably the result of incomplete lineage sorting, whereas introgression might have been involved in other cases. Although most species are monophyletic in a tree based on the AFLP data, methods for delimiting genotypic clusters based on multilocus data alone did not recognize all these species. This is at least partly the result of an unequal representation of the species in the dataset and highlights the importance of a morphological prescreening to raise the representation of rare species in DNA-based biodiversity surveys.
Snails were sampled on Crete in July/August and September/October 2004 and September/October 2005. AFLP data were determined from 150 Xerocrassa specimens from 124 localities on Crete covering all morphotypes and all regions of Crete (Figure 1) and one Xerocrassa cretica from Samos. The classification, AFLP data, locality and voucher data for the Xerocrassa specimens used in this study can be found in Additional File 1.
Total genomic DNA was extracted from tissue samples of the foot preserved in 100% isopropanol following the protocol proposed by Sokolov  with slight modifications as detailed in Sauer & Hausdorf .
Fragments of the cytochrome c oxidase subunit 1 (COI) gene have been previously sequenced (Sauer & Hausdorf 2009). The sequences analyzed in this paper have been deposited in GenBank under the accession numbers FJ627054-FJ627177. The used alignment is available at TreeBASE http://www.treebase.org, accession number S2413.
Approximately 100 ng genomic DNA were digested with 5 units EcoRI (Fermentas) at 37°C for 1 h followed by a digestion with 5 units of MseI (Fermentas) at 65°C for 1 h. 12.5 pmol of the EcoRI-adapter, 125 pmol of the MseI-adapter and 10 units of T4 DNA ligase and its buffer (GeneCraft) were added to the digestion product and incubated at 16°C for 8 h. The ligation products were diluted 1:10 with sterile ddH2O, and stored at -20°C.
Primers and fluorescent dye labels used for AFLP.
5'- GAC TGC GTA CCA ATT CA -3'
5'- GAT GAG TCC TGA GTA AC -3'
5'- GAC TGC GTA CCA ATT CA CA -3'
5'- GAC TGC GTA CCA ATT CA CC -3'
5'- GAC TGC GTA CCA ATT CA GG -3'
5'- GAT GAG TCC TGA GTA AC AG -3'
5'- GAT GAG TCC TGA GTA AC TG -3'
Five primers with two additional bases at the 3' end (Table 1) were used for selective amplifications. Six primer combinations (SMseI/SEcoRIDYE) were run: AG/CAFAM, AG/CCNED, AG/GGHEX, TG/CAFAM, TG/CCNED and TG/GGHEX. 5 μl of the diluted preselective PCR product were added to 20 μl of the selective PCR master mix consisting of 15.05 μl ddH2O, 2.5 μl 10× PCR-buffer, 1.75 μl MgCl2 (50 mM), 0.25 μl dNTP (2 mM), 0.2 μl dye primer mix (0.06 μM labelled selective EcoRI-Primer and 0.6 μM non-labelled selective MseI-Primer), and 0.25 μl Taq DNA polymerase (5U/μl). For the selective amplification a touch down PCR with a temperature decrease of 0.6°C of the annealing temperature each cycle was applied. The program starts with 94°C for 60 s, 65°C for 30 s and 72°C for 60 s followed by 13 cycles of 0.6°C decrease of annealing temperature and 1°C decrease of elongation temperature per cycle and 23 cycles with 94°C for 60 s, 56°C for 30 s and 72°C for 60 s.
1.2 μl of each of the three different primer labelled samples were mixed with 6.2 μl Hi Dye Formamid (Applied Biosystems) and 0.2 μl GS-500 ROX size standard (Applied Biosystems). The samples were denatured at 94°C for 2 min and then cooled down on ice for 4 min. The selective PCR products were electrophoretically separated using pop4-polymer (Applied Biosystems) on an ABI PRISM 3100 (Applied Biosystems) capillary sequencer.
Signal detection was performed with GeneScan version 3.1 (Applied Biosystems). Fluorescent threshold was set to 50 relative fluorescence units. The signal intensity was normalized with Genotyper version 2.5 (Applied Biosystems). Fixed fragment categories were created. A presence/absence scoring was conducted between 70 and 322 bases with a threshold set to 50 normalized units. Category spacing was set to 1 base and the category tolerance was adjusted to +/- 0.5 bases.
We exemplarily tested the reproducibility of the AFLP by repeating the steps from DNA extraction to selective PCR with one primer combination for three samples. The banding patterns were very similar in all cases. However, repeatability of fragments dropped distinctly above a fragment length larger than 322 bp. Thus, we did not score fragments above this size.
Models of sequence evolution for the maximum likelihood analyses were chosen using ModelTest version 3.7  based on the Akaike Information Criterion. A partitioning of the dataset with separate models for the three codon positions was evaluated in comparison with a uniform model for the complete dataset. Maximum likelihood analyses were conducted with Treefinder [44, 45]. Confidence values were computed by bootstrapping (100 replications; ).
We checked the homogeneity of base frequencies across taxa using the chi-square test implemented in PAUP* 4.0 beta 10 . However, this test ignores correlation due to phylogenetic structure. Therefore, we also measured the probability that the base composition of two sequences is homogeneous for each pair of sequences using the matched-pairs test of symmetry as implemented in SeqVis version 1.4 .
To reduce compositional heterogeneity at the third codon positions we recoded these positions into 2-state categories by pooling purines (adenine and guanine: R) and pyrimidines (cytosine and thymine: Y) (RY-coding, see ) using the GTR2 model in Treefinder. In addition we analyzed the data set using the nonstationary model of evolution of Galtier & Gouy  as implemented in nhPhyML-Discrete , limited to 5 base content frequency categories and with 6 categories for a discrete gamma model of among-site rate variation. Trees obtained by nhPhyML-Discrete were then compared using the approximately unbiased test  implemented in CONSEL .
Jaccard distances were calculated from the AFLP data using PhylTools version 1.32 . These distances were used to reconstruct neighbor-joining trees with PHYLIP version 3.66  and phylogenetic networks with the neighbor-net algorithm  implemented in SplitsTree4 version 4.6 . Confidence values for the edges of the neighbor-joining tree were computed by bootstrapping (1000 replications).
To show that the discordances between the mitochondrial gene tree and the tree based on the AFLP data are not the result of a poor resolution of the mitochondrial gene tree, we calculated the maximum-likelihood tree based on the mitochondrial sequences under the constraint that the sequences of each species form a clade (with the exception of the X. amphiconus-siderensis complex, which was constraint to be a single clade) as in the tree based on the AFLP data and that the relationships between the species correspond with the relationships in that tree, using the 'resolve multifurcations' option of Treefinder. Then we investigated whether this constrained tree can be rejected in comparison with the unconstrained maximum-likelihood tree by applying the approximately unbiased test  as implemented in Treefinder.
The delimitation of provisional species using genetic data or (presumably) genetically inherited morphological characters is based on the criterion that species are groups of organisms with similar genotypes as suggested in the genotypic cluster definition of species given by Mallet . This criterion will usually be compatible with other criteria such as intrinsic reproductive isolation , because the lack of gene flow between reproductively isolated groups of organisms will result in an accumulation of differences due to differential adaptation and genetic drift that make these groups recognizable as different genotypic clusters. However, species in the earliest stages of speciation might differ only in a few genes that are responsible for differential adaptation or reproductive isolation. Such emerging species will hardly be recognizable using the genotypic cluster criterion.
We applied two strategies for delimiting provisional species. The first is based on a comparison of three independent datasets, namely (1) morphological data of the genitalia and the shell as summarized in the current species classification , (2) the structure of the mitochondrial gene tree and (3) the structure of the tree and the network based on AFLP data. Congruence between morphologically delimited groups, clades in the mitochondrial gene tree and/or in the tree or the network based on the AFLP data corroborates that such groups are evolutionary units that can be considered as provisional species.
The second strategy is to determine species boundaries based on the multilocus data without considering the morphologically based species classification or the mitochondrial sequence data. Hausdorf & Hennig  compared three methods for delimiting species based on dominant multilocus markers. We applied these methods to the AFLP data of the Cretan Xerocrassa.
Hausdorf & Hennig  proposed to use Gaussian clustering  for the determination of clusters of specimens as provisional species. We used the implementation of MCLUST  in the program package PRABCLUS version 2.1-1  that is an add-on package for R . We performed the mixture estimation for 0 to 100 clusters with as well as without a noise component for outliers. Gaussian clustering operates on a dataset where the cases are defined by variables of metric scale. Therefore we performed a non-metric multidimensional scaling  using four dimensions and Jaccard distances as recommended by Link et al.  for dominant genetic markers.
Shaffer & Thomson  proposed to use STRUCTURE [27, 28] to delimit the lower bound of potential species. STRUCTURE is a model-based clustering method using multilocus data to infer population structure and assign individuals to populations under the assumption of Hardy-Weinberg equilibrium within populations. We used STRUCTURE version 2.3.1 with the model without admixture as well as with the model with admixture. 10 runs with 100,000 iterations after a burn-in of 10,000 iterations were carried out in order to quantify the amount of variation of the posterior probabilities of the data for each cluster number K for K between 1 and 20. We used the mean estimates of the posterior probabilities of the data for a given cluster number L(K) and the statistic ΔK proposed by Evanno et al.  to estimate the number of clusters.
We also applied the program STRUCTURAMA  that is also a Bayesian approach designed for inferring population structure by clustering individuals such that Hardy-Weinberg equilibrium is maximized within clusters. However, in contrast to STRUCTURE, the number of clusters and assignment of individuals to clusters can both be considered random variables that follow a Dirichlet process prior. Thus, this method can directly estimate the number of clusters in which a sample can be divided. We allowed the number of populations to be a random variable with a Dirichlet process prior, run the Markov chain Monte Carlo analysis for 1,000,000 cycles, sampled every 100th cycle, and discarded the first 4,000 samples as burn-in.
Admixture analysis for inferring introgression
We used admixture analyses of the AFLP data with STRUCTURE also to investigate whether cases of nonmonophyly in the mitochondrial gene tree can be ascribed to introgression. For that purpose the species were analysed pairwise with K = 2. We carried out 5 separate runs with 100,000 iterations after a burn-in of 10,000 iterations and analysed the run with the highest likelihood.
We are grateful to Barbara Rudolph and Fabian Herder for advice concerning AFLP, and Stephanie Bartel and Ilse-Dore Schmidt for help in the lab. This study was funded by the priority program "Radiations - Origins of Biological Diversity" of the Deutsche Forschungsgemeinschaft (HA 2763/3-1,2).
- Funk DJ, Omland KE: Species-Level paraphyly and polyphyly: frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annu Rev Ecol Evol System. 2003, 34: 397-423. 10.1146/annurev.ecolsys.34.011802.132421.View ArticleGoogle Scholar
- Neigel JE, Avise JC: Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation. Evolutionary Processes and Theory. Edited by: Karlin S, Neto E. 1986, New York: Academic Press, 515-534.Google Scholar
- Avise JC, Wollenberg K: Phylogenetics and the origin of species. Proc Natl Acad Sci USA. 1997, 94: 7748-7755. 10.1073/pnas.94.15.7748.PubMed CentralView ArticlePubMedGoogle Scholar
- Maddison WP: Gene Trees in Species Trees. Syst Biol. 1997, 46: 523-536.View ArticleGoogle Scholar
- Hudson RR, Coyne JA: Mathematical consequences of the genealogical species concept. Evolution. 2002, 56: 1557-1565.View ArticlePubMedGoogle Scholar
- Rosenberg NA: The shapes of neutral gene genealogies in two species: probabilities of monophyly, paraphyly, and polyphyly in a coalescent model. Evolution. 2003, 57: 1465-1477.View ArticlePubMedGoogle Scholar
- Takahata N, Slatkin M: Mitochondrial gene flow. Proc Natl Acad Sci USA. 1984, 81: 1764-1767. 10.1073/pnas.81.6.1764.PubMed CentralView ArticlePubMedGoogle Scholar
- Chan KMA, Levin SA: Leaky prezygotic isolation and porous genomes: rapid introgression of maternally Inherited DNA. Evolution. 2005, 59: 720-729.View ArticlePubMedGoogle Scholar
- Shaffer HB, Thomson RC: Delimiting species in recent radiations. Syst Biol. 2007, 56: 896-906. 10.1080/10635150701772563.View ArticlePubMedGoogle Scholar
- Nagl S, Tichy H, Mayer WE, Takahata N, Klein J: Persistence of neutral polymorphisms in Lake Victoria cichlid fish. Proc Natl Acad Sci USA. 1998, 95: 14238-14243. 10.1073/pnas.95.24.14238.PubMed CentralView ArticlePubMedGoogle Scholar
- Albertson RC, Markert JA, Danley PD, Kocher TD: Phylogeny of a rapidly evolving clade: the cichlid fishes of Lake Malawi, East Africa. Proc Natl Acad Sci USA. 1999, 96: 5107-5110. 10.1073/pnas.96.9.5107.PubMed CentralView ArticlePubMedGoogle Scholar
- Egger B, Koblmuller S, Sturmbauer C, Sefc K: Nuclear and mitochondrial data reveal different evolutionary processes in the Lake Tanganyika cichlid genus Tropheus. BMC Evol Biol. 2007, 7: 137-10.1186/1471-2148-7-137.PubMed CentralView ArticlePubMedGoogle Scholar
- Petren K, Grant PR, Grant BR, Keller LF: Comparative landscape genetics and the adaptive radiation of Darwin's finches: the role of peripheral isolation. Mol Ecol. 2005, 14: 2943-2957. 10.1111/j.1365-294X.2005.02632.x.View ArticlePubMedGoogle Scholar
- Lee T, Burch J, Coote T, Pearce-Kelly P, Hickman C, Meyer J-Y, O Foighil D: Moorean tree snail survival revisited: a multi-island genealogical perspective. BMC Evol Biol. 2009, 9: 204-10.1186/1471-2148-9-204.PubMed CentralView ArticlePubMedGoogle Scholar
- Shaw KL: Conflict between nuclear and mitochondrial DNA phylogenies of a recent species radiation: What mtDNA reveals and conceals about modes of speciation in Hawaiian crickets. Proc Natl Acad Sci USA. 2002, 99: 16122-16127. 10.1073/pnas.242585899.PubMed CentralView ArticlePubMedGoogle Scholar
- Gompert Z, Forister ML, Fordyce JA, Nice CC: Widespread mito-nuclear discordance with evidence for introgressive hybridization and selective sweeps in Lycaeides. Mol Ecol. 2008, 17: 5231-5244. 10.1111/j.1365-294X.2008.03988.x.View ArticlePubMedGoogle Scholar
- Weisrock DW, Shaffer HB, Storz BL, Storz SR, Voss SR: Multiple nuclear gene sequences identify phylogenetic species boundaries in the rapidly radiating clade of Mexican ambystomatid salamanders. Mol Ecol. 2006, 15: 2489-2503. 10.1111/j.1365-294X.2006.02961.x.View ArticlePubMedGoogle Scholar
- Morando M, Avila LJ, Baker J, Sites JWJ: Phylogeny and phylogeography of the Liolaemus darwinii complex (Squamata: Liolaemidae): evidence for introgression and incomplete lineage sorting. Evolution. 2004, 58: 842-861.View ArticlePubMedGoogle Scholar
- McGuire JA, Linkem CW, Koo MS, Hutchison DW, Lappin AK, Orange DI, Lemos-Espinal J, Riddle BR, Jaeger JR: Mitochondrial introgression and incomplete lineage sorting through space and time: phylogenetics of crotaphytid lizards. Evolution. 2007, 61: 2879-2897. 10.1111/j.1558-5646.2007.00239.x.View ArticlePubMedGoogle Scholar
- Jakob SS, Blattner FR: A chloroplast genealogy of Hordeum (Poaceae): long-term persisting haplotypes, incomplete lineage sorting, regional extinction, and the consequences for phylogenetic inference. Mol Biol Evol. 2006, 23: 1602-1612. 10.1093/molbev/msl018.View ArticlePubMedGoogle Scholar
- Hausdorf B, Sauer J: Revision of the Helicellinae of Crete (Gastropoda: Hygromiidae). Zool J Linn Soc. 2009, 157: 373-419. 10.1111/j.1096-3642.2008.00504.x.View ArticleGoogle Scholar
- Sauer J, Hausdorf B: Sexual selection is involved in speciation in a land snail radiation on Crete. Evolution. 2009, 63: 2535-2546. 10.1111/j.1558-5646.2009.00751.x.View ArticlePubMedGoogle Scholar
- Vos P, Hogers R, Bleeker M, Reijans M, Vandelee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M: AFLP - a new nechnique for DNA-fingerprinting. Nucleic Acids Res. 1995, 23: 4407-4414. 10.1093/nar/23.21.4407.PubMed CentralView ArticlePubMedGoogle Scholar
- Evanno G, Regnaut S, Goudet J: Detecting the number of clusters of individuals using the software structure: a simulation study. Mol Ecol. 2005, 14: 2611-2620. 10.1111/j.1365-294X.2005.02553.x.View ArticlePubMedGoogle Scholar
- Fraley C, Raftery AE: How many clusters? Which clustering method? Answers via model-based cluster analysis. Comput J. 1998, 41: 578-588. 10.1093/comjnl/41.8.578.View ArticleGoogle Scholar
- Hausdorf B, Hennig C: Species Delimitation Using Dominant and Co-Dominant Multi-Locus Markers. Syst Biol. 2010Google Scholar
- Pritchard JK, Stephens M, Donnelly P: Inference of population structure using multilocus genotype data. Genetics. 2000, 155: 945-959.PubMed CentralPubMedGoogle Scholar
- Falush D, Stephens M, Pritchard JK: Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes. 2007, 7: 574-578. 10.1111/j.1471-8286.2007.01758.x.PubMed CentralView ArticlePubMedGoogle Scholar
- Huelsenbeck JP, Andolfatto P: Inference of population structure under a Dirichlet process prior. Genetics. 2007, 175: 1787-1802. 10.1534/genetics.106.061317.PubMed CentralView ArticlePubMedGoogle Scholar
- Reeves PA, Richards CM: Accurate Inference of Subtle Population Structure (and Other Genetic Discontinuities) Using Principal Coordinates. PLoS ONE. 2009, 4: e4269-10.1371/journal.pone.0004269.PubMed CentralView ArticlePubMedGoogle Scholar
- McGill BJ, Etienne RS, Gray JS, Alonso D, Anderson MJ, Benecha HK, Dornelas M, Enquist BJ, Green JL, He F, et al: Species abundance distributions: moving beyond single prediction theories to integration within an ecological framework. Ecol Lett. 2007, 10: 995-1015. 10.1111/j.1461-0248.2007.01094.x.View ArticlePubMedGoogle Scholar
- Holder MT, Anderson JA, Holloway AK: Difficulties in detecting hybridization. Syst Biol. 2001, 50: 978-982. 10.1080/106351501753462911.View ArticlePubMedGoogle Scholar
- Joly S, McLenachan PA, Lockhart PJ: A statistical approach for distinguishing hybridization and incomplete lineage sorting. Am Nat. 2009, 174: E54-E70. 10.1086/600082.View ArticlePubMedGoogle Scholar
- Donnelly MJ, Pinto J, Girod R, Besansky NJ, Lehmann T: Revisiting the role of introgression vs shared ancestral polymorphisms as key processes shaping genetic diversity in the recently separated sibling species of the Anopheles gambiae complex. Heredity. 2004, 92: 61-68. 10.1038/sj.hdy.6800377.View ArticlePubMedGoogle Scholar
- Berthier P, Excoffier L, Ruedi M: Recurrent replacement of mtDNA and cryptic hybridization between two sibling bat species Myotis myotis and Myotis blythii. Proc R Soc Lond B Biol Sci. 2006, 273: 3101-3123. 10.1098/rspb.2006.3680.View ArticleGoogle Scholar
- Thomaz D, Guiller A, Clarke B: Extreme divergence of mitochondrial DNA within species of pulmonate land snails. Proc R Soc Lond B Biol Sci. 1996, 263: 363-368. 10.1098/rspb.1996.0056.View ArticleGoogle Scholar
- Boato A: Microevolution in Solatopupa landsnails (Pulmonata, Chondrinidae) - Genetic diversity and founder effects. Biol J Linn Soc Lond. 1988, 34: 327-348. 10.1111/j.1095-8312.1988.tb01967.x.View ArticleGoogle Scholar
- Schilthuizen M, Lombaerts M: Population structure and levels of gene flow in the Mediterranean land snail Albinaria corrugata (Pulmonata: Clausiliidae). Evolution. 1994, 48: 577-586. 10.2307/2410470.View ArticleGoogle Scholar
- Goodacre SL: Population structure, history and gene flow in a group of closely related land snails: genetic variation in Partula from the Society Islands of the Pacific. Mol Ecol. 2002, 11: 55-68. 10.1046/j.0962-1083.2001.01422.x.View ArticlePubMedGoogle Scholar
- Ursenbacher S, Alvarez C, Armbruster G, Baur B: High population differentiation in the rock-dwelling land snail (Trochulus caelatus) endemic to the Swiss Jura Mountains. Conserv Genet. 2010, 11: 1265-1271. 10.1007/s10592-009-9956-3.View ArticleGoogle Scholar
- Wakeley J: The effects of subdivision on the genetic divergence of populations and species. Evolution. 2000, 54: 1092-1101.View ArticlePubMedGoogle Scholar
- Sokolov EP: An improved method for DNA isolation from mucopolysaccharide-rich molluscan tissues. J Molluscan Stud. 2000, 66: 573-575. 10.1093/mollus/66.4.573.View ArticleGoogle Scholar
- Posada D, Crandall K: MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998, 14: 817-818. 10.1093/bioinformatics/14.9.817.View ArticlePubMedGoogle Scholar
- Jobb G: TREEFINDER version of November 2007. Distributed by the author. Munich. 2007, [http://www.treefinder.de]Google Scholar
- Jobb G, von Haeseler A, Strimmer K: TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol. 2004, 4: 18-10.1186/1471-2148-4-18.PubMed CentralView ArticlePubMedGoogle Scholar
- Felsenstein J: Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985, 39: 783-791. 10.2307/2408678.View ArticleGoogle Scholar
- Swofford DL: PAUP*: Phylogenetic analysis using parsimony *and other methods, version 4.0b10. Sunderland, Massachusetts. Sinauer Associates. 2002Google Scholar
- Ho JWK, Adams CE, Lew JB, Matthews TJ, Ng CC, Shahabi-Sirjani A, Tan LH, Zhao Y, Easteal S, Wilson SR, Jermiin LS: SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides. Bioinformatics. 2006, 22: 2162-2163. 10.1093/bioinformatics/btl283.View ArticlePubMedGoogle Scholar
- Phillips MJ, Penny D: The root of the mammalian tree inferred from whole mitochondrial genomes. Mol Phyl Evol. 2003, 28: 171-185. 10.1016/S1055-7903(03)00057-5.View ArticleGoogle Scholar
- Galtier N, Gouy M: Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol Biol Evol. 1998, 15: 871-879.View ArticlePubMedGoogle Scholar
- Boussau B, Gouy M: Efficient likelihood computations with nonreversible models of evolution. Syst Biol. 2006, 55: 756-768. 10.1080/10635150600975218.View ArticlePubMedGoogle Scholar
- Shimodaira H: An approximately unbiased test of phylogenetic tree selection. Syst Biol. 2002, 51: 492-508. 10.1080/10635150290069913.View ArticlePubMedGoogle Scholar
- Shimodaira H, Hasegawa M: CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics. 2001, 17: 1246-1247. 10.1093/bioinformatics/17.12.1246.View ArticlePubMedGoogle Scholar
- Buntjer J: PhylTools version 1.32. Laboratory of Plant Breeding, Wageningen University and Research Centre, Wageningen. 2001, [http://www.plantbreeding.wur.nl/UK/software_PhylTools.html]Google Scholar
- Felsenstein J: PHYLIP (Phylogeny Inference Package). Version 3.66. Distributed by the author, Department of Genetics, University of Washington, Seattle. 2005, [http://evolution.genetics.washington.edu/phylip.html]Google Scholar
- Bryant D, Moulton V: Neighbor-Net: an agglomerative method for the construction of phylogenetic networks. Mol Biol Evol. 2004, 21: 255-265. 10.1093/molbev/msh018.View ArticlePubMedGoogle Scholar
- Huson DH, Bryant D: Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006, 23: 254-267. 10.1093/molbev/msj030.View ArticlePubMedGoogle Scholar
- Mallet J: A species definition for the modern synthesis. Trends Ecol Evol. 1995, 10: 294-299. 10.1016/0169-5347(95)90031-4.View ArticlePubMedGoogle Scholar
- Mayr E: Animal species and evolution. 1963, Cambridge, MA: Belknap pressView ArticleGoogle Scholar
- Hennig C, Hausdorf B: The prabclus package version 2.1-1. Department of Statistical Science, University College London, London. 2008, [http://cran.r-project.org/]Google Scholar
- R Development Core Team: R: A language and environment for statistical computing. 2007, R Foundation for Statistical Computing, Vienna, [http://www.r-project.org]Google Scholar
- Kruskal J: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika. 1964, 29: 1-27. 10.1007/BF02289565.View ArticleGoogle Scholar
- Link W, Dixkens C, Singh M, Schwall M, Melchinger AE: Genetic diversity in European and Mediterranean faba bean germ plasm revealed by RAPD markers. Theor Appl Genet. 1995, 90: 27-32. 10.1007/BF00220992.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.