- Research article
- Open Access
Evolutionary, structural and functional relationships revealed by comparative analysis of syntenic genes in Rhizobiales
- Gabriela Guerrero†1,
- Humberto Peralta†1,
- Alejandro Aguilar1,
- Rafael Díaz1,
- Miguel Angel Villalobos1,
- Arturo Medrano-Soto2 and
- Jaime Mora1Email author
© Guerrero et al; licensee BioMed Central Ltd. 2005
Received: 18 July 2005
Accepted: 17 October 2005
Published: 17 October 2005
Comparative genomics has provided valuable insights into the nature of gene sequence variation and chromosomal organization of closely related bacterial species. However, questions about the biological significance of gene order conservation, or synteny, remain open. Moreover, few comprehensive studies have been reported for rhizobial genomes.
We analyzed the genomic sequences of four fast growing Rhizobiales (Sinorhizobium meliloti, Agrobacterium tumefaciens, Mesorhizobium loti and Brucella melitensis). We made a comprehensive gene classification to define chromosomal orthologs, genes with homologs in other replicons such as plasmids, and those which were species-specific. About two thousand genes were predicted to be orthologs in each chromosome and about 80% of these were syntenic. A striking gene colinearity was found in pairs of organisms and a large fraction of the microsyntenic regions and operons were similar. Syntenic products showed higher identity levels than non-syntenic ones, suggesting a resistance to sequence variation due to functional constraints; also, an unusually high fraction of syntenic products contained membranal segments. Syntenic genes encode a high proportion of essential cell functions, presented a high level of functional relationships and a very low horizontal gene transfer rate. The sequence variability of the proteins can be considered the species signature in response to specific niche adaptation. Comparatively, an analysis with genomes of Enterobacteriales showed a different gene organization but gave similar results in the synteny conservation, essential role of syntenic genes and higher functional linkage among the genes of the microsyntenic regions.
Syntenic bacterial genes represent a commonly evolved group. They not only reveal the core chromosomal segments present in the last common ancestor and determine the metabolic characteristics shared by these microorganisms, but also show resistance to sequence variation and rearrangement, possibly due to their essential character. In Rhizobiales and Enterobacteriales, syntenic genes encode a high proportion of essential cell functions and presented a high level of functional relationships.
A huge amount of information has been obtained from sequencing projects. More than two hundred complete microbial genomes are available to date in public databases and sequencing of a similar number is in progress . Many questions remain unsolved. For example, what is the biological meaning, if any, of gene arrangement in the bacterial chromosome?
Changes in gene sequence and chromosomal rearrangements constitute the main sources of genomic variability. Nonsynonymous substitutions in the first or second nucleotides of the codon change the encoded residue and are thus a driving force of natural selection. Genomic studies in bacteria regarding synonymous and nonsynonymous substitution rates have been published elsewhere [2, 3]. Chromosomes show constraints on rearrangement and works dealing with that aspect were recently reviewed by Rocha ; he suggested that there is a balance between conservation and change in the organization of the chromosome.
The operon represents the first level of the gene organization. Neighboring genes, especially those in co-directional and in divergent orientation, represent a second organization level because they show a certain functional association revealed by genomic context analysis . Regarding comparisons among closely related species, the gene order conservation, or synteny, represents a third level of organization. Synteny depends on shared ancestry and inter- and intrachromosomal exchanges, and represents a higher relationship between taxa. It was suggested that physiologically important gene clusters could be positively selected, and synteny perhaps reveals the functional constraints of these genes . For the detection of synteny it is necessary first to determine the set of orthologous genes in pairs of organisms and recently an inverse method has proven useful for this .
Recombination/transposition events can easily disrupt synteny. Species of Buchnera and Corynebacterium have low levels of chromosomal rearrangements and lack recA and recBCD orthologs, respectively [8, 9], thus suggesting that recombination is an important factor for loss of synteny. Synteny studies have focused in short gene clusters in eukaryotes [10, 11] while whole chromosome comparative analysis has been done in bacteria and archaea [12–18]. For example, there is a striking conservation between the chromosomes of Escherichia coli and Salmonella typhimurium , and also between those of Brucella melitensis, B. suis and Mesorhizobium loti . However, the synteny analyses for alpha proteobacteria such as those reported in the genome sequence determinations of Brucella melitensis, Sinorhizobium meliloti and Agrobacterium tumefaciens are highly schematic [20–22]. A thorough analysis of Rhizobiales genomes would determine if the key set of genes covering the most important metabolic functions in these organisms are syntenic.
Another factor affecting chromosomal rearrangement is horizontal gene transfer (HGT) which occurs in bacteria [23, 24], but estimating its impact on genome organization has proved a daunting task for two reasons. First, the reliability of compositional methods to detect HGT events has been questioned [25, 26]. Second, phylogenetic methods, albeit more reliable, are not always applicable and can easily be misleading without proper care. The results of a recently published analysis suggest that codon usage compatibility between alien genes and recipient genomes  is a prerequisite for successful HGT events. This premise has been supported by other evidence [28, 29].
Among the most accepted methods to deduce functional relationships of proteins are phylogenetic profile , gene neighboring , and the Rossetta stone method . These methods can give additive information about metabolic networks existing in organisms . ProLinks is a program based on these methods with an extensive library of predicted functional interactions from 83 genomes . Von Mering et al. , also applying these approaches with the STRING program, found global modularities in functional protein networks. A question remains about whether functional linkage differences exist in syntenic and non-syntenic gene clusters.
Rhizobiales is a prokaryotic order belonging to the alpha proteobacteria subdivision; some rhizobial species are intensively studied for their nitrogen-fixing ability when in symbiosis with leguminous plants. The order comprises both plant symbionts and plant and animal pathogens such as Rhizobium, Agrobacterium and Brucella, respectively. In rhizobia, genes responsible for the symbiotic interaction are commonly found on large plasmids or incorporated in a particular stretch of the chromosome called the symbiotic island [36–38]. The physiological potential of the rhizobial chromosome allows cell survival under different conditions. For example, an A. tumefaciens strain containing the symbiotic plasmid from Rhizobium etli induced nodules on legume plants [39, 40], and conversely, an S. meliloti derivative strain with Ri, the rhizogenic induction plasmid, formed root mats on alfalfa plants . Additionally, there are similarities in the parasitic/symbiotic strategies employed by species of the Rhizobiales . Also, it is possible to find diverse life-styles among the members of Enterobacteriales order (gamma proteobacteria): for example, Buchnera is an obligate aphid symbiont; E. coli and S. typhimurium are common gut inhabitants in mammals; and Shigella flexneri, Yersinia pestis and Erwinia carotovora are pathogens, either for animals or plants .
The complete genome sequences of seven species of Rhizobiales were available by 2004, namely S. meliloti , Mesorhizobium loti , Bradyrhizobium japonicum , A. tumefaciens [22, 45], B. melitensis , B. suis  and Rhodopseudomonas palustris . Although their genomes show a certain degree of conservation, variability corresponding to their evolutionary divergence points, microbial life styles and ecological niches was also found. Comparative genomics has captured the attention of researchers as a way of achieving a better understanding of the molecular basis underlying phenomena such as symbiosis and pathogenesis.
We classified the genes of several rhizobial species in order to gain a comprehensive insight into chromosomal conservation and genome rearrangement. Conserved genes among these species can reveal phylogenetic relationships, but also show metabolic strategies useful in understanding the niche diversity in which these organisms usually grow. In particular, syntenic/non-syntenic genes among these species were analyzed in terms of their sequence identity/similarity, physical characteristics of the encoded products and functional relationships among them. Additionally, in order to find more general trends, we compared these results with an analysis performed on genomes belonging to the Enterobacteriales.
Approach, strategy and outline
Our main objective was to enhance our understanding of the functional meaning of the gene arrangement on the bacterial chromosome, taking as examples some genomes from the Rhizobiales and Enterobacteriales. We consider that gene neighboring is not a random trait and gives an adaptive advantage to the cell because the proteins produced are likely to perform related functions. Our belief is that the coordinated expression of genes, organized on the chromosome either as operons or clusters, permits the correct integration of metabolic functions.
Our approaches were: i) to obtain a comprehensive gene classification, applicable to each of the species analyzed and suitable to make comparisons among them, and ii) to detect specific gene characteristics (if any) of each of the classes. In the first approach we identified orthologs among chromosomes, defined those that were syntenic, those in a different replicon, and those that were species-specific. For the second approach, we analyzed gene/protein sequences for identity, calculated the horizontal gene transfer rate for each class and the predicted molecular weight and isoelectric point of the peptides, and inferred the functional relationship in syntenic or non-conserved chromosomal regions. The results are presented in the following order: 1) Synteny in Rhizobiales (gene classification of Rhizobiales; gene organization and microsyntenic region formation; synteny and insertion sequences, horizontal gene transfer and codon usage), 2) Synteny in Enterobacteriales, 3) Sequence analysis of the chromosomal predicted orthologs, 4) Physical characteristics of the translated products of syntenic genes, and 5) Functional roles and linkage of chromosomal predicted orthologs. S. meliloti was taken as reference organism for the comparisons with each A. tumefaciens, M. loti, B. melitensis and E. coli. E. coli was used as base to compare with S. typhimurium, E. carotovora and S. meliloti.
1) Synteny in Rhizobiales
Gene classification of Rhizobiales
Gene classification of Rhizobiales.
In comparison with the Sinorhizobium meliloti chromosome
Genes in chromosome (length in Mb)
Chromosomal orthologs (% of chr. genes)
Syntenic genes (% of chr. orthologs)
Nonsyntenic genes (% of chr. orthologs)
Syntenic genes in regions (% of synt. genes)
Plasmidic homologs (% of chr. genes)
Homologs in rest of Rhizobiales (% of chr. genes)
Species specific genes (% of chr. genes)
Operons (except homologs in rest of Rhizobiales)
Syntenic operons (% of operons)
Nonsyntenic operons (% of operons)
Plasmidic operons (% of operons)
Species specific operons (% of operons)
Mixed operons (% of operons)
We assessed the chromosomal genes with conserved order or synteny (see Methods). Syntenic genes represented about 70–80% of chromosomal predicted orthologs (see Table 1). That is, the conserved chromosomal order of these genes seems favored. The remarkable synteny level is highlighted by a group of 1038 common syntenic genes in all these species. Non-syntenic genes represented from 17% to 35% of the chromosomal predicted orthologs in these organisms. Only 98 non-syntenic genes were common in the four Rhizobiales.
In this way, all chromosomal genes were assigned to the categories mentioned, as shown in Figure 1 for the comparison of S. meliloti with A. tumefaciens. This approach gives a panoramic view about shared and unshared genes with other rhizobial species. Additional file 1 shows the categories for the other comparisons.
A schematic representation of rhizobial chromosomes was obtained relative to the classification of the genes. A high proportion of chromosomal predicted orthologs of the analyzed species were syntenic; however, in the A. tumefaciens linear, M. loti and B. melitensis I chromosomes, plasmidic homologs and species-specific genes were particularly abundant.
Gene organization in operons and microsyntenic regions in Rhizobiales
A relationship between predicted orthologs and operons was also explored. For S. meliloti (in comparison with A. tumefaciens), one half of the syntenic genes was found organized in 303 syntenic operons, a half of the total predicted operons (606); similar proportions were found for the A. tumefaciens circular and B. melitensis I chromosomes. In the A. tumefaciens linear, B. melitensis II and M. loti chromosomes, the proportion was about 21% (Table 1). The first two can be considered as accessory chromosomes and the last is the largest chromosome of the analyzed species. Non-syntenic genes were found organized in operons in a very small proportion in all analyzed Rhizobiales (Table 1).
A relevant level of operon conservation was found. In the comparison of S. meliloti with A. tumefaciens, 50% of the syntenic genes organized in operons were in identical operons, and taken together with those in similar operons (differing by one or two genes), 82% of total syntenic genes were in conserved operons. Similar proportions were found for the other comparisons (data not shown).
The operons formed with plasmidic homologs constituted a small fraction of the predicted operons (Table 1). Also, a reduced proportion was found for species-specific operons, except for Ml. Finally, mixed operons were present in a higher amount and ranged from 37 to 64% of the predicted operons (Table 1). This category revealed the highest rate of chromosomal rearrangements in these organisms. The mixed operons contained 17% of the syntenic genes, on average.
When At-C syntenic genes were compared with Sm chromosome, a high level of colinearity was found (see Additional file 4 panel a). In the case of Ml, the synteny was disrupted possibly due to a conflicting annotation; for Bm-I, an inverse colinearity was obtained, possibly by oriC inversion relative to Sm (see Additional file 4 panels b and c, respectively). In regard to the rearrangement of microsyntenic regions, an extensive chromosomal colinearity was observed in long tracts. For example, in the comparison of Sm and At circular chromosomes, 93 microsyntenic regions were colinear, 24 almost colinear, 18 with drastic changes, and 13 inverted. The schematic representation is shown in Figure 3 panel a, lower part. In the comparison with the At linear and Bm-II chromosomes, highly rearranged structures were found (Figure 3 panel a, upper part and panel c, upper part). Rearrangement of microsyntenic regions on the chromosomes of M. loti (Figure 3 panel b) and B. melitensis I (Figure 3 panel c, lower part), showed a high level of colinearity when the conflicting annotation and the oriC inversion, respectively, were modified (see Methods).
Syntenic genes were organized mainly at two levels: operons and microsyntenic regions. Syntenic operons were as abundant as mixed operons. Extensive blocks of chromosomal colinearity were found, despite rearrangements.
Synteny and insertion sequences, horizontal gene transfer and codon usage
The mobile elements play an important role in chromosomal rearrangement. To determine how these elements were dispersed among microsyntenic regions, insertion sequence (IS) and transposase locations were analyzed. The S. meliloti chromosome contains 51 IS and 68 transposases belonging to diverse families . Of the total transposases, 40 were found with homologs in plasmids, 20 were common with other rhizobial chromosomes, 6 were denoted as species-specific and 2 were common with the A. tumefaciens linear chromosome. Only 14 pairs of IS/transposases (27% of the total) were found inside microsyntenic regions.
Horizontal gene transfer prediction for Rhizobiales.
A. Gene class
S. meliloti genes
B. Gene class
A. tumefaciens genes
C. Gene class
M. loti genes
D. Gene class
B. melitensis genes
2) Synteny in Enterobacteriales (gamma proteobacteria)
To assess the adequacy of applying our synteny analysis to another bacterial clade, we chose two members from the Enterobacteriales (gamma proteobacteria), the closely related E. coli and S. typhimurium genomes and defined their orthologous genes. These organisms contained 3092 predicted orthologs and 95% of them (2943) were syntenic. An extensive chromosomal colinearity with few rearrangements was found (data not shown).
Also, we chose a more phylogenetically distant species, Erwinia carotovora subsp. atroseptica to compare with E. coli. Their genomes comprise 4254 and 4477 genes, respectively. They shared 2477 orthologous genes and these represented about half the total genes in each chromosome. In the detection of synteny between these organisms, 1993 genes (80.4% of the orhologs) fulfilled our requirement and the rest, 484, were classified as non-syntenic. When the genes were assigned to microsyntenic regions, 230 regions were found and contained 92.8% (1849) of total syntenic genes and the rest, 7.2% (144 genes), were detected in the non-conserved tracts (see Additional file 5). 172 non-orthologous genes also formed part of microsyntenic regions. The 230 non-conserved regions contained 2233 genes.
To define the conservation of orthology and synteny of Enterobacteriales genomes in comparison with the Rhizobiales, we compared the S. meliloti and E. coli chromosomes. We found 777 predicted orthologs between them, a proportion that represents only a third of the genes shared in the Rhizobiales. By visual examination no synteny was apparent between the Sm and E. coli chromosomes (see Additional file 4 panel d). With an algorithm, only 198 syntenic genes were assigned to 65 microsyntenic regions.
3) Sequence analysis of the chromosomal predicted orthologs
A comparison of E. coli and S. typhimurium genomes (belonging to the Enterobacteriales) was performed. A very asymmetric distribution curve with a tendency to high identity levels for syntenic products was obtained (Figure 5 panel b); it remarkably resembled that obtained in the comparisons of syntenic products of the Rhizobiales. Interestingly, in the comparison of S. meliloti with E. coli both chromosomal non-syntenic and syntenic products had asymmetric curves with strong tendency to low identity levels (Figure 5, panel c).
To determine the meaning of sequence differences we analyzed the translated syntenic product ArgC, which participates in the arginine biosynthetic pathway. The alignment presented in Additional file 7 panel a shows 111 positions with identical residues and a range from 33 to 113 different residues, particular for each of the species compared. However, sequences from Brucella melitensis and Brucella suis showed only one difference between them (not shown). Synteny is almost absolute in these organisms (Additional file 4 panel e). Changing residues (possibly species signatures) were dispersed along the sequences and varied according to the identity level.
To determine more comprehensively sequence differences and similarities in Rhizobiales (alpha proteobacteria) and Enterobacteriales (gamma proteobacteria), we selected five organisms from each: S. meliloti, A. tumefaciens, M. loti, B. melitensis and R. palustris for the first and E. coli, S. typhimurium, S. flexneri, Buchnera sp. and E. carotovora for the last. We chose some syntenic products from the arginine biosynthetic pathway, namely ArgB, ArgC, ArgD, ArgF, ArgG, and ArgH. The alignment belonging to ArgC is shown in Additional file 7 panel b. As can be observed, there are sequence identities and differences among species from the same order ("species signatures"), but also similarities between species of different orders, albeit at smaller level. Additionally, we found an interesting pattern: proteins from Enterobacteriales showed an almost uniform level of sequence identity and similarity (about 50 and 70%, on average, respectively), however, sequences from Rhizobiales showed a clear increasing tendency, from 25 to 62% in identity, and from 44 to 81% in similarity (see Additional file 7 panel c). These different profiles possibly are related with the particular conditions of the niches occupied by these organisms.
4) Physical characteristics of the translated syntenic genes
Since the functional categories mentioned above for sectors are known to often interact with the cell membrane, a membrane prediction for all syntenic products from the S. meliloti-A. tumefaciens circular chromosomes comparison was assessed. Strikingly, 790 syntenic products were predicted to contain membranal segments. This amount represents almost all membranal proteins coded in the S. meliloti chromosome, considering that bacterial genomes have 18–28% membranal proteins [49–51]. About 70% of predicted membranal proteins with assigned function belonged to transport, energy generation, post-translational modification, cofactor synthesis, amino acid metabolism and central intermediary metabolism categories (Additional file 11). There are reports about membrane-interacting proteins with functions such as amino acid and cofactor biosynthesis and central intermediary metabolism [52–56].
To determine whether the charged amino acid residues were clustered in proteins from the sectors described above, we selected several proteins from each. The residues determining radical changes in pI were observed scattered along the sequences (data not shown).
5) Functional roles and linkage of the chromosomal predicted orthologs
The ProLinks program was used to determine how the chromosomal predicted orthologs are functionally related. Functional links were calculated for all genes in the S. meliloti and E. coli chromosomes and then correlated to their neighbors. For the S. meliloti-A.tumefaciens comparison, the microsyntenic regions presented, on average, 3.68 connections per node (a protein in a network), almost twice the value obtained in the non-conserved regions (2.06). From 205 microsyntenic regions, 104 had functional networks; in the case of non-conserved regions, only 35 presented networks. Networks with less than 6 connections were omitted. The networks of microsyntenic regions presented 1057 syntenic genes and from these 795 (75.2%) were organized in operons. 99 non-syntenic genes were in the networks of the non-conserved tracts, and only 56 (56.5%) were in operons. In the case of the synteny comparison between E. carotovora-E. coli (Enterobacteriales), 230 microsyntenic regions were obtained and from these, 161 presented functional networks, with a connectivity average of 5.33 connections/node. The non conserved regions with networks were 71 with a connectivity average of 2.04 connections/node. From 1497 syntenic genes in the networks, 1106 (73.8%) formed part of operons. Network connectivity obtained in S. meliloti and E. coli is shown in the graph of Additional file 14 (panels a and b, respectively). There is a striking difference in connectivity level in the networks from syntenic (gray bars) or non-conserved regions (black bars). The connectivity levels in syntenic vs non-conserved regions in both organisms were similar using the STRING program (data not shown).
The comparative genomic analysis reported here was useful in finding interesting gene properties. Orthologs with conserved replicon and neighborhood were the principal component of the chromosomes. Compared with non-syntenic genes, syntenic ones had higher identity levels, lower horizontal gene transfer (HGT) rates, showed strongly organized structures as operons and microsyntenic regions and a relative absence of mobile elements. Thus, the syntenic genes can be considered as the chromosomal backbone of the order. Plasmidic homologs were scattered on the chromosomes, and higher HGT rate and linkage to transposases support their extrachromosomal origin. Species-specific genes had the lowest codon richness index, and possibly were acquired in the evolutionary history of each of the species.
In this way, a rhizobial chromosomal origin can be envisioned. The chromosomal orthologs were the gene set derived from the common cenancestor. From these, syntenic genes conserved a relative chromosomal order (and operonic organization) and encode the essential functions of the cell; non-syntenic genes lost the clustering and possibly some came from HGT events. The plasmidic homologs were obtained possibly by mobilization throughout replicons, a nonrare process in the rhizobial phylogenetic branch. The species-specific genes represent the particular gene set of the species and are the most intriguing group due to their unknown functional roles and origin. Work with members of the last group will help to define traits not shared with other species.
The Rhizobiales species analyzed showed a striking proportion of orthologous genes, mainly chromosomal syntenic; non-syntenic genes were found in lower proportion. A large fraction of the first class was common in the four organisms. Syntenic genes had a strong tendency to form operons and almost all were clustered in microsyntenic regions. Additionally, these operons were conserved in pairs of organisms. Therefore, a strong restriction for chromosomal rearrangement is visible. Given that these organisms cover a wide spectrum of environmental distributions, from plant rhizosphere to animal host, the conserved chromosomal tracts may be important to determine the metabolic properties common to the order. Similar results were observed in the Enterobacteriales comparison. In a recent report, using computational inference, Boussau et al.  proposed a common ancestral set of about 3000 genes for proteobacterial genomes.
Although rhizobial genomes shared common traits, important differences were also observed. For example, abundance of plasmidic homologs and species-specific genes in A. tumefaciens (linear), B. melitensis II and M. loti chromosomes confirmed their complex evolutionary histories [20, 22, 59, 60]. The M. loti chromosome presents intensive incorporation of foreign genes by horizontal transfer, such as those belonging to the symbiotic island . In regard to species-specific genes, a large number presented low codon richness index. This category will be reduced with incorporation of other rhizobial genomes into the databases. For example, R. leguminosarum biovar viciae 3841, R. tropici PRF81, R. sp. ANU265 and R. etli CFN42 genomes soon will be available . Mobilization elements participate in chromosomal rearrangement and are abundant in Rhizobiales. These elements can decompose the microsyntenic regions where they are. In this way, all synteny approaches consist of snapshots in chromosomal evolution.
From the comparison of sequence identities among chromosomal predicted orthologs in the rhizobial species analyzed, an interesting characteristic was the differential distribution curves obtained. The asymmetric curves of syntenic products, deviated to high identity levels (with peaks at 70–75%, Figure 5 panel a and Additional file 6 panels a and b), possibly reveal their essentiality and can be compared with those from gamma proteobacteria with high identity (with peak at 95%, Figure 5 panel b). Conversely, non-syntenic genes had curves with lower sequence identity levels reflecting a higher functional versatility. In the case of S. meliloti-E. coli comparison, the identical curves for syntenic and non-syntenic genes with the majority at very low identity values (35%, Figure 5 panel c) reflect their greater phylogenetic distances.
The high identity of syntenic genes indirectly reveals their essential character; for the non-syntenic genes the low identity could represent adaptability to the ecological niche of the species. The identity relatedness in the syntenic genes among rhizobial species (Figure 6 panel a) revealed a cohesively evolved group; additionally the sequence differences were reflected in the theoretical pI plots of the proteins encoded by these genes, with a majority in the diagonal and the rest in sectors with strong pI deviation. Species signatures of the sequences (see Additional file 7) showed a differential level of changed residues and these could represent functional adaptation to a niche; this proposal is supported with the almost identical sequences of B. melitensis and B. suis. On the other hand, invariant peptides perhaps contribute to structural conformation . Experiments in progress in our lab will determine the validity of our proposal. Recently, a conservative change which altered the function of the transcriptional regulator BosR , and pathogenic differences in enteric bacteria due to the expression of PmrD regulators with divergent sequences  were reported.
A typical trait of Enterobacteriales is its pathogenic character, and this means a very intimate, frequent contact with their hosts and almost constant, homogeneous conditions: the extreme case is that of Buchnera sp., an aphid-obligate symbiont. In contrast, Rhizobiales are commonly found in the soil in saprophytic living style, and occasionally they associate with hosts, and therefore face more heterogeneous, variable environments. These features were reflected in the sequence comparisons of proteins from the arginine biosynthetic pathway (see Additional file 7). The syntenic gene organization was different in the Rhizobiales compared with Enterobacteriales, however it is important to note the high orthology and synteny degrees between members of each clade, the essentiality of functions covered by the syntenic genes and the high functional linkage in the microsyntenic regions in each chromosome (see below). From the previous observations we can obtain a general trend: bacterial clades present a particular chromosomal gene arrangement and such plasticity possibly was selected in relation to the niches occupied by these organisms.
As have others, some time ago we found a proteomic bimodal distribution when MW and pI were plotted for rhizobial gene products. The main fraction was located at acid and basic pI's, in a shape resembling butterfly's wings and the body, at neutral pI, presenting a low number of proteins. Recently Knight et al. , in a vast genomic study, reported similar results and additionally, related proteome similarities to shared metabolic features. In this respect, we plotted pI of syntenic products from pair comparisons of studied species in order to detect proteomic similarities in these organisms. It was possible to analyze the pI variability in each of the common syntenic products; however, it is necessary to carry out experimentation to determine the biological role of pI variation. The overrepresented membranal fraction in the syntenic products must be further explored to determine whether they respond to different extracellular/intracellular signals. A rapid evolution for the membranal proteomic fraction was suggested in the same study .
The high proportion of syntenic operons in the microsyntenic regions and the duplicated connections per gene in the network-forming regions (in regard to non-conserved regions), in Rhizobiales but also in Enterobacteriales, supported the functional linkage and interaction of these genes in the conserved tracts (see Additional file 14), and this is a factor which could help to define the role of selective pressure in maintaining the gene order.
Functional characterization of the predicted orthologs deepened our understanding of their cellular roles. In both Rhizobiales and Enterobacteriales, a clear essentiality was observed for chromosomal syntenic genes, agreeing with their sequence restrictions for change. Non-syntenic genes, on the other hand, appeared abundant in functions granting metabolic versatility to the cell. By calculating nonsynonymous/synonymous substitution rates, other authors have shown that in bacteria most conserved genes cover the essential functions of the cell .
Our synteny analysis defined a multi-level gene organization in the bacterial chromosome. Restriction of sequence variation in these genes, with clear essential functional roles, appeared extended to the conservation of chromosomal arrangement. In this way, synteny possibly has an important biological significance in these organisms.
Identification of orthologs
Available genome sequences of the fast growing Rhizobiales, S. meliloti 1021 [accession number GenBank: NC_003047], A. tumefaciens C58 (Cereon) [GenBank: NC_003062 and NC_003063], M. loti MAFF303099 [GenBank: NC_002678] and B. melitensis 16 M [GenBank: NC_003317 and NC_003318] were obtained from the Genome division of the NCBI Entrez system . Genomes of E. coli K12, S. typhimurium LT2 and Erwinia carotovora subsp. atroseptica SCRI1043 [accession numbers GenBank: NC_00913NC_003197, and NC_004547], were used for synteny and functional analysis. The genome of B. suis 1330 [accession numbers GenBank: NC_004310 and NC_004311 for chromosomes I and II, respectively] was used for synteny and sequence identity analysis. Genomes of Bradyrhizobium japonicum USDA110 [accession number GenBank: NC_004463] and Rhodopseudomonas palustris CGA009, also belonging to the alpha proteobacteria, were not considered in the main analysis because of their more distant phylogenetic relationship with the fast-growing rhizobia. To obtain a comprehensive view of shared genes among Rhizobiales, we differentiated genes by orthology and their presence in the same or different replicons in the analyzed species. Chromosomal orthologs were assigned by the best bidirectional hit between pairs of organisms, using the Fasta34 program . Unidirectional best hits (homologs) were considered to cover complete chromosomal gene number (see Results) and for detection of chromosomal genes with plasmidic homologs. Parameters were: an identity of at least 50%, overlapping by at least in 150 nt and an expectance (E) score of <10-3. The base organism for comparisons was S. meliloti 1021. GenBank accession numbers of proteins used for alignments in the order ArgB, ArgC, ArgD, ArgF, ArgG and ArgH, were as follows. Rhizobiales. R. palustris: NP_945982.1, NP_947833.1, NP_950107.1, NP_950106.1, NP_945745.1, NP_950077.1; B. melitensis: NP_541250.1, NP_540088.1, NP_540538.1, NP_540537.1, NP_540787.1, NP_539004.1; M. loti: NP_105609.1, NP_108547.1, NP_106269.1, NP_106270.1, NP_105253.1, NP_104594.1; A. tumefaciens: NP_353412.1, NP_354256.1, NP_353456.1, NP_353457.1, NP_355604.1, NP_357013.1; S. meliloti: NP_384545.1, NP_385346.1, NP_384623.1, NP_384624.1, NP_387315.1, NP_386753.1. Enterobacteriales. Buchnera sp.: NP_239886.1, NP_239885.1, NP_240341.1, NP_240186.1, NP_239887.1, NP_239888.1; E. coli: NP_418394.1, NP_418393.1, NP_417818.1, NP_414807.1, NP_417640.1, NP_418395.1; E. carotovora: YP_048320.1, YP_048319.1, YP_052152.1, YP_048510.1, YP_048232.1, YP_048321.1S. typhimurium: NP_457937.1, NP_457938.1, NP_458434.1, NP_458882.1, NP_457671.1, NP_457936.1; S. flexneri: NP_838925.1, NP_838926.1, NP_839526.1, NP_839629.1, NP_838682.1, NP_838924.1. All data sets are available on request.
Procedures for detection of syntenic genes, microsyntenic region formation and operon similarity
To consider chromosomal orthologs as syntenic among pairs of organisms, at least two genes must remain contiguous in both chromosomes. Microsyntenic region formation and extension fulfilled the following criterion: a pair of predicted orthologs separated from at least one other by no more than three genes (from the rest of categories). The minimal region was formed by a stretch containing three syntenic genes. Operon prediction was performed as reported by Moreno-Hagelsieb and Collado-Vides . Rearrangements were graphed using initial positions of the microsyntenic regions in each chromosome. The syntenic regions of M. loti and B. melitensis I chromosomes were graphed so as to increase the colinearity in these replicons. The M. loti chromosome was segmented into two halves at 3.5 Mb position and the fragment covering from 3.5 to 7.0 Mb was located in the first position and then both halves were aligned with microsyntenic regions of S. meliloti. In the case of Brucella chromosome I, the origin was inverted. Graphs were obtained using the GenVision program (DNAStar Inc., Madison, WI). For operon similarity calculation, a limit of three different genes in each operon was allowed. All data sets are available on request.
Detection of horizontal gene transfer
Prediction of horizontally transferred genes in the Rhizobiales genomes was performed using the method described by Medrano-Soto et al. . Briefly, it on based in similar gene length, maximum global protein identity, conflicting phylogenies and codon usage of xenologous genes. In the case of A. tumefaciens, this rate was calculated using sequences and annotation obtained from the U. of Washington Sequencing Project . Species-specific (or orphan) genes were not considered by two reasons: (1) the lack of orthologs in other genomes precluded phylogenetic analysis, and (2) impossibility of correlating these genes to a synteny.
Sequence comparison of chromosomal orthologs in Rhizobiales
The identity of peptidic sequences of chromosomal predicted orthologs were used to graph the distribution curves. The asymmetry of distribution curves, or skewness, was calculated by the asymmetry coefficient of Pearson (g1) as described elsewhere . To correlate nucleotide sequence identities of the chromosomal predicted orthologs in the four rhizobial genomes, the gene identity of the predicted orthologs from the S. meliloti-M. loti comparison was graphed in progressive order. Then, corresponding predicted orthologs of the other comparisons were located at their corresponding identity percentages. Correlation coefficient values were calculated by the Pearson method. Plasmidic homologs and species-specific genes were not graphed because they have no counterparts in the pairs of analyzed genomes.
Theoretical proteome and transmembranal protein prediction
Theoretical proteomes were obtained by calculating molecular weight and isoelectric point for each translated chromosomal predicted ortholog. Both parameters were estimated with the pI/MW prediction tool of the Laquip Proteomic Team page . To determine the set of orthologs coding proteins predicted to interact with the cell membrane, we used the TMAP program, version 46 , available in the EMBOSS package (European Molecular Biology Open Software Suite, ). Correlation coefficient values were obtained with the Pearson method. Alignments were performed with ClustalW .
Functional classification of chromosomal predicted orthologs
Chromosomal predicted orthologs were assigned to the functional classes used for Agrobacterium tumefaciens C58 in the U. of Washington genome report . For the distribution of functions coded in the microsyntenic regions along the S. meliloti chromosome, classes were assigned into Operational (including amino acid, fatty acid, carbohydrate and nucleotide metabolism, energy generation, central intermediary metabolism, transport and cofactor synthesis), Informational (DNA metabolism, transcription, translation and regulatory functions), and Cellular processes (cell envelope, cell division, secretion and chemotaxis) superclasses. This grouping, except for Cellular processes, is similar to that of Rivera et al. . For functional relationship inference, the ProLinks  and STRING databases  were used with permission. The confidence levels used were 0.6 and 0.9, respectively. Resulting networks with ProLinks with less than 6 links were omitted from the count. Networks were constructed with the Pajek Program (written by A. Vlado), version 1.02, available in the web . Assignment of metabolic pathways was performed using the MetaCyc database , with permission.
This work was partially supported by the grant 028 from the Consejo Nacional de Ciencia y Tecnología-México.
We wish thank to Gabriel Moreno-Hagelsieb for providing the operon prediction, César Bonavides for help with metabolic pathway assignment, Rafael Palacios, Michael Dunn and Yolanda Mora for manuscript review, and an anonymous referee for valuable suggestions.
- Genomes Online Database. [http://www.genomesonline.org]
- Itoh T, Martin W, Nei M: Acceleration of genomic evolution caused by enhanced mutation rate in endocellular symbionts. Proc Natl Acad Sci U S A. 2002, 99 (20): 12944-12948. 10.1073/pnas.192449699.PubMed CentralView ArticlePubMedGoogle Scholar
- Jordan IK, Rogozin IB, Wolf YI, Koonin EV: Microevolutionary Genomics of Bacteria. Theoretical Population Biology. 2002, 61 (4): 435-10.1006/tpbi.2002.1588.View ArticlePubMedGoogle Scholar
- Rocha EP: Order and disorder in bacterial genomes. Curr Opin Microbiol. 2004, 7 (5): 519-527. 10.1016/j.mib.2004.08.006.View ArticlePubMedGoogle Scholar
- Korbel JO, Jensen LJ, von Mering C, Bork P: Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat Biotechnol. 2004, 22 (7): 911-917. 10.1038/nbt988.View ArticlePubMedGoogle Scholar
- Tamames J: Evolution of gene order conservation in prokaryotes. Genome Biol. 2001, 2 (6): R0020-10.1186/gb-2001-2-6-research0020.View ArticleGoogle Scholar
- Zheng XH, Lu F, Wang ZY, Zhong F, Hoover J, Mural R: Using shared genomic synteny and shared protein functions to enhance the identification of orthologous gene pairs. Bioinformatics. 2005, 21 (6): 703-710. 10.1093/bioinformatics/bti045.View ArticlePubMedGoogle Scholar
- Nakamura Y, Nishio Y, Ikeo K, Gojobori T: The genome stability in Corynebacterium species due to lack of the recombinational repair system. Gene. 2003, 317 (1-2): 149-155. 10.1016/S0378-1119(03)00653-X.View ArticlePubMedGoogle Scholar
- Shigenobu S, Watanabe H, Hattori M, Sakaki Y, Ishikawa H: Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature. 2000, 407 (6800): 81-86. 10.1038/35024074.View ArticlePubMedGoogle Scholar
- Cannon SB, McCombie WR, Sato S, Tabata S, Denny R, Palmer L, Katari M, Young ND, Stacey G: Evolution and microsynteny of the apyrase gene family in three legume genomes. Mol Genet Genomics. 2003, 270 (4): 347-361. 10.1007/s00438-003-0928-x.View ArticlePubMedGoogle Scholar
- Gualtieri G, Bisseling T: Microsynteny between the Medicago truncatula SYM2-orthologous genomic region and another region located on the same chromosome arm. Theor Appl Genet. 2002, 105 (5): 771-779. 10.1007/s00122-002-0949-6.View ArticlePubMedGoogle Scholar
- Bankier AT, Spriggs HF, Fartmann B, Konfortov BA, Madera M, Vogel C, Teichmann SA, Ivens A, Dear PH: Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum. Genome Res. 2003, 13 (8): 1787-1799.PubMed CentralPubMedGoogle Scholar
- Bannantine JP, Zhang Q, Li LL, Kapur V: Genomic homogeneity between Mycobacterium avium subsp. avium and Mycobacterium avium subsp. paratuberculosis belies their divergent growth rates. BMC Microbiol. 2003, 3 (1): 10-10.1186/1471-2180-3-10.PubMed CentralView ArticlePubMedGoogle Scholar
- Buchrieser C, Rusniok C, Kunst F, Cossart P, Glaser P: Comparison of the genome sequences of Listeria monocytogenes and Listeria innocua: clues for evolution and pathogenicity. FEMS Immunol Med Microbiol. 2003, 35 (3): 207-213. 10.1016/S0928-8244(02)00448-0.View ArticlePubMedGoogle Scholar
- Eppinger M, Baar C, Raddatz G, Huson DH, Schuster SC: Comparative analysis of four Campylobacterales. Nat Rev Microbiol. 2004, 2 (11): 872-885. 10.1038/nrmicro1024.View ArticlePubMedGoogle Scholar
- Horimoto K, Fukuchi S, Mori K: Comprehensive comparison between locations of orthologous genes on archaeal and bacterial genomes. Bioinformatics. 2001, 17 (9): 791-802. 10.1093/bioinformatics/17.9.791.View ArticlePubMedGoogle Scholar
- Moreira LM, de Souza RF, Almeida NFJ, Setubal JC, Oliveira JC, Furlan LR, Ferro JA, da Silva AC: Comparative genomics analyses of citrus-associated bacteria. Annu Rev Phytopathol. 2004, 42: 163-184. 10.1146/annurev.phyto.42.040803.140310.View ArticlePubMedGoogle Scholar
- Read TD, Brunham RC, Shen C, Gill SR, Heidelberg JF, White O, Hickey EK, Peterson J, Utterback T, Berry K, Bass S, Linher K, Weidman J, Khouri H, Craven B, Bowman C, Dodson R, Gwinn M, Nelson W, DeBoy R, Kolonay J, McClarty G, Salzberg SL, Eisen J, Fraser CM: Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 2000, 28 (6): 1397-1406. 10.1093/nar/28.6.1397.PubMed CentralView ArticlePubMedGoogle Scholar
- Krawiec S, Riley M: Organization of the bacterial chromosome. Microbiol Rev. 1990, 54 (4): 502-539.PubMed CentralPubMedGoogle Scholar
- Paulsen IT, Seshadri R, Nelson KE, Eisen JA, Heidelberg JF, Read TD, Dodson RJ, Umayam L, Brinkac LM, Beanan MJ, Daugherty SC, Deboy RT, Durkin AS, Kolonay JF, Madupu R, Nelson WC, Ayodeji B, Kraul M, Shetty J, Malek J, Van Aken SE, Riedmuller S, Tettelin H, Gill SR, White O, Salzberg SL, Hoover DL, Lindler LE, Halling SM, Boyle SM, Fraser CM: The Brucella suis genome reveals fundamental similarities between animal and plant pathogens and symbionts. Proc Natl Acad Sci U S A. 2002, 99 (20): 13148-13153. 10.1073/pnas.192319099.PubMed CentralView ArticlePubMedGoogle Scholar
- Galibert F, Finan TM, Long SR, Puhler A, Abola P, Ampe F, Barloy-Hubler F, Barnett MJ, Becker A, Boistard P, Bothe G, Boutry M, Bowser L, Buhrmester J, Cadieu E, Capela D, Chain P, Cowie A, Davis RW, Dreano S, Federspiel NA, Fisher RF, Gloux S, Godrie T, Goffeau A, Golding B, Gouzy J, Gurjal M, Hernandez-Lucas I, Hong A, Huizar L, Hyman RW, Jones T, Kahn D, Kahn ML, Kalman S, Keating DH, Kiss E, Komp C, Lelaure V, Masuy D, Palm C, Peck MC, Pohl TM, Portetelle D, Purnelle B, Ramsperger U, Surzycki R, Thebault P, Vandenbol M, Vorholter FJ, Weidner S, Wells DH, Wong K, Yeh KC, Batut J: The composite genome of the legume symbiont Sinorhizobium meliloti. Science. 2001, 293 (5530): 668-672.View ArticlePubMedGoogle Scholar
- Goodner B, Hinkle G, Gattung S, Miller N, Blanchard M, Qurollo B, Goldman BS, Cao Y, Askenazi M, Halling C, Mullin L, Houmiel K, Gordon J, Vaudin M, Iartchouk O, Epp A, Liu F, Wollam C, Allinger M, Doughty D, Scott C, Lappas C, Markelz B, Flanagan C, Crowell C, Gurson J, Lomo C, Sear C, Strub G, Cielo C, Slater S: Genome Sequence of the Plant Pathogen and Biotechnology Agent Agrobacterium tumefaciens C58. Science. 2001, 294 (5550): 2323-2328. 10.1126/science.1066803.View ArticlePubMedGoogle Scholar
- Doolittle WF: Lateral genomics. Trends Cell Biol. 1999, 9 (12): M5-8. 10.1016/S0962-8924(99)01664-5.View ArticlePubMedGoogle Scholar
- Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405 (6784): 299-304. 10.1038/35012500.View ArticlePubMedGoogle Scholar
- Koski LB, Morton RA, Golding GB: Codon bias and base composition are poor indicators of horizontally transferred genes. Mol Biol Evol. 2001, 18 (3): 404-412.View ArticlePubMedGoogle Scholar
- Wang B: Limitations of compositional approach to identifying horizontally transferred genes. J Mol Evol. 2001, 53 (3): 244-250. 10.1007/s002390010214.View ArticlePubMedGoogle Scholar
- Medrano-Soto A, Moreno-Hagelsieb G, Vinuesa P, Christen JA, Collado-Vides J: Successful lateral transfer requires codon usage compatibility between foreign genes and recipient genomes. Mol Biol Evol. 2004, 21 (10): 1884-1894. 10.1093/molbev/msh202.View ArticlePubMedGoogle Scholar
- Hao W, Golding GB: Patterns of bacterial gene movement. Mol Biol Evol. 2004, 21 (7): 1294-1307. 10.1093/molbev/msh129.View ArticlePubMedGoogle Scholar
- Jain R, Rivera MC, Moore JE, Lake JA: Horizontal gene transfer accelerates genome innovation and evolution. Mol Biol Evol. 2003, 20 (10): 1598-1602. 10.1093/molbev/msg154.View ArticlePubMedGoogle Scholar
- Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO: Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci U S A. 1999, 96 (8): 4285-4288. 10.1073/pnas.96.8.4285.PubMed CentralView ArticlePubMedGoogle Scholar
- Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N: The use of gene clusters to infer functional coupling. Proc Natl Acad Sci U S A. 1999, 96 (6): 2896-2901. 10.1073/pnas.96.6.2896.PubMed CentralView ArticlePubMedGoogle Scholar
- Marcotte EM, Pellegrini M, Ho-Leung N, Rice D, Yeates T, Elsenberg D: Detecting protein function and protein-protein interactions from genome sequences. Science. 1999, 285 (5428): 751-753. 10.1126/science.285.5428.751.View ArticlePubMedGoogle Scholar
- Yanai I, DeLisi C: The society of genes: networks of functional links between genes from comparative genomics. Genome Biol. 2002, 3 (11): R0064-10.1186/gb-2002-3-11-research0064.View ArticleGoogle Scholar
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 2004, 5 (5): R35-10.1186/gb-2004-5-5-r35.PubMed CentralView ArticlePubMedGoogle Scholar
- von Mering C, Zdobnov EM, Tsoka S, Ciccarelli FD, Pereira-Leal JB, Ouzounis CA, Bork P: Genome evolution reveals biochemical networks and functional modules. Proc Natl Acad Sci U S A. 2003, 100 (26): 15428-15433. 10.1073/pnas.2136809100.PubMed CentralView ArticlePubMedGoogle Scholar
- Freiberg C, Fellay R, Bairoch A, Broughton WJ, Rosenthal A, Perret X: Molecular basis of symbiosis between Rhizobium and legumes. Nature. 1997, 387 (6631): 394-401. 10.1038/387394a0.View ArticlePubMedGoogle Scholar
- Gonzalez V, Bustos P, Ramirez-Romero MA, Medrano-Soto A, Salgado H, Hernandez-Gonzalez I, Hernandez-Celis JC, Quintero V, Moreno-Hagelsieb G, Girard L, Rodriguez O, Flores M, Cevallos MA, Collado-Vides J, Romero D, Davila G: The mosaic structure of the symbiotic plasmid of Rhizobium etli CFN42 and its relation to other symbiotic genome compartments. Genome Biol. 2003, 4 (6): R36-10.1186/gb-2003-4-6-r36.PubMed CentralView ArticlePubMedGoogle Scholar
- Gottfert M, Rothlisberger S, Kundig C, Beck C, Marty R, Hennecke H: Potential symbiosis-specific genes uncovered by sequencing a 410-kilobase DNA region of the Bradyrhizobium japonicum chromosome. J Bacteriol. 2001, 183 (4): 1405-1412. 10.1128/JB.183.4.1405-1412.2001.PubMed CentralView ArticlePubMedGoogle Scholar
- Hirsch AM, Drake D, Jacobs TW, Long SR: Nodules are induced on alfalfa roots by Agrobacterium tumefaciens and Rhizobium trifolii containing small segments of the Rhizobium meliloti nodulation region. J Bacteriol. 1985, 161 (1): 223-230.PubMed CentralPubMedGoogle Scholar
- Martinez E, Palacios R, Sanchez F: Nitrogen-fixing nodules induced by Agrobacterium tumefaciens harboring Rhizobium phaseoli plasmids. J Bacteriol. 1987, 169 (6): 2828-2834.PubMed CentralPubMedGoogle Scholar
- Weller SA, Stead DE, Young JP: Acquisition of an Agrobacterium Ri plasmid and pathogenicity by other alpha-Proteobacteria in cucumber and tomato crops affected by root mat. Appl Environ Microbiol. 2004, 70 (5): 2779-2785. 10.1128/AEM.70.5.2779-2785.2004.PubMed CentralView ArticlePubMedGoogle Scholar
- Moran NA, Wernegreen JJ: Lifestyle evolution in symbiotic bacteria: insights from genomics. Trends In Ecology And Evolution. 2000, 15 (8): 321-326. 10.1016/S0169-5347(00)01902-9.View ArticlePubMedGoogle Scholar
- Kaneko T, Nakamura Y, Sato S, Asamizu E, Kato T, Sasamoto S, Watanabe A, Idesawa K, Ishikawa A, Kawashima K, Kimura T, Kishida Y, Kiyokawa C, Kohara M, Matsumoto M, Matsuno A, Mochizuki Y, Nakayama S, Nakazaki N, Shimpo S, Sugimoto M, Takeuchi C, Yamada M, Tabata S: Complete genome structure of the nitrogen-fixing symbiotic bacterium Mesorhizobium loti. DNA Res. 2000, 7 (6): 331-338. 10.1093/dnares/7.6.331.View ArticlePubMedGoogle Scholar
- Kaneko T, Nakamura Y, Sato S, Minamisawa K, Uchiumi T, Sasamoto S, Watanabe A, Idesawa K, Iriguchi M, Kawashima K, Kohara M, Matsumoto M, Shimpo S, Tsuruoka H, Wada T, Yamada M, Tabata S: Complete genomic sequence of nitrogen-fixing symbiotic bacterium Bradyrhizobium japonicum USDA110. DNA Res. 2002, 9 (6): 189-197. 10.1093/dnares/9.6.189.View ArticlePubMedGoogle Scholar
- Wood DW, Setubal JC, Kaul R, Monks DE, Kitajima JP, Okura VK, Zhou Y, Chen L, Wood GE, Almeida NFJ, Woo L, Chen Y, Paulsen IT, Eisen JA, Karp PD, Bovee DS, Chapman P, Clendenning J, Deatherage G, Gillet W, Grant C, Kutyavin T, Levy R, Li MJ, McClelland E, Palmieri A, Raymond C, Rouse G, Saenphimmachak C, Wu Z, Romero P, Gordon D, Zhang S, Yoo H, Tao Y, Biddle P, Jung M, Krespan W, Perry M, Gordon-Kamm B, Liao L, Kim S, Hendrick C, Zhao ZY, Dolan M, Chumley F, Tingey SV, Tomb JF, Gordon MP, Olson MV, Nester EW: The genome of the natural genetic engineer Agrobacterium tumefaciens C58. Science. 2001, 294 (5550): 2317-2323. 10.1126/science.1066804.View ArticlePubMedGoogle Scholar
- DelVecchio VG, Kapatral V, Redkar RJ, Patra G, Mujer C, Los T, Ivanova N, Anderson I, Bhattacharyya A, Lykidis A, Reznik G, Jablonski L, Larsen N, D'Souza M, Bernal A, Mazur M, Goltsman E, Selkov E, Elzer PH, Hagius S, O'Callaghan D, Letesson JJ, Haselkorn R, Kyrpides N, Overbeek R: The genome sequence of the facultative intracellular pathogen Brucella melitensis. Proc Natl Acad Sci U S A. 2002, 99 (1): 443-448. 10.1073/pnas.221575398.PubMed CentralView ArticlePubMedGoogle Scholar
- Larimer FW, Chain P, Hauser L, Lamerdin J, Malfatti S, Do L, Land ML, Pelletier DA, Beatty JT, Lang AS, Tabita FR, Gibson JL, Hanson TE, Bobst C, Torres JL, Peres C, Harrison FH, Gibson J, Harwood CS: Complete genome sequence of the metabolically versatile photosynthetic bacterium Rhodopseudomonas palustris. Nat Biotechnol. 2004, 22 (1): 55-61. 10.1038/nbt923.View ArticlePubMedGoogle Scholar
- COGs: Phylogenetic classification of protein encoded in complete genomes database. [http://0-www.ncbi.nlm.nih.gov.brum.beds.ac.uk/COG]
- Gerstein M: Patterns of protein-fold usage in eight microbial genomes: a comprehensive structural census. Proteins. 1998, 33 (4): 518-534. 10.1002/(SICI)1097-0134(19981201)33:4<518::AID-PROT5>3.0.CO;2-J.View ArticlePubMedGoogle Scholar
- Gerstein M, Levitt M: A structural census of the current population of protein sequences. Proc Natl Acad Sci U S A. 1997, 94 (22): 11911-11916. 10.1073/pnas.94.22.11911.PubMed CentralView ArticlePubMedGoogle Scholar
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL: Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001, 305 (3): 567-580. 10.1006/jmbi.2000.4315.View ArticlePubMedGoogle Scholar
- Fernandez-Herrero LA, Badet-Denisot MA, Badet B, Berenguer J: glmS of Thermus thermophilus HB8: an essential gene for cell-wall synthesis identified immediately upstream of the S-layer gene. Mol Microbiol. 1995, 17 (1): 1-12.View ArticlePubMedGoogle Scholar
- Mac Siomoin RA, Nakata N, Murai T, Yoshikawa M, Tsuji H, Sasakawa C: Identification and characterization of ispA, a Shigella flexneri chromosomal gene essential for normal in vivo cell division and intracellular spreading. Mol Microbiol. 1996, 19 (3): 599-609. 10.1046/j.1365-2958.1996.405941.x.View ArticlePubMedGoogle Scholar
- Maggio-Hall LA, Claas KR, Escalante-Semerena JC: The last step in coenzyme B(12) synthesis is localized to the cell membrane in bacteria and archaea. Microbiology. 2004, 150 (Pt 5): 1385-1395. 10.1099/mic.0.26952-0.View ArticlePubMedGoogle Scholar
- Wu G, Williams HD, Gibson F, Poole RK: Mutants of Escherichia coli affected in respiration: the cloning and nucleotide sequence of ubiA, encoding the membrane-bound p-hydroxybenzoate:octaprenyltransferase. J Gen Microbiol. 1993, 139 (8): 1795-1805.View ArticlePubMedGoogle Scholar
- Xia T, Song J, Zhao G, Aldrich H, Jensen RA: The aroQ-encoded monofunctional chorismate mutase (CM-F) protein is a periplasmic enzyme in Erwinia herbicola. J Bacteriol. 1993, 175 (15): 4729-4737.PubMed CentralPubMedGoogle Scholar
- Kosak ST, Groudine M: Gene order and dynamic domains. Science. 2004, 306 (5696): 644-647. 10.1126/science.1103864.View ArticlePubMedGoogle Scholar
- Boussau B, Karlberg EO, Frank AC, Legault BA, Andersson SG: Computational inference of scenarios for alpha-proteobacterial genome evolution. Proc Natl Acad Sci U S A. 2004, 101 (26): 9722-9727. 10.1073/pnas.0400975101.PubMed CentralView ArticlePubMedGoogle Scholar
- Sullivan JT, Ronson CW: Evolution of rhizobia by acquisition of a 500-kb symbiosis island that integrates into a phe-tRNA gene. Proc Natl Acad Sci U S A. 1998, 95 (9): 5145-5149. 10.1073/pnas.95.9.5145.PubMed CentralView ArticlePubMedGoogle Scholar
- Wong K, Golding GB: A phylogenetic analysis of the pSymB replicon from the Sinorhizobium meliloti genome reveals a complex evolutionary history. Can J Microbiol. 2003, 49 (4): 269-280. 10.1139/w03-037.View ArticlePubMedGoogle Scholar
- Prakash T, Ramakrishnan C, Dash D, Brahmachari SK: Conformational analysis of invariant peptide sequences in bacterial genomes. J Mol Biol. 2005, 345 (5): 937-955. 10.1016/j.jmb.2004.11.008.View ArticlePubMedGoogle Scholar
- Seshu J, Boylan JA, Hyde JA, Swingle KL, Gherardini FC, Skare JT: A conservative amino acid change alters the function of BosR, the redox regulator of Borrelia burgdorferi. Mol Microbiol. 2004, 54 (5): 1352-1363. 10.1111/j.1365-2958.2004.04352.x.View ArticlePubMedGoogle Scholar
- Winfield MD, Groisman EA: Phenotypic differences between Salmonella and Escherichia coli resulting from the disparate regulation of homologous genes. Proc Natl Acad Sci U S A. 2004, 101 (49): 17162-17167. 10.1073/pnas.0406038101.PubMed CentralView ArticlePubMedGoogle Scholar
- Knight CG, Kassen R, Hebestreit H, Rainey PB: Global analysis of predicted proteomes: functional adaptation of physical properties. Proc Natl Acad Sci U S A. 2004, 101 (22): 8390-8395. 10.1073/pnas.0307270101.PubMed CentralView ArticlePubMedGoogle Scholar
- Jordan IK, Rogozin IB, Wolf YI, Koonin EV: Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 2002, 12 (6): 962-968. 10.1101/gr.87702. Article published online before print in May 2002.PubMed CentralView ArticlePubMedGoogle Scholar
- Completed Microbial Genomes. [http://0-www.ncbi.nlm.nih.gov.brum.beds.ac.uk/genomes/MICROBES/Complete.html]
- Persson B, Argos P: Prediction of transmembrane segments in proteins utilising multiple sequence alignments. J Mol Biol. 1994, 237 (2): 182-192. 10.1006/jmbi.1994.1220.View ArticlePubMedGoogle Scholar
- Moreno-Hagelsieb G, Collado-Vides J: A powerful non-homology method for the prediction of operons in prokaryotes. Bioinformatics. 2002, 18 Suppl 1: S329-36.View ArticlePubMedGoogle Scholar
- Sokal RR, Rohlf FJ: Biometry: The pinciples and practice of statistics in biological research. Edited by: 3rd . 2003, New York , WH Freeman and Company, 109-117.Google Scholar
- Laquip Proteomic Team. [http://proteome.ibi.unicamp.br/tools]
- Pearson WR, Lipman DJ: Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A. 1988, 85 (8): 2444-2448.PubMed CentralView ArticlePubMedGoogle Scholar
- Software EMBO: EMBOSS. [http://www.emboss.org]
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22 (22): 4673-4680.PubMed CentralView ArticlePubMedGoogle Scholar
- Rivera MC, Jain R, Moore JE, Lake JA: Genomic evidence for two functionally distinct gene classes. Proc Natl Acad Sci U S A. 1998, 95 (11): 6239-6244. 10.1073/pnas.95.11.6239.PubMed CentralView ArticlePubMedGoogle Scholar
- Pajek. [http://vlado.fmf.uni-lj.si/pub/networks/pajek]
- Krieger CJ, Zhang P, Mueller LA, Wang A, Paley S, Arnaud M, Pick J, Rhee SY, Karp PD: MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res. 2004, 32 (Database issue): D438-42. 10.1093/nar/gkh100.PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.