- Research article
- Open Access
The metazoan history of the COE transcription factors. Selection of a variant HLH motif by mandatory inclusion of a duplicated exon in vertebrates
- Virginie Daburon†1,
- Sébastien Mella†1, 2,
- Jean-Louis Plouhinec3, 4,
- Sylvie Mazan3,
- Michèle Crozatier1 and
- Alain Vincent1
© Daburon et al; licensee BioMed Central Ltd. 2008
Received: 21 January 2008
Accepted: 02 May 2008
Published: 02 May 2008
The increasing number of available genomic sequences makes it now possible to study the evolutionary history of specific genes or gene families. Transcription factors (TFs) involved in regulation of gene-specific expression are key players in the evolution of metazoan development. The low complexity COE (Collier/Olfactory-1/Early B-Cell Factor) family of transcription factors constitutes a well-suited paradigm for studying evolution of TF structure and function, including the specific question of protein modularity. Here, we compare the structure of coe genes within the metazoan kingdom and report on the mechanism behind a vertebrate-specific exon duplication.
COE proteins display a modular organisation, with three highly conserved domains: a COE-specific DNA-binding domain (DBD), an Immunoglobulin/Plexin/transcription (IPT) domain and an atypical Helix-Loop-Helix (HLH) motif. Comparison of the splice structure of coe genes between cnidarians and bilaterians shows that the ancestral COE DBD was built from 7 separate exons, with no evidence for exon shuffling with other metazoan gene families. It also confirms the presence of an ancestral H1LH2 motif in all COE proteins, which partly overlaps the repeated H2d-H2a motif first identified in rodent EBF. Electrophoretic Mobility Shift Assays show that formation of COE dimers is mediated by this ancestral motif. The H2d-H2a α-helical repetition appears to be a vertebrate characteristic that originated from a tandem exon duplication which took place prior to the split between gnathostomes and cyclostomes. We put forward a two-step model for the inclusion of this exon in the vertebrate transcripts.
Three main features in the history of the coe gene family can be inferred from these analyses: (i) each conserved domain of the ancestral coe gene was built from multiple exons and the same scattered structure has been maintained throughout metazoan evolution. (ii) There exists a single coe gene copy per metazoan genome except in vertebrates. The H2a-H2d duplication that is specific to vertebrate proteins provides an example of a novel vertebrate characteristic, which may have been fixed early in the gnathostome lineage. (iii) This duplication provides an interesting example of counter-selection of alternative splicing.
Thanks to the increasing number of available genomic sequences, it has become possible to study the evolutionary history of specific genes or gene families in relation to their co-option in innovations that have punctuated the evolutionary diversification of metazoans. Transcription factors (TFs) involved in regulation of gene-specific expression are key players in the evolution of development. The COE family of transcription factors takes its name from the founding members of the family, Collier (Col) and Olfactory-1/Early-B-Cell Factor (Olf-1/EBF), isolated from Drosophila and rodents, respectively [1–3]. While there was no evidence for coe genes in fungi, plants, or any of the various phyla of protozoans, identification of a cnidarian coe gene, Nvcoe, in the anthozoan sea anemone Nematostella vectensis suggested that COE proteins appeared with metazoa, a conclusion strengthened by the identification of COE members in other cnidaria and in porifera. While up to 4 ebf paralogs have been identified in vertebrates [6–8], a single coe member has been identified in all the other animals for which genome sequences have become available, suggesting that expansion of the coe gene family only occurred at the origin of vertebrates.
Expression profiles of coe genes in embryos from various protostomes and deuterostomes and N. vectensis have revealed a common feature, namely, expression in subsets of sensory neurons [4, 9–14]. This feature raised the possibility that one ancestral role of COE proteins was to participate in the specification of specialised sensory cells and the ontogeny of an elaborate nervous system. However, genetic analyses performed in mice and, more recently, Drosophila raised the possibility that another ancestral function of COE proteins could have been in the development of cellular immunity [15, 16]. The diversity of COE protein functions contrasts strikingly with the high degree of primary sequence conservation and lack of expansion of this family of TFs throughout metazoan evolution. Owing to its low complexity, the COE family constitutes a well-suited paradigm for studying evolution of TF structure and function, including the specific question of protein modularity.
Pioneering analysis of EBF identified three functional domains [1, 17]: i) an amino-terminal DBD of about 210 amino acids, which is the signature of COE proteins; ii) a Helix-Helix dimerisation motif made of two tandemly arranged α-helical repeats showing limited sequence similarity to the HLH motif described in basic helix-loop-helix (b-HLH) proteins; iii) a transcription-activating domain without a marked specific signature. The presence of an Ig-like/Plexin/Transcription Factor (IPT) domain between the DBD and HLH domains was also noticed, but the function of this domain remains unknown [18, 19]. Comparison between Col and EBF showed that the DBD, IPT and HLH domains have been particularly well conserved during evolution. However, one of the tandemly arranged α-helices noted in EBF/Olf-1 was missing in Col. This, and further examination of the Col and EBF primary sequences, led us to postulate the existence, in all COE proteins, of an HLH motif distinct from and partly overlapping the motif initially identified in EBF and Olf-1 [2, 11]. This motif is designated below as H1LH2, while the vertebrate-specific motif is designated as H2d-H2a, H2d and H2a (d for duplicated, a for ancestral) corresponding to the duplication of the single H2 helix found in Drosophila.
To gain more insight into the evolutionary history of coe genes, we compared in detail their genomic structure between various metazoan phyla. This comparison shows that the metazoan ancestor COE DBD was built from at least 7 separate exons, with no evidence for exon shuffling with other gene families. Detailed analysis of various chordate genomes and ESTs indicated that the H2d duplication took place in the vertebrate lineage prior to the two rounds of whole genome duplication characterising the origin of this taxon. It thus provides an example of a novel vertebrate characteristic, which may have been fixed early in the gnathostome lineage. It also revealed that the vertebrate-specific H2 duplication originated from a two-step tandem exon duplication. Careful inspection of the intron phases leads us to put forward an original scenario that involves the selection of a new splice donor site, resulting in the formation of a "cassette" H2d exon. We show here that, like EBF, Col binds to DNA as a homodimer and that the ancestral H1LH2 motif mediates formation of Col/EBF homodimers and heterodimers. Incorporation of H2d in all four mammalian EBF proteins reveals an interesting example of compulsory counter-selection of alternative splicing following exon duplication.
Search for COE/EBF related sequences in genomic and EST databases
Systematic searches for COE/EBF proteins were conducted in available databases using the BLAST algorithm and mouse COE sequences (Additional file 1) as query. The databases analysed included current versions of the genomes of Monosiga brevicollis, Nematostella vectensis, Capitella capitata, Lottia gigantea, Branchiostoma floridae and Strongylocentrotus purpuratus, as well as the Ensembl databases of predicted proteins of Ciona intestinalis, Homo sapiens, Mus musculus, Monodelphis domestica, Ornithorhynchus anatinus, Gallus gallus, Xenopus tropicalis and Danio rerio. In Petromyzon marinus, the genomic scaffolds containing COE/EBF coding sequences were retrieved from the pre-Ensembl genome version. Coding sequences were identified in these scaffolds with GeneWise and assembled using homologous sequences (>95% identity) identified from Lampetra fluviatilis ESTs as templates. A survey of ESTs annotated in the databanks for alternative transcript variants included the use of AceView. The sponge Amphimedon queenslandica COE sequence was taken from a previously published report.
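The >95% identity criterion used to select lamprey EST templates can be illustrated with a minimal sketch. It assumes that identity is computed over the gap-free columns of a pairwise alignment, which is one common definition and not necessarily the one applied here; the function names and the threshold handling are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences carry a residue.
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def is_usable_template(aligned_a, aligned_b, threshold=95.0):
    """True when identity is strictly above the threshold (assumed to be >95%)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

With this definition, two aligned 10-residue stretches differing at one position score 90% and would be rejected as templates.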
Molecular phylogenetic analysis
The alignment of COE/EBF protein sequences was obtained using MUSCLE and checked by hand in BioEdit. Only full-length sequences and unambiguously aligned segments were retained for the phylogenetic analysis (see Additional file 2). Neighbor-Joining (NJ), Maximum Likelihood (ML) and Bayesian (BI) phylogenetic reconstructions were conducted using the Mega3.1 software, PhyML and MrBayes 3.0, respectively. In each case, we used the JTT model of sequence evolution with invariant+gamma distributed rates. Bootstrap proportions (BP) were calculated by analysis of 1000 replicates for NJ and by the RELL method on the 2000 top-ranking trees for ML analyses. In the BI analysis, four chains were run for 2 million iterations with default heating parameters and sampled every 500 iterations; the first 2000 trees were discarded as burn-in.
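The sampling scheme of the Bayesian analysis implies a specific number of retained trees, which the following minimal sketch makes explicit. The function name is hypothetical, and the count ignores the initial state that some programs also write to the tree file.

```python
def mcmc_sampling_summary(iterations, sample_every, burnin_trees):
    """Return (trees sampled, trees kept, burn-in fraction) for one MCMC run."""
    sampled = iterations // sample_every
    kept = sampled - burnin_trees
    return sampled, kept, burnin_trees / sampled


# Values stated in the text: 2 million iterations, one sample per 500
# iterations, first 2000 trees discarded.
sampled, kept, fraction = mcmc_sampling_summary(2_000_000, 500, 2000)
print(sampled, kept, fraction)  # 4000 2000 0.5
```

Under these settings half of the sampled trees are discarded, leaving 2000 trees from which posterior probabilities are estimated.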
In vitro translation and Electrophoretic Mobility Shift Assays
The pEThBF1, pET15bHis-mEBF1 (a gift from J. Hagman) and pET17bHis Col plasmids and deletions therein were used for in vitro transcription/translation of EBF, EBF*, EBFΔH1, EBFΔH2, Col, Col*, ColΔH1, ColΔH2 and ColΔH2L. To generate internal deletions corresponding to the H1 and H2 helices, we used the four-oligonucleotide PCR method. In vitro transcription and translation using rabbit reticulocyte lysate was as described by the manufacturer (kit L1170, Promega). For each protein synthesised, the efficiency of translation was assessed by SDS-PAGE of parallel translation reactions performed in the presence of [35S]methionine. Electrophoretic mobility shift assays (EMSA) were performed under previously described conditions, using either a 125 bp DNA fragment containing mb-1 promoter sequences (from -250 to -115), which includes the EBF binding site 5'-AGACTCaaGGGAAT-3', or the PAL probe, which contains the palindromic site 5'-ATTCCCaaGGGAAT-3' ([1, 33] and data not shown). Competition experiments were performed using a 100× molar excess of 30 bp oligonucleotides containing either the wild-type 5'-CTAGAGAGAGACTCAA GG GAATTGTGGCCAGCCC-3' or the mutated 5'-CTAGAGAGAGACTCAA CC GAATTGTGGCCAGCCC-3' mb-1 recognition site, as previously described.
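The relationship between the two probes and the two competitor oligonucleotides can be checked with a short sketch. It assumes that each site consists of two 6-nucleotide half-sites separated by the lowercase "aa" spacer; this split is a reading of the notation, not something stated in the text, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def half_sites(site, half_len=6):
    """Split a site into left and right half-sites, dropping the spacer."""
    site = site.upper()
    return site[:half_len], site[-half_len:]


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


PAL = "ATTCCCaaGGGAAT"   # palindromic probe site, from the text
MB1 = "AGACTCaaGGGAAT"   # mb-1 promoter site, from the text

for name, site in (("PAL", PAL), ("mb-1", MB1)):
    left, right = half_sites(site)
    # A perfect palindrome has a left half equal to the reverse
    # complement of the right half (0 mismatches).
    print(name, mismatches(left, reverse_complement(right)))

# Competitor oligonucleotides from the text, with the spaces removed.
WT = "CTAGAGAGAGACTCAAGGGAATTGTGGCCAGCCC"
MUT = "CTAGAGAGAGACTCAACCGAATTGTGGCCAGCCC"
changed = [i for i, (x, y) in enumerate(zip(WT, MUT)) if x != y]
print(changed, WT.find(MB1.upper()))
```

Under this reading, the PAL half-sites match perfectly while the mb-1 left half-site deviates from the palindrome at three positions, and the GG to CC substitution of the mutated competitor falls on the first two nucleotides of the conserved right half-site.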
The P [col5cDNA]; col1 strain designated in Fig.S4 as col1 and UAS-col strains have been described in . The P [col5cDNA] transgene rescues the embryonic lethality but not the wing defects of col1 mutants. The UAS-Mm ebf and UAS-Mm ebf2 constructs were made by cloning the entire ebf/ebf2 open reading frame in the pUAST vector. Three independent lines were used for ectopic expression assays. All other stocks were obtained from the Bloomington Stock Center and described in Flybase .
The coe gene family
The metazoan ancestor COE DBD was built from multiple "unique" exons
The modified HLH motif of EBF proteins is a vertebrate innovation
A noticeable difference between EBF and Drosophila Col is the specific duplication of a short α-helical region in EBF (H2d-H2a tandem repetition, Fig. 2A,C). Because of remote sequence similarity, this H2 tandem repetition was originally proposed to constitute a dimerisation motif similar to that present in b-HLH proteins found in fungi, plants and metazoans [1, 3, 43]. The absence of H2d in Drosophila Col led us, however, to propose the existence of an alternative Helix-Linker-Helix (H1LH2) domain [2, 11]. Sequence comparison across a wide range of metazoans shows that the H1LH2 motif is an ancestral character (see Additional file 2). The predicted primary sequence of COE proteins in cephalochordates and urochordates, which are considered the closest living relatives of the vertebrate ancestor [12, 44, 45], suggested that the H2d-H2a duplication was a vertebrate-specific feature. To confirm a conclusion mostly based on EST analysis, we retrieved the intronic sequences located between the H1 and H2a coding exons in representatives of the major deuterostome groups outside vertebrates, C. intestinalis, B. floridae and S. purpuratus, and verified the absence of H2d-related coding sequence (see Additional file 3). In contrast, all the gnathostome sequences retrieved from the genomes analysed, including not only the four COE1-3 and EBF4 classes but also the unassigned zebrafish DrCOE sequence, exhibit the H2d addition, strongly suggesting that the H2a-H2d duplication predated the gnathostome radiation (see Additional file 2). In the lamprey P. marinus, we found no evidence for the presence of H2d in the Pmcoe-A locus, but a definitive conclusion could not be reached in this case because of sequence gaps between the H1 and H2a coding regions in the current genome version. In contrast, the presence of H2d could be unambiguously recognised, at the expected position, in the deduced PmCOE-B amino acid sequence (see Additional file 3).
Preliminary EST analyses of a closely related lamprey species, Lampetra fluviatilis, confirmed the presence of H2d in transcripts of the orthologous Lfcoe-B gene but also revealed alternatively spliced forms devoid of the duplicated H2 sequence (see Additional file 4). The presence of H2d in lamprey LfCOE-B may thus be subject to alternative splicing, while no indication of a similar process has been obtained thus far in gnathostomes. Taken together, these data indicate that the H2 duplication occurred early in the vertebrate lineage, prior to the split between gnathostomes and cyclostomes, in a single-copy ancestral gene from which all gnathostome and at least the lamprey coe-B genes are derived. They also suggest that this additional protein domain may have been fixed early in the gnathostome lineage.
The COE-HLH dimerisation motif revisited
A two-step evolutionary scenario for inclusion of H2d in vertebrate EBF
The COE family of transcription factors was first defined by the sequence similarity between rodent EBF/Olf-1 and Drosophila Col [1–3]. Cloning of a coe cDNA from the cnidarian N. vectensis and identification of coe sequences in another cnidarian, hydra magnipapillata and a poriferan, the sponge Amphimedon queenslandica [4, 5] strengthened the conclusion that coe genes are metazoan-specific genes. Our systematic blast-search for coe orthologs in DNA sequence databanks confirmed that coe genes are metazoan genes present at a single copy per genome, except for vertebrates. It further showed a remarkable degree of conservation of the coe genomic structure throughout metazoan evolution, except for one exon duplication in the vertebrate lineage.
The scattered structure of coe genes
All introns found in the cnidarian N. vectensis Nvcoe gene are also found, at the same position, in deuterostomes and in at least one of the protostomes examined, suggesting that this scattered organisation was already present in the metazoan ancestral coe gene. In the case of the DBD, which is both specific to COE proteins and conserved to the same degree over its entire length, a split structure of 7 exons was rather unexpected. Moreover, we could not find evidence for exon shuffling with other gene families, consistent with the conserved asymmetric intron phases, which leaves open the question of how this unique DNA-binding domain was assembled in the genome. The HCCC zinc finger structure proposed to be an essential feature of the EBF DNA-binding domain is itself encoded by two exons, already in the last common cnidarian/bilaterian ancestor (E5 and E6, Fig. 2B), suggesting a bipartite origin. Since exon E6 is symmetrical, it could be subject to regulated exon-skipping, allowing for the production of different protein isoforms with putatively different functions. Whereas there is some preliminary evidence for this, as a subclass of human EBF1 cDNAs may differ from the main class by the loss of exon E6 (AceView), the i5 intron has been lost in some protostomes, such as Drosophila melanogaster (Fig. 2B). Systematic genome sequencing programs should soon give access to the coe gene structure in many additional phyla, including sister clades of bilaterians. This offers the exciting prospect of deeper insight into the evolutionary roots of the coe gene family and its scattered genomic organisation.
The ancestral COE HLH motif revisited
Sequence similarity of the ancestral COE H1LH2 motif with the HLH motif of basic-HLH proteins has led to the classification of COE proteins as one distant subgroup in this superfamily of proteins, despite their distinctive DBD and additional protein domains [5, 51]. In vitro DNA binding assays show that the H1LH2 motif is required for binding of COE proteins to DNA as dimers. This conclusion differs from the initial report that the EBF dimerisation motif was H2d-H2a, a conclusion supported by the analysis of two different deletions in EBF. Indeed, an internal deletion of EBF removing amino acids 296 to 367 (EBFΔ296–367), namely H1 and part of the IPT domain, was reported to lower but not prevent dimer formation. Since we found that removal of H1 alone in either EBF or Col abolished dimer formation, one possibility is that the presence of the IPT domain interferes with the ability of the H2d-H2a repeat to mediate homophilic interactions. In support of this possibility, the H2d-H2a repeat, when taken out of its normal context, is able to promote formation of dimers, as shown by using a truncated nuclear hormone receptor lacking its own dimerisation domain. The high degree of sequence conservation of the COE IPT domain (see Additional file 3) suggests that this domain is subject to very stringent structural and functional constraints. Together, our results from DNA-binding assays and those previously reported further suggest that the positioning of the IPT and HLH domains in relation to one another is a critical aspect of COE dimer formation. Hagman et al., 1995 also reported that a modified EBF protein lacking amino acids 370 to 383 (EBFΔ370–383), i.e., part of H2d, leaving H2a intact (see Additional file 5), showed a drastically reduced level of binding to mb-1 DNA, suggesting that H2d was essential for forming EBF homodimers. Yet, the deletion of amino acids 370 to 383 removes not only part of H2d but also part of the linker separating H1 and H2 (see Additional file 5).
Our data suggest that it is removal of this linker, rather than of H2d itself, which prevents EBF dimer formation. The conservation of sequence and genomic structure of this Proline-rich linker throughout metazoan evolution (see Additional files 2 and 3) supports a key role in positioning H1 and H2 relative to each other and a contribution to the DNA-binding specificity of COE dimers. Efficient in vitro binding to DNA of Col dimers, Col/EBF heterodimers or dimers of EBF isoforms lacking either H2a (Fig. 3) or H2d indicates that the H2 duplication is not essential for EBF dimer formation. Nevertheless, inclusion of a duplicated helix 2 raises the interesting possibility that it could result in increased partnership flexibility and functional versatility of the vertebrate COE proteins. The observation that Col/EBF heterodimers form and/or bind to DNA more efficiently, at least in vitro, raises the speculative hypothesis that this could have been the initial force behind the selection of H2 exon inclusion.
Counter-selection of alternative splicing
Together, comparison of coe gene structures between echinoderms, cephalochordates, urochordates and a wide range of vertebrates, including cyclostomes, suggests that the duplication of H2 occurred in the vertebrate ancestor and resulted from an exon-duplication event. This is the only major change in the modular structure of COE proteins that appears to have been fixed throughout metazoan evolution. Exon duplication is one widely used mechanism for adding a coding region within an existing gene. Alternative splicing of duplicated exons has been postulated to favor protein diversification, since each exon can, in principle, evolve independently of the other [48, 49]. Recent genomic studies have suggested that 40–60% of human genes are alternatively spliced, and comparative analysis of close to 10,000 orthologous genes in human and mouse has shown that alternative splicing is frequently associated with recent exon creation and/or loss. However, other studies suggest that the contribution of gene duplication, followed by sequence divergence and alternative splicing, to the diversification of the protein repertoire could be substantially different. In the case of the vertebrate coe genes, alternative splicing was not selected by evolution following exon H2d duplication, since both H2 repeats are incorporated in the EBF proteins. Taking into account the splice frame rules, we put forward here an original two-step model to account for the inclusion of H2d in vertebrate COE proteins (Fig. 4). The first step in our model is a classical tandem duplication of an "ancestral" H2a coding exon. However, this exon was probably not symmetrical (see Fig. 2B) and, due to the splice frame rule, only the ancestral or the duplicated exon could be incorporated in the coding transcript without disrupting the open reading frame, a classical case of mandatory alternative splicing [48, 49].
We believe that inclusion of the duplicated exon occurred via the activation of a phase 0 splice donor site, 3' to H2 in the duplicated exon. This allowed the incorporation of H2d while preserving the open reading frame (see Fig. 4D). To our knowledge, such a two-step selection of a cassette exon has not yet been invoked for other proteins.
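The splice frame rule underlying this two-step model can be illustrated with a minimal sketch. The exon lengths (100 and 99 nucleotides) and the phase 0 upstream intron are hypothetical values chosen for illustration only; they are not taken from the coe genes, and the function names are likewise assumptions.

```python
def downstream_phase(upstream_phase, exon_length):
    """Phase (0, 1 or 2) of the intron following an exon of the given length."""
    return (upstream_phase + exon_length) % 3


def is_symmetrical(upstream_phase, exon_length):
    """A symmetrical exon is flanked by introns of identical phase."""
    return downstream_phase(upstream_phase, exon_length) == upstream_phase


def frame_preserved(upstream_phase, exon_lengths, expected_phase):
    """True if splicing the listed exons in order ends in the expected phase."""
    phase = upstream_phase
    for length in exon_lengths:
        phase = downstream_phase(phase, length)
    return phase == expected_phase


UPSTREAM = 0      # hypothetical phase of the intron preceding the H2 exon
ANCESTRAL = 100   # hypothetical length of the asymmetrical ancestral exon
EXPECTED = downstream_phase(UPSTREAM, ANCESTRAL)  # phase the next exon expects

# Step 1: tandem duplication. Either copy alone keeps the frame, but
# including both copies shifts it (mandatory alternative splicing).
print(frame_preserved(UPSTREAM, [ANCESTRAL], EXPECTED))             # True
print(frame_preserved(UPSTREAM, [ANCESTRAL, ANCESTRAL], EXPECTED))  # False

# Step 2: a new phase 0 donor site shortens the duplicated copy to a
# symmetrical exon, which can then be included together with the ancestral one.
TRIMMED = 99
print(is_symmetrical(UPSTREAM, TRIMMED))                            # True
print(frame_preserved(UPSTREAM, [TRIMMED, ANCESTRAL], EXPECTED))    # True
```

The sketch only tracks reading-frame bookkeeping; it says nothing about splice site strength or the sequence of the new donor site.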
While our data underline the conservation of COE protein structure throughout evolution, the molecular mechanisms underlying the cell-context dependence of COE regulatory targets remain unknown. For example, mouse EBF/COE1 or EBF2/COE2 can substitute for Col activity in UAS-Gal4 transgenic assays, using as a paradigm Col function in patterning of the wing, indicating that Col and EBF are able to regulate a similar set of genes in a tissue-dependent manner (see Additional file 6). So far, little insight has been obtained from systematic searches for direct protein interactors of EBF or Col [55, 56]. This remains a pre-eminent question in view of the evolutionary diversification of the biological functions of COE proteins revealed by mutant analyses in mouse, C. elegans and Drosophila [57–60]. Within this context, more extensive analysis of the genomic structure, expression and function of COE proteins in other phyla could be of primary interest.
Our systematic BLAST search for coe (collier/olf-1/ebf) orthologs in DNA sequence databanks confirmed that coe genes are metazoan genes present as a single copy per genome, except in vertebrates. It further showed a remarkable degree of conservation of the coe genomic structure throughout metazoan evolution, except for one exon duplication in the vertebrate lineage, leading to a modified dimerisation domain of structure H1LH2dH2a in vertebrates and H1LH2a in all other metazoans. Taking into account the splice frame rules, we put forward here an original two-step duplication model to account for H2d inclusion in vertebrate COE proteins. The vertebrate gene configuration is such that it remains possible to remove H2d through alternative splicing, by exon-skipping. However, the presence of both H2d and H2a in all gnathostome coe/ebf transcripts characterised to date indicates that, in this case, exon-skipping is highly counter-selected. While in vitro experiments indicate that the H2 duplication is not essential for binding of COE proteins to DNA as dimers, it raises the interesting possibility that it could result in increased partnership flexibility and functional versatility of the vertebrate COE proteins.
We are grateful to Patrick Wincker and Corinne Da Silva (Genoscope and UMR 8030) for help with lamprey EST sequencing. We also thank Julian Smith and Jean Deutsch for critical reading of early versions of the manuscript and Serge Plaza and members of our laboratory for discussion. This research was supported by CNRS and Ministère de la Recherche (ACI BCMS) and CNRG. S. Mella was supported by a doctoral fellowship from Ministère de la Recherche.
- Hagman J, Belanger C, Travis A, Turck CW, Grosschedl R: Cloning and functional characterization of early B-cell factor, a regulator of lymphocyte-specific gene expression. Genes Dev. 1993, 7: 760-73. 10.1101/gad.7.5.760.View ArticlePubMedGoogle Scholar
- Crozatier M, Valle D, Dubois L, Ibnsouda S, Vincent A: Collier, a novel regulator of Drosophila head development, is expressed in a single mitotic domain. Curr Biol. 1996, 6: 707-18. 10.1016/S0960-9822(09)00452-7.View ArticlePubMedGoogle Scholar
- Wang MM, Reed RR: Molecular cloning of the olfactory neuronal transcription factor Olf-1 by genetic selection in yeast. Nature. 1993, 364: 121-6. 10.1038/364121a0.View ArticlePubMedGoogle Scholar
- Pang K, Matus DQ, Martindale MQ: The ancestral role of COE genes may have been in chemoreception: evidence from the development of the sea anemone, Nematostella vectensis (Phylum Cnidaria; Class Anthozoa). Dev Genes Evol. 2004, 214: 134-8. 10.1007/s00427-004-0383-7.View ArticlePubMedGoogle Scholar
- Simionato E, Ledent V, Richards G, Thomas-Chollier M, Kerner P, Coornaert D, Degnan BM, Vervoort M: Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol Biol. 2007, 7: 33-10.1186/1471-2148-7-33.PubMed CentralView ArticlePubMedGoogle Scholar
- Garel S, Marin F, Mattei MG, Vesque C, Vincent A, Charnay P: Family of Ebf/Olf-1-related genes potentially involved in neuronal differentiation and regional specification in the central nervous system. Dev Dyn. 1997, 210: 191-205. 10.1002/(SICI)1097-0177(199711)210:3<191::AID-AJA1>3.0.CO;2-B.View ArticlePubMedGoogle Scholar
- Wang SS, Betz AG, Reed RR: Cloning of a novel Olf-1/EBF-like gene, O/E-4, by degenerate oligo-based direct selection. Mol Cell Neurosci. 2002, 20: 404-14. 10.1006/mcne.2002.1138.View ArticlePubMedGoogle Scholar
- SS Tsai, Wang RY, Reed RR: The characterization of the Olf-1/EBF-like HLH transcription factor family: implications in olfactory gene regulation and neuronal development. J Neurosci. 1997, 17: 4149-58.Google Scholar
- Dubois L, Bally-Cuif L, Crozatier M, Moreau J, Paquereau L, Vincent A: XCoe2, a transcription factor of the Col/Olf-1/EBF family involved in the specification of primary neurons in Xenopus. Curr Biol. 1998, 8: 199-209. 10.1016/S0960-9822(98)70084-3.View ArticlePubMedGoogle Scholar
- Prasad BC, Ye B, Zackhary R, Schrader K, Seydoux G, Reed RR: unc-3, a gene required for axonal guidance in Caenorhabditis elegans, encodes a member of the O/E family of transcription factors. Development. 1998, 125: 1561-8.PubMedGoogle Scholar
- Dubois L, Vincent A: The COE–Collier/Olf1/EBF–transcription factors: structural conservation and diversity of developmental functions. Mech Dev. 2001, 108: 3-12. 10.1016/S0925-4773(01)00486-5.View ArticlePubMedGoogle Scholar
- Mazet F, Masood S, Luke GN, Holland ND, Shimeld SM: Expression of AmphiCoe, an amphioxus COE/EBF gene, in the developing central nervous system and epidermal sensory neurons. Genesis. 2004, 38: 58-65. 10.1002/gene.20006.View ArticlePubMedGoogle Scholar
- Wang SS, Lewcock JW, Feinstein P, Mombaerts P, Reed RR: Genetic disruptions of O/E2 and O/E3 genes reveal involvement in olfactory receptor neuron projection. Development. 2004, 131: 1377-88. 10.1242/dev.01009.View ArticlePubMedGoogle Scholar
- Kim K, Colosimo ME, Yeung H, Sengupta P: The UNC-3 Olf/EBF protein represses alternate neuronal programs to specify chemosensory neuron identity. Dev Biol. 2005, 286: 136-48. 10.1016/j.ydbio.2005.07.024.View ArticlePubMedGoogle Scholar
- Lin H, Grosschedl R: Failure of B-cell differentiation in mice lacking the transcription factor EBF. Nature. 1995, 376: 263-7. 10.1038/376263a0.View ArticlePubMedGoogle Scholar
- Crozatier M, Ubeda JM, Vincent A, Meister M: Cellular immune response to parasitization in Drosophila requires the EBF orthologue collier. PLoS Biol. 2004, 2: E196-10.1371/journal.pbio.0020196.PubMed CentralView ArticlePubMedGoogle Scholar
- Hagman J, Gutch MJ, Lin H, Grosschedl R: EBF contains a novel zinc coordination motif and multiple dimerization and transcriptional activation domains. Embo J. 1995, 14: 2907-16.PubMed CentralPubMedGoogle Scholar
- Bork P, Doerks T, Springer TA, Snel B: Domains in plexins: links to integrins and transcription factors. Trends Biochem Sci. 1999, 24: 261-3. 10.1016/S0968-0004(99)01416-4.View ArticlePubMedGoogle Scholar
- Liberg D, Sigvardsson M, Akerblad P: The EBF/Olf/Collier family of transcription factors: regulators of differentiation in cells originating from all three embryonal germ layers. Mol Cell Biol. 2002, 22: 8389-97. 10.1128/MCB.22.24.8389-8397.2002.PubMed CentralView ArticlePubMedGoogle Scholar
- Dehal P, Boore JL: Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 2005, 3: e314-10.1371/journal.pbio.0030314.PubMed CentralView ArticlePubMedGoogle Scholar
- Eukaryotic Genomics. [http://genome.jgi-psf.org/]
- Sea Urchin Genome Project. [http://www.hgsc.bcm.tmc.edu/projects/seaurchin/]
- Ensembl Genomes. [http://www.ensembl.org/index.html]
- Pre!Ensembl lamprey. [http://pre.ensembl.org/Petromyzon_marinus/index.html]
- Birney E, Clamp M, Durbin R: GeneWise and Genomewise. Genome Res. 2004, 14: 988-95. 10.1101/gr.1865504.PubMed CentralView ArticlePubMedGoogle Scholar
- Thierry-Mieg D, Thierry-Mieg J: AceView: a comprehensive cDNA-supported gene and transcripts annotation. Genome Biol. 2006, 7 (Suppl 1): 1-14. 10.1186/gb-2006-7-s1-s12.View ArticlePubMedGoogle Scholar
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32: 1792-7. 10.1093/nar/gkh340.PubMed CentralView ArticlePubMedGoogle Scholar
- BioEdit. [http://www.mbio.ncsu.edu/BioEdit/bioedit.html]
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52: 696-704. 10.1080/10635150390235520.View ArticlePubMedGoogle Scholar
- Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19: 1572-4. 10.1093/bioinformatics/btg180.View ArticlePubMedGoogle Scholar
- Kishino H, Hasegawa M: Converting distance to time: application to human evolution. Methods Enzymol. 1990, 183: 550-70.View ArticlePubMedGoogle Scholar
- Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR: Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene. 1989, 77: 51-9. 10.1016/0378-1119(89)90358-2.View ArticlePubMedGoogle Scholar
- Travis A, Hagman J, Hwang L, Grosschedl R: Purification of early-B-cell factor and characterization of its DNA-binding specificity. Mol Cell Biol. 1993, 13: 3392-400.PubMed CentralView ArticlePubMedGoogle Scholar
- Crozatier M, Glise B, Vincent A: Connecting Hh, Dpp and EGF signalling in patterning of the Drosophila wing; the pivotal role of collier/knot in the AP organiser. Development. 2002, 129: 4261-9.PubMedGoogle Scholar
- FlyBase. [http://flybase.bio.indiana.edu/]
- Doolittle RF: The multiplicity of domains in proteins. Annu Rev Biochem. 1995, 64: 287-314. 10.1146/annurev.bi.64.070195.001443.
- Lynch M, Conery JS: The origins of genome complexity. Science. 2003, 302: 1401-4. 10.1126/science.1089370.
- Carmel L, Rogozin IB, Wolf YI, Koonin EV: Evolutionarily conserved genes preferentially accumulate introns. Genome Res. 2007, 17: 1045-50. 10.1101/gr.5978207.
- Carmel L, Wolf YI, Rogozin IB, Koonin EV: Three distinct modes of intron dynamics in the evolution of eukaryotes. Genome Res. 2007, 17: 1034-44. 10.1101/gr.6438607.
- Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV: Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol. 2003, 13: 1512-7. 10.1016/S0960-9822(03)00558-X.
- Patthy L: Intron-dependent evolution: preferred types of exons and introns. FEBS Lett. 1987, 214: 1-7. 10.1016/0014-5793(87)80002-9.
- Roy SW, Gilbert W: The evolution of spliceosomal introns: patterns, puzzles and progress. Nat Rev Genet. 2006, 7: 211-21.
- Massari ME, Murre C: Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms. Mol Cell Biol. 2000, 20: 429-40. 10.1128/MCB.20.2.429-440.2000.
- Schubert M, Holland ND, Escriva H, Holland LZ, Laudet V: Retinoic acid influences anteroposterior positioning of epidermal sensory neurons and their gene expression in a developing chordate (amphioxus). Proc Natl Acad Sci USA. 2004, 101: 10320-5. 10.1073/pnas.0403216101.
- Delsuc F, Brinkmann H, Chourrout D, Philippe H: Tunicates and not cephalochordates are the closest living relatives of vertebrates. Nature. 2006, 439: 965-8. 10.1038/nature04336.
- Hagman J, Travis A, Grosschedl R: A novel lineage-specific nuclear factor regulates mb-1 gene transcription at the early stages of B cell differentiation. Embo J. 1991, 10: 3409-17.
- Kudrycki K, Stein-Izsak C, Behn C, Grillo M, Akeson R, Margolis FL: Olf-1-binding site: characterization of an olfactory neuron-specific promoter motif. Mol Cell Biol. 1993, 13: 3002-14.
- Kondrashov FA, Koonin EV: Origin of alternative splicing by tandem exon duplication. Hum Mol Genet. 2001, 10: 2661-9. 10.1093/hmg/10.23.2661.
- Letunic I, Copley RR, Bork P: Common exon duplication in animals and its role in alternative splicing. Hum Mol Genet. 2002, 11: 1561-7. 10.1093/hmg/11.13.1561.
- Graveley BR: Alternative splicing: increasing diversity in the proteomic world. Trends Genet. 2001, 17: 100-7. 10.1016/S0168-9525(00)02176-4.
- Ledent V, Vervoort M: The basic helix-loop-helix protein family: comparative genomics and phylogenetic analysis. Genome Res. 2001, 11: 754-70. 10.1101/gr.177001.
- Modrek B, Lee CJ: Alternative splicing in the human, mouse and rat genomes is associated with an increased frequency of exon creation and/or loss. Nat Genet. 2003, 34: 177-80. 10.1038/ng1159.
- Talavera D, Vogel C, Orozco M, Teichmann SA, de la Cruz X: The (in)dependence of alternative splicing and gene duplication. PLoS Comput Biol. 2007, 3: e33. 10.1371/journal.pcbi.0030033.
- Brand AH, Perrimon N: Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development. 1993, 118: 401-15.
- Formstecher E, Aresta S, Collura V, Hamburger A, Meil A, Trehin A, Reverdy C, Betin V, Maire S, Brun C, et al: Protein interaction mapping: a Drosophila case study. Genome Res. 2005, 15: 376-84. 10.1101/gr.2659105.
- Tsai RY, Reed RR: Identification of DNA recognition sequences and protein interaction domains of the multiple-Zn-finger protein Roaz. Mol Cell Biol. 1998, 18: 6447-56.
- Kieslinger M, Folberth S, Dobreva G, Dorn T, Croci L, Erben R, Consalez GG, Grosschedl R: EBF2 regulates osteoblast-dependent differentiation of osteoclasts. Dev Cell. 2005, 9: 757-67. 10.1016/j.devcel.2005.10.009.
- Croci L, Chung SH, Masserdotti G, Gianola S, Bizzoca A, Gennarini G, Corradi A, Rossi F, Hawkes R, Consalez GG: A key role for the HLH transcription factor EBF2 (COE2, O/E-3) in Purkinje neuron migration and cerebellar cortical topography. Development. 2006, 133: 2719-29. 10.1242/dev.02437.
- Baumgardt M, Miguel-Aliaga I, Karlsson D, Ekman H, Thor S: Specification of neuronal identities by feedforward combinatorial coding. PLoS Biol. 2007, 5: e37. 10.1371/journal.pbio.0050037.
- Lagergren A, Mansson R, Zetterblad J, Smith E, Basta B, Bryder D, Akerblad P, Sigvardsson M: The Cxcl12, periostin, and Ccl9 genes are direct targets for early B-cell factor in OP-9 stroma cells. J Biol Chem. 2007, 282: 14454-62. 10.1074/jbc.M610263200.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.