From: Taxonomic distribution and origins of the extended LHC (light-harvesting complex) antenna protein superfamily

Phylogeny and predicted primary and secondary structures of two-helix SEPs. (A) Phylogenetic analysis of all 39 identified SEP sequences (except the partial SEP3/Lil3 from Pinus taeda) from twelve photosynthetic eukaryotes. The maximum likelihood tree was inferred from 32 amino acid positions. Bootstrap values are given only at nodes supported by a posterior probability of ≥0.50. Support was estimated with 10,000 bootstrap replicates in the neighbor-joining analysis with MEGA4 (pairwise distances estimated under the Dayhoff+Γ4 model), 100 bootstrap replicates in the maximum likelihood analysis with PhyML, and 3 million generations, with one third discarded as burn-in, in the Bayesian analysis with MrBayes to estimate posterior probabilities. The probabilistic methods (maximum likelihood and Bayesian) used the WAG+Γ4 model with four discrete gamma rate categories. The tree gives an overview of the diversity of SEP sequences within all major lineages of Plantae. Blue indicates glaucophytes; red indicates red algae and algae with complex plastids; green indicates green algae and land plants. The analysis is compatible with the assumption of several ancient, paralogous groups within the SEP subfamily. SEPs from red algae, and SEP4 and SEP5 from land plants, are reported here for the first time. (B) Prediction of transmembrane (TM) alpha helices in SEP sequences from a glaucophyte, a land plant and a red alga. The first of the two predicted TM helices contains the CB motif. (C) Alignment of typical SEP sequences from all three major lineages of Plantae. The approximate positions of the predicted first and second TM helices are underlined with a green and a grey bar, respectively. Identical and similar amino acids are shown in white on black and grey backgrounds, respectively.

