
Origins of major archaeal clades correspond to gene acquisitions from bacteria
ABSTRACT The mechanisms that underlie the origin of major prokaryotic groups are poorly understood. In principle, the origin of both species and higher taxa among prokaryotes should entail
similar mechanisms—ecological interactions with the environment paired with natural genetic variation involving lineage-specific gene innovations and lineage-specific gene
acquisitions (refs 1–4). To investigate the origin of higher taxa in archaea, we have determined gene distributions and gene phylogenies for the 267,568 protein-coding genes of 134 sequenced
archaeal genomes in the context of their homologues from 1,847 reference bacterial genomes. Archaeal-specific gene families define 13 traditionally recognized archaeal higher taxa in our
sample. Here we report that the origins of these 13 groups unexpectedly correspond to 2,264 group-specific gene acquisitions from bacteria. Interdomain gene transfer is highly asymmetric:
transfers from bacteria to archaea are more than fivefold more frequent than vice versa. Gene transfers identified at major evolutionary transitions among prokaryotes specifically implicate
gene acquisitions for metabolic functions from bacteria as key innovations in the origin of higher archaeal taxa.
REFERENCES
* Doolittle, W. F. & Papke, R. T. Genomics and the bacterial species
problem. _Genome Biol._ 7, 116 (2006)
* Retchless, A. C. & Lawrence, J. G. Temporal fragmentation of speciation in bacteria. _Science_ 317, 1093–1096 (2007)
* Achtman, M. & Wagner, M. Microbial diversity and the genetic nature of microbial species. _Nature Rev. Microbiol._ 6, 431–440 (2008)
* Fraser, C., Alm, E. J., Polz, M. F., Spratt, B. G. & Hanage, W. P. The bacterial species challenge: making sense of genetic and ecological diversity. _Science_ 323, 741–746 (2009)
* Puigbò, P., Wolf, Y. I. & Koonin, E. V. The tree and net components of prokaryote genome evolution. _Genome Biol. Evol._ 2, 745–756 (2010)
* Dagan, T. Phylogenomic networks. _Trends Microbiol._ 19, 483–491 (2011)
* Hess, W. R. Genome analysis of marine photosynthetic microbes and their global role. _Curr. Opin. Biotechnol._ 15, 191–198 (2004)
* Kloesges, T. et al. Networks of gene sharing among 329 proteobacterial genomes reveal differences in lateral gene transfer frequency at different phylogenetic depths. _Mol. Biol. Evol._ 28, 1057–1074 (2011)
* Williams, D., Gogarten, J. P. & Papke, R. T. Quantifying homologous replacement of loci between haloarchaeal species. _Genome Biol. Evol._ 4, 1223–1244 (2012)
* Woese, C. R. Bacterial evolution. _Microbiol. Rev._ 51, 221–271 (1987)
* Rivera, M. C., Jain, R., Moore, J. E. & Lake, J. A. Genomic evidence for two functionally distinct gene classes. _Proc. Natl Acad. Sci. USA_ 95, 6239–6244 (1998)
* Puigbò, P., Wolf, Y. I. & Koonin, E. V. Search for a tree of life in the thicket of the phylogenetic forest. _J. Biol._ 8, 59 (2009)
* Brochier-Armanet, C., Forterre, P. & Gribaldo, S. Phylogeny and evolution of the Archaea: one hundred genomes later. _Curr. Opin. Microbiol._ 14, 274–281 (2011)
* Lake, J. A. & Rivera, M. C. Deriving the genomic tree of life in the presence of horizontal gene transfer: conditioned reconstruction. _Mol. Biol. Evol._ 21, 681–690 (2004)
* Enright, A. J., Van Dongen, S. & Ouzounis, C. A. An efficient algorithm for large-scale detection of protein families. _Nucleic Acids Res._ 30, 1575–1584 (2002)
* Wolf, Y. I., Makarova, K. S., Yutin, N. & Koonin, E. V. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer. _Biol. Direct_ 7, 46 (2012)
* Nelson-Sathi, S. et al. Acquisitions of 1,000 eubacterial genes physiologically transformed a methanogen at the origin of Haloarchaea. _Proc. Natl Acad. Sci. USA_ 109, 20537–20542 (2012)
* Bräsen, C., Esser, D., Rauch, B. & Siebers, B. Carbohydrate metabolism in Archaea: current insights into unusual enzymes and pathways and their regulation. _Microbiol. Mol. Biol. Rev._ 78, 89–175 (2014)
* Siebers, B. & Schönheit, P. Unusual pathways and enzymes of central carbohydrate metabolism in Archaea. _Curr. Opin. Microbiol._ 8, 695–705 (2005)
* Doolittle, W. F. & Bapteste, E. Pattern pluralism and the tree of life hypothesis. _Proc. Natl Acad. Sci. USA_ 104, 2043–2049 (2007)
* Creevey, C. J. et al. Does a tree-like phylogeny only exist at the tips in the tree of prokaryotes? _Proc. R. Soc. Lond. B_ 271, 2551–2558 (2004)
* Deppenmeier, U. et al. The genome of _Methanosarcina mazei_: evidence for lateral gene transfer between bacteria and archaea. _J. Mol. Microbiol. Biotechnol._ 4, 453–461 (2002)
* Williams, T. A., Foster, G. F., Cox, C. Y. & Embley, T. M. An archaeal origin of eukaryotes supports only two primary domains of life. _Nature_ 504, 231–236 (2013)
* McInerney, J. O., O’Connell, M. J. & Pisani, D. The hybrid nature of eukaryota and a consilient view of life on Earth. _Nature Rev. Microbiol._ 12, 449–455 (2014)
* Wolf, Y. I. & Koonin, E. V. Genome reduction as the dominant mode of evolution. _Bioessays_ 35, 829–837 (2013)
* Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. _Nucleic Acids Res._ 25, 3389–3402 (1997)
* Tatusov, R. L., Koonin, E. V. & Lipman, D. J. A genomic perspective on protein families. _Science_ 278, 631–637 (1997)
* Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European molecular biology open software suite. _Trends Genet._ 16, 276–277 (2000)
* Guindon, S. & Gascuel, O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. _Syst. Biol._ 52, 696–704 (2003)
* Stamatakis, A., Ludwig, T. & Meier, H. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. _Bioinformatics_ 21, 456–463 (2005)
ACKNOWLEDGEMENTS We gratefully acknowledge funding from European Research Council (ERC 232975 to W.F.M.), the graduate school E-Norm of the Heinrich-Heine
University (W.F.M.), the DFG (Scho 316/11-1 to P.S.; SI 642/10-1 to B.S.), and BMBF (0316188A, B.S.). G.L. is supported by an ERC grant (281357 to Tal Dagan); D.B. thanks the Alexander von
Humboldt Foundation for a Fellowship. Computational support of the Zentrum für Informations- und Medientechnologie (ZIM) at the Heinrich-Heine University is gratefully acknowledged. AUTHOR
INFORMATION AUTHORS AND AFFILIATIONS * Institute of Molecular Evolution, Heinrich-Heine University, 40225 Düsseldorf, Germany, Shijulal Nelson-Sathi, Filipa L. Sousa, Mayo Roettger, Nabor
Lozada-Chávez, Thorsten Thiergart & William F. Martin * Mathematisches Institut, Heinrich-Heine University, 40225 Düsseldorf, Germany, Arnold Janssen * Department of Mathematics and
Statistics, University of Otago, Dunedin 9054, New Zealand, David Bryant * Genomic Microbiology Group, Institute of Microbiology, Christian-Albrechts-Universität Kiel, 24118 Kiel, Germany,
Giddy Landan * Institut für Allgemeine Mikrobiologie, Christian-Albrechts-Universität Kiel, 24118 Kiel, Germany, Peter Schönheit * Faculty of Chemistry, Biofilm Centre, Molecular Enzyme
Technology and Biochemistry, University of Duisburg-Essen, 45117 Essen, Germany, Bettina Siebers * Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland,
James O. McInerney * Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, 2780-157 Oeiras, Portugal, William F. Martin
CONTRIBUTIONS S.N.-S., F.L.S., M.R., N.L.-C. and T.T. performed bioinformatic analyses; A.J.,
D.B. and G.L. performed statistical analyses; P.S., B.S., J.O.M. and W.F.M. interpreted results; S.N.-S., F.L.S., G.L., J.O.M. and W.F.M. wrote the paper; S.N.-S., G.L. and W.F.M. designed
the study. All authors discussed the results and commented on the manuscript. CORRESPONDING AUTHOR Correspondence to William F. Martin. ETHICS DECLARATIONS COMPETING INTERESTS The authors
declare no competing financial interests. EXTENDED DATA FIGURES AND TABLES EXTENDED DATA FIGURE 1 INTER-DOMAIN GENE SHARING NETWORK. Each cell in the matrix indicates the number of genes
(_e_-value ≤ 10⁻¹⁰ and ≥ 25% global identity) shared between 134 archaeal and 1,847 bacterial genomes in each pairwise inter-domain comparison (scale bar at lower right). Archaeal genomes
are listed as in Fig. 1. Bacterial genomes are presented in 23 groups corresponding to phylum or class in the GenBank nomenclature: _a_ = Clostridia; _b_ = Erysipelotrichi, Negativicutes;
_c_ = Bacilli; _d_ = Firmicutes; _e_ = Chlamydia; _f_ = Verrucomicrobia, Planctomycetes; _g_ = Spirochaetes; _h_ = Gemmatimonadetes, Synergistetes, Elusimicrobia, Dictyoglomi, Nitrospirae; _i_
= Actinobacteria; _j_ = Fibrobacter, Chlorobi; _k_ = Bacteroidetes; _l_ = Fusobacteria; Thermotogae, Aquificae, Chloroflexi; _m_ = Deinococcus-Thermus; _n_ = Cyanobacteria; _o_ =
Acidobacteria; δ, ε, α, β, γ = Delta, Epsilon, Alpha, Beta and Gamma proteobacteria; _P_ = Thermodesulfobacteria, Caldiserica, Chrysiogenetes, Ignavibacteria. Bacterial genome size in number of
proteins is indicated at the top. EXTENDED DATA FIGURE 2 PRESENCE–ABSENCE PATTERNS OF ARCHAEAL GENES WITH SPARSE DISTRIBUTION AMONG BACTERIA SAMPLED. Archaeal export families are sorted
according to the reference tree on the left. The figure shows the 391 cases of archaea-to-bacteria export (≥ 2 archaea and ≥ 2 bacteria from one phylum only) and 662 cases of bacterial
singleton trees (≥ 3 archaea, one bacterium). The 25,762 clusters were classified into the following categories (Supplementary Table 2): 16,983 archaeal specific, 3,315 imports, 391 exports,
662 cases of bacterial singletons with ≥ 3 archaea in the tree, 308 cases with three sequences (a bacterial singleton and 2 archaea) in the cluster, 4,074 trees in which archaea were
non-monophyletic, and 29 ambiguous cases among trees showing archaeal monophyly. The bacterial taxonomic distribution is shown in the lower panel. Gene identifiers and trees are given in
Supplementary Table 3. EXTENDED DATA FIGURE 3 COMPARISON OF SETS OF TREES FOR SINGLE-COPY GENES IN 11 ARCHAEAL GROUPS. Cumulative distribution functions for scores of tree compatibility with
the recipient data set. Values are _P_ values of the two-sided Kolmogorov–Smirnov (KS) two-sample goodness-of-fit test in the comparison of the recipient (blue) data sets against the
imports (green) data set and three synthetic data sets, one-LGT (red), two-LGT (pink) and random (cyan). A, Thermoproteales. B, Desulfurococcales. C, Sulfolobales. D, Thermococcales. E,
Methanobacteriales. F, Methanococcales. G, Thermoplasmatales. H, Archaeoglobales. I, Methanococcales. J, Methanosarcinales. K, Haloarchaea. EXTENDED DATA FIGURE 4 PRESENCE–ABSENCE PATTERNS
OF ALL ARCHAEAL NON-MONOPHYLETIC GENES. Archaeal families that did not generate monophyly for archaeal sequences in ML trees are plotted according to the reference tree on the left; the
distribution across bacterial genome groups is shown in the lower panel. These trees include 693 cases in which archaea showed non-monophyly by the misplacement of a single archaeal branch.
Gene identifiers and trees are given in Supplementary Tables 4 and 5. EXTENDED DATA FIGURE 5 SORTING BY BACTERIAL PRESENCE–ABSENCE PATTERNS FOR ARCHAEAL IMPORTS, EXPORTS AND ARCHAEAL
NON-MONOPHYLETIC FAMILIES. Archaeal families and their homologue distribution in 1,847 bacterial genomes are sorted by archaeal (top) and bacterial (bottom) gene distributions for direct
comparison. A–F, Distributions of archaeal imports sorted by archaeal groups (A) and by bacterial groups (B); distributions of archaeal exports sorted by archaeal groups (C) and by bacterial
groups (D); distributions of archaeal non-monophyletic gene families sorted by archaeal groups (E) and by bacterial groups (F). EXTENDED DATA FIGURE 6 TESTING FOR EVIDENCE OF HIGHER ORDER
ARCHAEAL RELATIONSHIPS USING A PERMUTATION TAIL PROBABILITY (PTP) TEST. Comparison of pairwise Euclidean distance distributions between archaeal real and conditional random gene family
patterns using the two-sided Kolmogorov–Smirnov (KS) two-sample goodness-of-fit test. A, Archaeal specific families: distribution of 2,471 archaeal specific families present in at least 2
and fewer than 11 groups (top); comparison between real data and 100 conditional random patterns generated by shuffling the entries within Crenarchaeota and Euryarchaeota separately;
comparison between real data and conditional random patterns generated by including others (Nanoarchaea, Thaumarchaea and Korarchaeota) into Crenarchaeota (mean _P_ = 0.0071, middle) or
into Euryarchaeota (mean _P_ = 0.02591, bottom). B, Archaeal import families: distribution of 989 archaeal import families present in at least 2 and fewer than 11 groups (top). Comparison
between real data and 100 conditional random patterns generated by shuffling the entries within Crenarchaeota and Euryarchaeota separately by including others (Nanoarchaea, Thaumarchaea and
Korarchaeota) into Crenarchaeota (mean _P_ = 0.0795, middle); comparison between real data and random patterns generated by including others (Nanoarchaea, Thaumarchaea and Korarchaeota) into
Euryarchaeota (mean _P_ = 0.0098, bottom). EXTENDED DATA FIGURE 7 ARCHAEAL SPECIFIC AND IMPORT GENE COUNTS ON A REFERENCE TREE. Numbers of archaeal specific and import families
corresponding to each node in the reference tree are shown in the order of ‘specific/imports’. Numbers at internal nodes indicate the number of archaeal-specific families and families with
bacterial homologues that correspond to the reference tree topology. Values at the far left indicate the number of archaeal-specific families and families with bacterial homologues that are
present in all archaeal groups. EXTENDED DATA FIGURE 8 NON TREE-LIKE STRUCTURE OF ARCHAEAL PROTEIN FAMILIES. Proportion of archaeal families whose distributions are congruent with the
reference tree and with all possible trees. Filled circles indicate the proportion of archaeal families that are congruent to the reference tree allowing no losses (with a single origin) and
different increments of losses allowed. Red, blue, green, magenta and black circles represent the proportion of families that can be explained using a single origin (849, 11.5%), single
origin plus 1 loss (22.4%), single origin plus 2 losses (15%), single origin plus 3 losses (13%) and single origin plus ≥ 4 losses (38%) respectively. Lines indicate the proportion of
families that can be explained by each of the 6,081,075 possible trees that preserve euryarchaeote and crenarchaeote monophyly. Note that on average, any given tree can explain 569 (8%) of
the archaeal families using a single origin event in the tree, and the best tree can explain only 1,180 families (16%). In the present data, 208,019 trees explain the gene distributions
better than the archaeal reference tree without loss events, underscoring the discordance between core gene phylogeny and gene distributions in the remainder of the genome. SUPPLEMENTARY
INFORMATION This file contains Supplementary Methods and Supplementary References. (PDF 728 kb) SUPPLEMENTARY DATA This file contains Supplementary Tables 1-8 and a
Supplementary Table Guide. (ZIP 32480 kb)
ABOUT THIS ARTICLE CITE THIS ARTICLE Nelson-Sathi, S., Sousa, F., Roettger, M. _et al._ Origins of major
archaeal clades correspond to gene acquisitions from bacteria. _Nature_ 517, 77–80 (2015). https://doi.org/10.1038/nature13805 * Received: 04 June 2014 * Accepted: 28
August 2014 * Published: 15 October 2014 * Issue Date: 01 January 2015 * DOI: https://doi.org/10.1038/nature13805
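
The cluster classification quoted in the legend of Extended Data Figure 2 can be checked arithmetically. The minimal Python sketch below tallies the seven category counts given there and confirms that they add up to the reported 25,762 clusters. The dictionary keys and the `tally` helper are illustrative names chosen for this sketch; they are assumptions, not identifiers from the paper or its supplementary tables.

```python
# Cluster counts as quoted in the Extended Data Figure 2 legend.
# Key names are shorthand invented for this sketch (an assumption).
CLUSTER_COUNTS = {
    "archaeal_specific": 16_983,
    "imports": 3_315,
    "exports": 391,
    "bacterial_singletons_3plus_archaea": 662,
    "three_sequence_clusters": 308,
    "archaea_non_monophyletic": 4_074,
    "ambiguous_monophyletic": 29,
}
REPORTED_TOTAL = 25_762  # total number of clusters stated in the legend


def tally(counts):
    """Return the summed count and each category's share of it in percent."""
    total = sum(counts.values())
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    return total, shares


total, shares = tally(CLUSTER_COUNTS)

# The seven categories should partition the full cluster set.
assert total == REPORTED_TOTAL

# Print categories from largest to smallest share.
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:36s} {CLUSTER_COUNTS[name]:>6,d}  {pct:5.1f}%")
```

The shares printed here are fractions of all 25,762 clusters. They are a different quantity from the 2,264 group-specific acquisitions reported in the abstract, which count only those imports that map to the origin of one of the 13 groups.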