To what extent can one gene (or a handful of them) affect a phylogeny? This paper suggests that even in very large data matrices the resolution of some branches can rely on tiny subsets of the data. The authors show this to be the case for several contentious nodes in plant, animal and fungal data matrices and suggest a framework for quantifying the phylogenetic signal in such difficult cases.
They also think that humans are more closely related to sponges than to ctenophores, which is cool.
Darwin, 9th of May, 10.00. Blueberry pie to compensate for the phylogeny topic.
Phylogenomic studies have resolved countless branches of the tree of life, but remain strongly contradictory on certain, contentious relationships. Here, we use a maximum likelihood framework to quantify the distribution of phylogenetic signal among genes and sites for 17 contentious branches and 6 well-established control branches in plant, animal and fungal phylogenomic data matrices. We find that resolution in some of these 17 branches rests on a single gene or a few sites, and that removal of a single gene in concatenation analyses or a single site from every gene in coalescence-based analyses diminishes support and can alter the inferred topology. These results suggest that tiny subsets of very large data matrices drive the resolution of specific internodes, providing a dissection of the distribution of support and observed incongruence in phylogenomic analyses. We submit that quantifying the distribution of phylogenetic signal in phylogenomic data is essential for evaluating whether branches, especially contentious ones, are truly resolved. Finally, we offer one detailed example of such an evaluation for the controversy regarding the earliest-branching metazoan phylum, for which examination of the distributions of gene-wise and site-wise phylogenetic signal across eight data matrices consistently supports ctenophores as the sister group to all other metazoans.
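For anyone who wants a feel for the gene-wise idea before the meeting, here is a minimal sketch. It assumes that the signal of a gene is measured as the difference in its log-likelihood under two competing topologies (T1 and T2), and that total support is the sum over genes. The function names and the gene log-likelihood values are hypothetical, made up for illustration, and not taken from the paper.

```python
def gene_wise_signal(lnl_t1, lnl_t2):
    """Per-gene log-likelihood difference (T1 minus T2).

    Positive values favour T1, negative values favour T2.
    Both arguments map gene name -> log-likelihood (assumed inputs).
    """
    return {gene: lnl_t1[gene] - lnl_t2[gene] for gene in lnl_t1}


def drop_strongest_gene(delta):
    """Return the gene with the largest absolute signal, the summed
    signal over all genes, and the summed signal without that gene."""
    total = sum(delta.values())
    top = max(delta, key=lambda gene: abs(delta[gene]))
    return top, total, total - delta[top]


# Hypothetical per-gene log-likelihoods under two competing topologies.
lnl_t1 = {"geneA": -1000.0, "geneB": -2000.0, "geneC": -1500.0}
lnl_t2 = {"geneA": -1030.0, "geneB": -1995.0, "geneC": -1496.0}

delta = gene_wise_signal(lnl_t1, lnl_t2)
top, total, without_top = drop_strongest_gene(delta)
print(delta)
print(top, total, without_top)
```

In this toy case one gene carries all of the support for T1, and removing it flips the sign of the summed signal, which is the kind of behaviour the abstract describes for some contentious branches.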