Interrogating Genomic-Scale Data to Resolve Recalcitrant Nodes in the Spider Tree of Life.

Authors: Kulkarni S; Department of Biological Sciences, The George Washington University, Washington, DC.; Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC., Kallal RJ; Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC., Wood H; Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC., Dimitrov D; Department of Natural History, University Museum of Bergen, University of Bergen, Bergen, Norway., Giribet G; Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA., Hormiga G; Department of Biological Sciences, The George Washington University, Washington, DC.
Language: English
Source: Molecular biology and evolution [Mol Biol Evol] 2021 Mar 09; Vol. 38 (3), pp. 891-903.
DOI: 10.1093/molbev/msaa251
Abstract: Genome-scale data sets are converging on robust, stable phylogenetic hypotheses for many lineages; however, some nodes have shown disagreement across classes of data. We use spiders (Araneae) as a system to identify the causes of incongruence in phylogenetic signal between three classes of data: exons (as in phylotranscriptomics), noncoding regions (included in ultraconserved elements [UCE] analyses), and a combination of both (as in UCE analyses). Gene orthologs, coded as amino acids and nucleotides (with and without third codon positions), were generated by querying published transcriptomes for UCEs, recovering 1,931 UCE loci (codingUCEs). We expected that congeners represented in the codingUCE and UCEs data would form clades in the presence of phylogenetic signal. Noncoding regions derived from UCE sequences were recovered to test the stability of relationships. Phylogenetic relationships resulting from all analyses were largely congruent. All nucleotide data sets from transcriptomes, UCEs, or a combination of both recovered similar topologies in contrast with results from transcriptomes analyzed as amino acids. Most relationships inferred from low-occupancy data sets, containing several hundreds of loci, were congruent across Araneae, as opposed to high-occupancy data matrices with fewer loci, which showed more variation. Furthermore, we found that low-occupancy data sets analyzed as nucleotides (as is typical of UCE data sets) can result in more congruent relationships than high-occupancy data sets analyzed as amino acids (as in phylotranscriptomics). Thus, omitting data, through amino acid translation or via retention of only high-occupancy loci, may have a deleterious effect in phylogenetic reconstruction.
(© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
Database: MEDLINE