Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/107998
Title: Post-Processing of Phylogenetic Trees: On Islands, Clumps and (Non-)Effective Overlap
Authors: Silva, Ana Luísa de Almeida Serra Jorge
Advisors: Wilkinson, Mark
Pisani, Davide
Benton, Mike
Keywords: tree islands, phylogenetic trees, concatabominations
Issue Date: Jun-2022
Place of publication or event: University of Bristol
Abstract: Taxonomic instability in (multi)sets of phylogenetic trees is often caused by missing data, analytical artefacts and/or data incongruence due to homoplasy or loci with different evolutionary histories. This thesis focuses, primarily, on methods to subset and summarise heterogeneous (multi)sets of trees, and on an approach to mitigate the effects of non-effective overlap caused by non-random patterns of missing data. A generalised definition of tree islands to any tree-to-tree distance metric is provided, which allows these heterogeneous tree subsets to be easily identified from any tree distribution, and not just as a byproduct of heuristic parsimony tree searches. Expanding on earlier studies, partitioned-by-island, weighted- and rarefied-by-island-size consensus methods are proposed, and the effect of islands on topology-based taxonomic instability tests explored. An R package to extract islands from trees on the same leaf set, islandNeighbours, is described and applied to a Bayesian tree distribution. For trees on non-identical leaf sets, a new subsetting strategy based on tree-to-supertree distances, clumps of trees, is proposed and applied to multiple tree (multi)sets with the newly developed clumpy Python pipeline. An approach combining (gene-)tree jackknifing on matrix representation of splits with Concatabominations (a heuristic compatibility-based taxonomic instability test, Siu-Ting et al. 2015) is proposed to identify instances of non-effective overlap on a newly inferred caecilian Tree of Life, and also candidate loci for targeted taxon sampling with the aim of ameliorating taxonomic overlap. This approach is also compared to the mathematical gene sampling sufficiency approach. 
Lastly, a morphological dataset used to illustrate the presence and effects of islands, and the effects of focal tree choice on clumps, is thoroughly reanalysed, and an easily implementable tool for comparing branch support measures across trees with identical leaf sets is described and illustrated with trees inferred from a hypothetical dataset.
Description: Documents submitted for the recognition of foreign degrees and diplomas
URI: https://hdl.handle.net/10316/107998
Rights: openAccess
Appears in Collections: UC - Reconhecimento de graus e diplomas estrangeiros

Files in This Item:
File: Ana-Silva_Tese.pdf
Description: Documents submitted for the recognition of foreign degrees and diplomas
Size: 6.36 MB
Format: Adobe PDF

Page view(s): 90 (checked on Jul 17, 2024)
Download(s): 102 (checked on Jul 17, 2024)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.