Description: |
University of Technology Sydney. Faculty of Science. Recombination detection is a critical step when analysing viral sequencing data. When unaccounted for, recombination can mislead estimates of the evolutionary history of viruses and the relationships between them. A repertoire of recombination detection methods has been developed over the past two decades, but their ability to process increasingly large viral datasets is unclear. In this thesis, five recombination detection methods (PhiPack (Profile), 3SEQ, GENECONV, VSEARCH (UCHIME), and gmos) are evaluated to determine whether any are suitable for the analysis of bulk next-generation sequencing data. Analysis of datasets simulated across a wide range of mutation and recombination rates, together with three empirical datasets, revealed that the assessed recombination detection methods may be neither scalable nor robust enough for the analysis of bulk next-generation sequencing data. In particular, the most scalable methods, VSEARCH (UCHIME) and gmos, may not be suitable because of their respective technical limitations. Overall, no single recombination detection method is suited to the analysis of all types of viral sequencing data, and the critical trade-offs between the methods are outlined. Recombination detection remains a complex problem. Detection methods, particularly novel approaches, should be evaluated continually to identify those that are both scalable and robust enough to meet the need for rapid analysis of bulk viral sequencing data. |