A survey of mapping algorithms in the long-reads era

Authors: Kristoffer Sahlin, Thomas Baudeau, Bastien Cazaux, Camille Marchet
Contributors: Department of Mathematics [Stockholm], Royal Institute of Technology [Stockholm] (KTH), Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), ANR-21-CE45-0034, INSSANE, Modélisation Structurale d'ARN Intégrant des Données de Séquençage (2021)
Year of publication: 2022
Subject:
Description: It has been ten years since the first publication of a method dedicated entirely to mapping third-generation sequencing long reads. The unprecedented characteristics of this new type of sequencing data created a shift, and methods moved on from the seed-and-extend framework previously used for short reads to a seed-and-chain framework due to the abundance of seeds in each read. As a result, the main novelties in proposed long-read mapping algorithms are typically based on alternative seed constructs or chaining formulations. Dozens of tools now exist, whose heuristics have evolved considerably over time. The rapid progress of the field, synchronized with frequent improvements in the data, makes the literature and implementations hard to keep up with. Therefore, in this survey article, we provide an overview of existing mapping methods for long reads with accessible insights into how they work. Since mapping is also strongly driven by the implementations themselves, we provide an original visualization tool (http://bcazaux.polytech-lille.net/Minimap2/) for understanding the parameter settings of the chaining step.
Database: OpenAIRE