Squigulator: simulation of nanopore sequencing signal data with tunable noise parameters

Authors: Hasindu Gamaarachchi, James M. Ferguson, Hiruna Samarakoon, Kisaru Liyanage, Ira W. Deveson
Year of publication: 2023
DOI: 10.1101/2023.05.09.539953
Description: In silico simulation of next-generation sequencing data is a technique used widely in the genomics field. However, there is currently a lack of optimal tools for creating simulated data from ‘third-generation’ nanopore sequencing devices, which measure DNA or RNA molecules in the form of time-series current signal data. Here, we introduce Squigulator, a fast and simple tool for simulation of realistic nanopore signal data. Squigulator takes a reference genome, transcriptome or read sequences and generates corresponding raw nanopore signal data. This is compatible with basecalling software from Oxford Nanopore Technologies (ONT) and other third-party tools, thereby providing a useful substrate for testing, debugging, validation and optimisation of nanopore analysis methods. The user may generate noise-free ‘ideal’ data, realistic data with noise profiles emulating specific ONT protocols, or they may deterministically modify noise parameters and other variables to shape the data to their needs. To highlight its utility, we use Squigulator to model the degree to which different types of noise impact the accuracy of ONT basecalling and downstream variant detection, revealing new insights into the properties of ONT data. We provide Squigulator as an open-source tool for the nanopore community: https://github.com/hasindu2008/squigulator
Database: OpenAIRE