Description: |
Additional file 1: Fig. S1. IGV screenshot of three representative BAM files of Iso-Seq reads aligned to the reference genome, in which supplementary alignments are hidden. Fig. S2. The precision-recall plot of DeepVariant (DV)-based pipelines on Iso-Seq data (PacBio lrRNA-seq), for each dataset (Jurkat or WTC-11), and separated by variant types (indels or SNPs). Fig. S3. Relationship between the proportion of N-cigar (i.e., intron-containing) reads and Iso-Seq read coverage (WTC-11 dataset). Fig. S4. Precision-recall plots of DeepVariant (DV)-based pipelines for variant calling from Iso-Seq data (PacBio lrRNA-seq), according to the proportion of intron-containing (N-cigar) reads (point sizes). Fig. S5. The precision-recall plot when using both the pileup and full-alignment models of Clair3-based pipelines on Iso-Seq data (PacBio lrRNA-seq), for each dataset (Jurkat or WTC-11), and separated by variant types (indels or SNPs). Fig. S6. The precision-recall plot when using the pileup-only model of Clair3-based pipelines on Iso-Seq data (PacBio lrRNA-seq), for each dataset (Jurkat or WTC-11), and separated by variant types (indels or SNPs). Fig. S7. The precision-recall plot of the SNCR+NanoCaller pipeline on Iso-Seq data (PacBio lrRNA-seq), for each dataset (Jurkat or WTC-11), and separated by variant types (indels or SNPs). Fig. S8. Variant calling performance on Nanopore lrRNA-seq data. Fig. S9. Variant calling performance on Illumina RNA-seq data. Table S1. Number of true indels and SNPs covered by Iso-Seq data, in each read coverage range used in the mini-benchmark, for the Jurkat and WTC-11 datasets. Table S2. Performance measures (precision, recall, and F1 score) of the best tested pipelines (SNCR+flagCorrection+DeepVariant, Clair3-mix, and SNCR+GATK), for each dataset (Jurkat and WTC-11), separated by variant types (indels and SNPs), using different thresholds for minimum Iso-Seq read coverage (Min_coverage). Table S3. 
Number of true indels and SNPs covered by Nanopore lrRNA-seq data, in each read coverage range used in the mini-benchmark, for the WTC-11 dataset. Table S4. Number of true indels and SNPs covered by Illumina RNA-seq data, in each read coverage range used in the mini-benchmark, for the WTC-11 dataset. |