Description: |
Abstract: Environmental DNA (eDNA) is revolutionizing species monitoring in nature. Any eDNA approach relies on sufficient DNA sequence information to meet the demands of assay specificity and sensitivity. This information has most commonly come from short barcoding regions of the mitochondrial genome (mitogenome) and marker genes. The use of these limited regions for assay design has often resulted in substantial trade‐offs in assay performance. With increased accessibility of full mitogenome assemblies, the potential for designing more robust eDNA assays is considerably enhanced. However, this also poses a new challenge: effectively identifying suitable regions for assay design within considerably larger sequences. We present unikseq, a utility that uses words of length k (k‐mers) to identify, quickly and with low memory, unique regions in a reference sequence relative to tolerated (ingroup) and not‐tolerated (outgroup or non‐target) sequence sets; these regions can yield highly specific assays. We illustrate its application within an assay development workflow through use‐case examples for the design and validation of four quantitative real‐time polymerase chain reaction (qPCR)‐based assays selective for American bullfrog (Rana [Lithobates] catesbeiana), Burbot (Lota lota), Lake trout (Salvelinus namaycush), and Quillback rockfish (Sebastes maliger). The chosen target species vary in range, habitat, and degree of relatedness to their sympatric species, factors that consequently affect eDNA assay design difficulty. We demonstrate the effectiveness of unikseq through assay validation and characterization using DNA from voucher specimens, synthetic DNA, and, where possible, field samples, to verify the specificity and sensitivity of the newly designed assays. By facilitating whole mitogenome sequence comparison, unikseq substantially enhances the creation of high‐performing eDNA assays. 
With several adjustable parameters for specifying user requirements, unikseq can facilitate the identification of suitable regions for a broad range of applications requiring nucleotide sequence comparisons. |
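The k‐mer idea described in the abstract can be illustrated with a minimal Python sketch. This is not the unikseq implementation: the function names `kmers` and `unique_regions` and the parameters `k` and `min_len` are hypothetical, chosen only for illustration. The sketch also simplifies in three ways: it treats any k‐mer shared with the outgroup as disqualifying, it ignores the tolerated (ingroup) set, and it does not consider reverse complements.

```python
from typing import Iterable, List, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers (substrings of length k) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_regions(
    reference: str,
    outgroup: Iterable[str],
    k: int,
    min_len: int,
) -> List[Tuple[int, int]]:
    """Find half-open [start, end) intervals of the reference that are
    covered by consecutive k-mers absent from every outgroup sequence.

    Hypothetical helper for illustration only; not unikseq's algorithm.
    """
    # Pool the k-mers of all not-tolerated (outgroup) sequences.
    outgroup_kmers: Set[str] = set()
    for seq in outgroup:
        outgroup_kmers |= kmers(seq, k)

    reference = reference.upper()
    regions: List[Tuple[int, int]] = []
    run_start = None  # start of the current run of unique k-mers

    for i in range(len(reference) - k + 1):
        is_unique = reference[i:i + k] not in outgroup_kmers
        if is_unique and run_start is None:
            run_start = i
        elif not is_unique and run_start is not None:
            # The last unique k-mer started at i - 1 and spans k bases.
            end = i - 1 + k
            if end - run_start >= min_len:
                regions.append((run_start, end))
            run_start = None

    # Close a run that extends to the end of the reference.
    if run_start is not None and len(reference) - run_start >= min_len:
        regions.append((run_start, len(reference)))

    return regions
```

For example, `unique_regions("AAAACCCCGGGGTTTT", ["AAAACCCC", "GGTTTT"], k=4, min_len=6)` returns `[(5, 13)]`, the span `CCCGGGGT` whose 4‐mers all are absent from both outgroup sequences. A real workflow would read mitogenomes from FASTA files and would need to handle the ingroup set and both strands.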