ARGs-OAP v2.0 with an expanded SARG database and Hidden Markov Models for enhancement characterization and quantification of antibiotic resistance genes in environmental metagenomes

Authors: Tong Zhang, Li-Guan Li, Xiao-Tao Jiang, Benli Chai, Ying Yang, James M. Tiedje, James R. Cole, Xiaole Yin
Publication year: 2018
Subject:
Source: Bioinformatics. 34:2263-2270
ISSN: 1367-4811
1367-4803
Description: Motivation: Much global attention has been paid to antibiotic resistance, in particular to monitoring its emergence, accumulation and dissemination. For rapid characterization and quantification of antibiotic resistance genes (ARGs) in metagenomic datasets, an online analysis pipeline, ARGs-OAP, has been developed. It includes a database termed Structured Antibiotic Resistance Genes (SARG), which has a hierarchical structure (ARG type, subtype, reference sequence). Results: The new release of the database, SARG version 2.0, contains sequences not only from the CARD and ARDB databases but also carefully selected and curated sequences from the latest protein collection of the NCBI-NR database, to keep up to date with the increasing number of deposited ARG sequences. SARG v2.0 contains three times as many sequences as the first version and demonstrates improved coverage of ARG detection in metagenomes from various environmental samples. In addition to annotating high-throughput raw reads using a similarity search strategy, ARGs-OAP v2.0 now provides model-based identification of assembled sequences using SARGfam, a high-quality collection of profile Hidden Markov Models (HMMs) for ARG subtypes. ARGs-OAP v2.0 also improves cell number quantification by offering the average coverage of essential single-copy marker genes as an option, in addition to the previous method based on the 16S rRNA gene. Availability and implementation: ARGs-OAP can be accessed through http://smile.hku.hk/SARGs. The database can be downloaded from the same site. Source code for this study can be downloaded from https://github.com/xiaole99/ARGs-OAP-v2.0. Supplementary information: Supplementary data are available at Bioinformatics online.
Database: OpenAIRE