Precise detection of Acrs in prokaryotes using only six features

Author: Chuan Dong, Xin Wang, Dong-Kai Pu, Qing-Feng Wen, Cong Ma, Feng-Biao Guo, Zhi Zeng
Year of publication: 2020
Subject:
DOI: 10.1101/2020.05.23.112011
Description: Anti-CRISPR proteins (Acrs) can suppress the activity of CRISPR-Cas systems. Some viruses depend on Acrs to spread their genetic material into host genomes, which can promote species diversity. The identification and characterization of Acrs are therefore of vital importance. In this work we developed a random forest-based tool, AcrDetector, to identify Acrs at the whole-genome scale using only six features. In multi-round, 5-fold cross-validation (30 different random states), AcrDetector achieves a mean accuracy of 99.65%, a mean recall of 75.84%, a mean precision of 99.24% and a mean F1 score of 85.97%. To demonstrate that AcrDetector can identify real Acrs precisely at the whole-genome scale, we performed a cross-species validation, in which 71.43% of real Acrs were ranked in the top 10. We also applied AcrDetector to the latest data, where it accurately identified three Acrs that had previously been verified experimentally. A standalone version of AcrDetector is available at https://github.com/RiversDong/AcrDetector. Additionally, our results show that most Acrs were transferred into their host genomes recently rather than at an early stage.
Database: OpenAIRE
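
The description reports mean accuracy, recall, precision and F1 over multi-round, 5-fold cross-validation with 30 random states. The following is a minimal sketch of how such figures can be computed and averaged; it is not taken from AcrDetector. The function names, the plain (non-stratified) shuffling scheme, and the confusion counts in the usage lines are all illustrative assumptions.

```python
import random
from statistics import mean


def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def repeated_kfold_indices(n_samples, n_splits=5, random_states=range(30)):
    """Yield (state, fold, test_indices) for every round and fold.

    Assumption: each round reshuffles the samples with its own seed and
    deals them into n_splits folds; the record does not say whether the
    original splits were stratified.
    """
    for state in random_states:
        indices = list(range(n_samples))
        random.Random(state).shuffle(indices)
        for fold in range(n_splits):
            yield state, fold, indices[fold::n_splits]


def mean_metrics(per_fold):
    """Average each metric over a list of per-fold metric dicts."""
    return {key: mean(m[key] for m in per_fold) for key in per_fold[0]}


# Hypothetical counts for one imbalanced test fold (not data from the paper).
example = confusion_metrics(tp=75, fp=1, fn=25, tn=9899)
print(example)
print(sum(1 for _ in repeated_kfold_indices(100)))  # number of (round, fold) pairs
```

With strongly imbalanced classes, as in the hypothetical counts above, accuracy and precision can both be very high while recall stays noticeably lower, which is the same pattern as the reported figures. Note also that a mean F1 over folds need not equal the F1 computed from the mean precision and mean recall.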