ProtTox: Toxin identification from Protein Sequences

Authors: Naren Ramakrishnan, Mohammad Raihanul Islam, Debanjan Datta, Andrew S. Warren, Patrick Butler, Sathappan Muthiah
Language: English
Year of publication: 2020
Subject:
DOI: 10.1101/2020.04.18.048439
Description: Toxin classification of protein sequences is a challenging task with real-world applications in healthcare and synthetic biology. Because protein databases keep expanding and manual annotation is prohibitively expensive, automated machine-learning approaches are crucial. Such approaches must overcome the challenges of homology, multi-functionality, and structural diversity among proteins. We propose ProtTox, a novel deep-learning method that aims to address some of the shortcomings of previous approaches to classifying proteins as toxins or non-toxins. Our method achieves an F1-score of 0.812, about 5% higher than the closest-performing baseline.
Database: OpenAIRE