ADOPT: intrinsic protein disorder prediction through deep bidirectional transformers.

Authors: Redl I, Fisicaro C, Dutton O, Hoffmann F, Henderson L, Owens BMJ, Heberling M, Paci E, Tamiola K. Affiliations: all authors, Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK; Paci E also Department of Physics and Astronomy 'Augusto Righi', University of Bologna, 40127 Bologna, Italy.
Language: English
Source: NAR genomics and bioinformatics [NAR Genom Bioinform] 2023 May 01; Vol. 5 (2), pp. lqad041. Date of Electronic Publication: 2023 May 01 (Print Publication: 2023).
DOI: 10.1093/nargab/lqad041
Abstract: Intrinsically disordered proteins (IDPs) are important for a broad range of biological functions and are involved in many diseases. Understanding intrinsic disorder is key to developing compounds that target IDPs. Experimental characterization of IDPs is hindered by the very fact that they are highly dynamic, so computational methods that predict disorder from the amino acid sequence have been proposed. Here, we present ADOPT (Attention DisOrder PredicTor), a new predictor of protein disorder. ADOPT is composed of a self-supervised encoder and a supervised disorder predictor. The former is based on a deep bidirectional transformer, which extracts dense residue-level representations from Facebook's Evolutionary Scale Modeling library. The latter uses a database of nuclear magnetic resonance chemical shifts, constructed to ensure balanced amounts of disordered and ordered residues, as a training and a test dataset for protein disorder. ADOPT predicts whether a protein or a specific region is disordered with better performance than the best existing predictors and faster than most other proposed methods (a few seconds per sequence). We identify the features that are relevant for the prediction performance and show that good performance can already be achieved with fewer than 100 features. ADOPT is available as a stand-alone package at https://github.com/PeptoneLtd/ADOPT and as a web server at https://adopt.peptone.io/.
(© The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.)
Database: MEDLINE
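
Illustrative sketch (not part of the record): the abstract describes a two-stage design, in which an encoder produces one dense vector per residue and a supervised predictor maps those vectors to disorder. The Python below is a minimal sketch of that shape only, under loud assumptions. `embed_residues` is a hypothetical stand-in that uses fixed random vectors instead of the transformer, the ridge regression is a swapped-in simple regressor rather than the model used by ADOPT, and the training labels are synthetic rather than derived from NMR chemical shifts. None of the names below come from the ADOPT package.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 32  # hypothetical size; real transformer embeddings are much larger


def embed_residues(sequence, dim=EMBED_DIM, seed=0):
    """Stand-in encoder (NOT the transformer): returns shape (len(sequence), dim)."""
    rng = np.random.default_rng(seed)
    table = rng.normal(size=(len(AMINO_ACIDS), dim))  # one fixed vector per amino acid
    idx = np.array([AMINO_ACIDS.index(aa) for aa in sequence])
    emb = table[idx]
    # Crude context: average each residue with its immediate neighbours.
    padded = np.pad(emb, ((1, 1), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0


def add_bias(X):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_ridge(X, y, alpha=1e-2):
    """Closed-form ridge regression (assumed regressor, chosen for brevity)."""
    Xb = add_bias(X)
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict_disorder(sequence, weights):
    """Per-residue scores for one sequence, shape (len(sequence),)."""
    return add_bias(embed_residues(sequence)) @ weights


# Toy training set: arbitrary sequences with SYNTHETIC per-residue labels.
# These are not experimental disorder values; they only exercise the code path.
train_sequences = [
    "MKTAYIAKQRQISFVKSHFSRQ",
    "GSSGSEEKKPPQQDDNNTTS",
    "ACDEFGHIKLMNPQRSTVWY",
    "LLVVIIFFWWYYCCMMAAHH",
]
train_X = np.vstack([embed_residues(s) for s in train_sequences])
label_rng = np.random.default_rng(1)
hidden = label_rng.normal(size=EMBED_DIM)  # hidden linear rule generating the labels
train_y = train_X @ hidden + label_rng.normal(scale=0.05, size=train_X.shape[0])

weights = fit_ridge(train_X, train_y)
scores = predict_disorder("MKTAYIAKQR", weights)
print(scores.round(2))
```

In a real setting the stand-in encoder would be replaced by per-residue representations from a pretrained protein language model, and the synthetic labels by experimentally derived disorder values; the regression step would otherwise keep the same inputs and outputs.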