Author: |
Monroe LK; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Truong DP; Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Miner JC; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Adikari SH; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Sasiene ZJ; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Fenimore PW; Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Alexandrov B; Theoretical Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Williams RF; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA., Nguyen HB; Bioscience Division, MS M888, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. |
Abstract: |
Conotoxins are toxic, disulfide-bond-rich peptides from cone snail venom that target a wide range of receptors and ion channels with multiple pathophysiological effects. Conotoxins have extraordinary potential as medical therapeutics for conditions that include cancer, microbial infections, epilepsy, autoimmune diseases, neurological conditions, and cardiovascular disorders. Despite the potential of these compounds for novel therapeutic development, identifying and characterizing the toxicities of conotoxins is difficult, costly, and time-consuming. Effective characterization requires a series of diverse, complex, and labor-intensive biological, toxicological, and analytical techniques. Recent attempts to predict biological toxins (e.g., conotoxins and animal venoms) using machine learning based solely on primary amino acid sequences have improved toxin identification, but these methods are limited by peptide conformational flexibility and the high frequency of cysteines in toxin sequences. Together, these properties result in an enumerable set of disulfide-bridged foldamers, that is, different conformations of the same primary amino acid sequence that differ in function and toxicity level. Consequently, a given peptide may be toxic when its cysteine residues form a particular disulfide-bond pattern, while alternative bonding patterns (isoforms) or its reduced form (free cysteines with no disulfide bridges) may have little or no toxicological effect. Similarly, the same disulfide-bond pattern may occur in other peptide sequences and produce different conformations that exhibit varying toxicities toward the same receptor or toward different receptors. We present here new features that, when combined with primary sequence features to train machine learning algorithms for conotoxin prediction, significantly increase prediction accuracy. |