Fast and scalable querying of eukaryotic linear motifs with gget elm.

Authors: Luebbert L; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States. Hoang C; California Institute of Technology, Pasadena, CA 91125, United States. Kumar M; Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany. Pachter L; Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, United States; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, United States.
Language: English
Source: Bioinformatics (Oxford, England) [Bioinformatics] 2024 Mar 04; Vol. 40 (3).
DOI: 10.1093/bioinformatics/btae095
Abstract: Motivation: Eukaryotic linear motifs (ELMs), or Short Linear Motifs, are protein interaction modules that play an essential role in cellular processes and signaling networks and are often involved in diseases like cancer. The ELM database is a collection of manually curated motif knowledge from scientific papers. It has become a crucial resource for investigating motif biology and recognizing candidate ELMs in novel amino acid sequences. Users can search amino acid sequences or UniProt Accessions on the ELM resource web interface. However, as with many web services, large-scale queries cannot be processed swiftly through the ELM web interface or API calls, which limits integration into protein function analysis pipelines.
Results: To allow swift, large-scale motif analyses on protein sequences using ELMs curated in the ELM database, we have extended the gget suite of Python and command line tools with a new module, gget elm, which efficiently finds candidate ELMs in user-submitted amino acid sequences and UniProt Accessions without relying on the ELM server. gget elm increases accessibility to the information stored in the ELM database and allows scalable searches for motif-mediated interaction sites in amino acid sequences.
Availability and Implementation: The manual and source code are available at https://github.com/pachterlab/gget.
(© The Author(s) 2024. Published by Oxford University Press.)
Database: MEDLINE
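
Note: The abstract describes locating candidate motifs in amino acid sequences locally, without calls to a remote server. Since linear motifs are commonly expressed as regular expressions, the general idea can be illustrated with a minimal standard-library sketch. This is not the gget elm implementation and does not use its API: the function name `scan_motifs`, the table `EXAMPLE_MOTIFS`, and both patterns are hypothetical and chosen only for illustration; they are not entries from the ELM database.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical motif table (assumption): names and patterns are illustrative
# only and are NOT copied from the ELM database.
EXAMPLE_MOTIFS: Dict[str, str] = {
    "EXAMPLE_CTERM": r"[ST].[ACVILF]$",  # toy C-terminal pattern
    "EXAMPLE_PxxP": r"P..P",             # toy proline-rich pattern
}


def scan_motifs(
    sequence: str, motifs: Dict[str, str] = EXAMPLE_MOTIFS
) -> List[Tuple[str, int, int, str]]:
    """Return (motif_name, start, end, matched_text) for every regex hit.

    Positions are 1-based and inclusive. A lookahead wrapper is used so that
    overlapping occurrences of the same motif are all reported.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?=({pattern}))", sequence):
            text = match.group(1)
            start = match.start() + 1          # convert to 1-based
            end = match.start() + len(text)    # inclusive end position
            hits.append((name, start, end, text))
    return hits


if __name__ == "__main__":
    # Made-up peptide used only to exercise the sketch.
    for hit in scan_motifs("MPAPPAPKLSQV"):
        print(hit)
```

Because the scan runs entirely on the local machine, it can be applied to many sequences in a loop without rate limits, which is the scalability argument the abstract makes. For real analyses, the gget elm module and its manual at https://github.com/pachterlab/gget should be consulted instead of this sketch.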