Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots

Authors: Marchand, Bertrand; Will, Sebastian; Berkemer, Sarah J.; Bulteau, Laurent; Ponty, Yann
Contributors: Marchand, Bertrand, Décrypter les architectures complexes d'ARN par sondage et interactions - PaRNAssus2019 - ANR-19-CE45-0023 - AAPG2019 - VALID, Algorithms and Models for Integrative BIOlogy (AMIBIO), Laboratoire d'informatique de l'École polytechnique [Palaiseau] (LIX), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS)-École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Informatique Gaspard-Monge (LIGM), École des Ponts ParisTech (ENPC)-Centre National de la Recherche Scientifique (CNRS)-Université Gustave Eiffel
Language: English
Publication year: 2022
Subject:
Source: WABI 2022 - 22nd Workshop on Algorithms in Bioinformatics, Sep 2022, Potsdam, Germany
DOI: 10.4230/lipics.wabi.2022.7
Description: Despite being a textbook application of dynamic programming (DP) and a routine task in RNA structure analysis, RNA secondary structure prediction remains challenging whenever pseudoknots come into play. To circumvent the NP-hardness of energy minimization in realistic energy models, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. While these methods rely on hand-crafted DP schemes, we generalize and fully automate the design of DP pseudoknot prediction algorithms. We formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the tree-width tw of the fatgraph, and its output represents an 𝒪(n^{tw+1}) algorithm for predicting the MFE folding of an RNA of length n. Our general framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.
LIPIcs, Vol. 242, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022), pages 7:1-7:24
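Illustrative note: the "textbook application of dynamic programming" mentioned in the description can be sketched with the classic pseudoknot-free base-pair maximization recurrence (Nussinov-style), which runs in 𝒪(n^3) time. This is not the method of the paper, which targets pseudoknotted conformations through tree decompositions of fatgraphs; it only shows the kind of hand-written DP scheme that the paper aims to generate automatically. The names `PAIRS`, `can_pair`, `max_base_pairs` and the `min_loop` parameter are assumptions made for this sketch, and counting base pairs stands in for a realistic energy model.

```python
# Minimal sketch (assumption: base-pair counting replaces a real energy model).
# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if nucleotides a and b may form a base pair."""
    return (a, b) in PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    min_loop is the minimum number of unpaired positions enclosed by a pair
    (hypothetical parameter name; 3 is a common convention).
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = optimal score on the interval seq[i..j]; 0 for short intervals.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j is left unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval in two.
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + inner + 1)
            best[i][j] = score
    return best[0][n - 1]
```

For example, `max_base_pairs("GGGAAACCC")` finds the three nested G-C pairs of a simple hairpin. Because each subproblem is an interval, crossing pairs (pseudoknots) cannot be expressed in this scheme, which is the limitation the record's description refers to.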
Database: OpenAIRE