Random words in free groups, non-crossing matchings and RNA secondary structures

Authors: Gadgil, Siddhartha; Krishnapur, Manjunath
Year of publication: 2020
Subject:
Document type: Working Paper
Description: Consider a random word $X^n=(X_1,\ldots ,X_n)$ in an alphabet consisting of $4$ letters, with the letters viewed either as $A$, $U$, $G$ and $C$ (i.e., nucleotides in an RNA sequence) or $\alpha$, $\bar{\alpha}$, $\beta$ and $\bar{\beta}$ (i.e., generators of the free group $\langle\alpha,\beta\rangle$ and their inverses). We show that the expected fraction $\rho(n)$ of unpaired bases in an optimal RNA secondary structure (with only Watson-Crick bonds and no pseudo-knots) converges to a constant $\lambda_2$ with $0<\lambda_2<1$ as $n\to\infty$. Thus, a positive proportion of the bases of a random RNA string do not form hydrogen bonds. We do not know the exact value of $\lambda_2$, but we derive upper and lower bounds for it. In terms of free groups, $\rho(n)$ is the ratio of two lengths of $X^n$: the length of the shortest word representing $X^n$ in the generating set consisting of conjugates of generators and their inverses, divided by the word length of $X^n$ with respect to the standard generators and their inverses. Thus, for a typical word, the word length in the (infinite) generating set consisting of the conjugates of standard generators grows linearly with the word length in the standard generators. In fact, we show that a similar result holds for all non-abelian finitely generated free groups $\langle\alpha_1,\dots,\alpha_k\rangle$, $k\geq 2$.
Comment: new results giving stationary distribution of a Markov chain from a greedy algorithm and corresponding explicit bounds; 19 pages
Database: arXiv
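
Illustration: the quantity $\rho(n)$ in the description can be explored numerically. The following is a minimal sketch, not the authors' method: it uses a standard cubic-time dynamic program over non-crossing matchings (in the style of the Nussinov algorithm) to find the minimum number of unpaired bases, then averages over uniformly random strings. It assumes that only A-U and G-C pairs are allowed, that adjacent bases may pair (no minimum loop length), and that letters are independent and uniform. The names `min_unpaired` and `estimate_rho` are hypothetical helpers introduced here.

```python
import random

# Watson-Crick pairs only (assumption taken from the description).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def min_unpaired(seq):
    """Minimum number of unpaired bases over all non-crossing matchings of seq."""
    n = len(seq)
    # best[i][j] = minimum number of unpaired bases in the substring seq[i:j]
    best = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Option 1: base i stays unpaired.
            cur = best[i + 1][j] + 1
            # Option 2: base i pairs with base k; the pair splits the
            # substring into an inside part and an outside part.
            for k in range(i + 1, j):
                if (seq[i], seq[k]) in PAIRS:
                    cur = min(cur, best[i + 1][k] + best[k + 1][j])
            best[i][j] = cur
    return best[0][n]


def estimate_rho(n, trials=20, seed=0):
    """Monte Carlo estimate of the expected fraction of unpaired bases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        seq = "".join(rng.choice("AUGC") for _ in range(n))
        total += min_unpaired(seq) / n
    return total / trials


if __name__ == "__main__":
    for n in (20, 40, 80):
        print(n, estimate_rho(n))
```

For example, `min_unpaired("AGCU")` is 0 because the pairs A-U and G-C nest, while `min_unpaired("AGUC")` is 2 because the same two pairs would cross. A simulation at small $n$ only suggests the trend; it does not replace the bounds on $\lambda_2$ derived in the paper.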