Quick and effective approximation of in silico saturation mutagenesis experiments with first-order Taylor expansion

Authors: Alexander Sasse, Maria Chikina, Sara Mostafavi
Language: English
Publication year: 2024
Subject:
Source: iScience, Vol 27, Iss 9, Pp 110807- (2024)
Document type: article
ISSN: 2589-0042
DOI: 10.1016/j.isci.2024.110807
Description: Summary: To understand the decision process of genomic sequence-to-function models, explainable AI algorithms determine the importance of each nucleotide in a given input sequence to the model’s predictions and enable the discovery of cis-regulatory motifs for gene regulation. The most commonly applied method is in silico saturation mutagenesis (ISM) because its per-nucleotide importance scores can be intuitively understood as the computational counterpart to in vivo saturation mutagenesis experiments. While ISM is highly interpretable, it is computationally challenging to perform for many sequences, and it becomes prohibitive as the length of the input sequences and the size of the model grow. Here, we use the first-order Taylor approximation to approximate ISM values from the model’s gradient, which reduces the computational cost to a single forward pass for an input sequence. We show that the Taylor ISM (TISM) approximation is robust across different model ablations, random initializations, training parameters, and dataset sizes.
Database: Directory of Open Access Journals
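
Illustration: The idea summarized in the description, approximating ISM values from the model's gradient with a first-order Taylor expansion, can be sketched on a toy example. Everything below is an assumption made for illustration and is not taken from the article: the toy model (a tanh of a linear score over a one-hot sequence), its weights, the hand-written gradient, and the function names `ism` and `tism`. For a one-hot input, changing the reference base b0 at position l to base b changes the input by +1 in channel b and -1 in channel b0, so the first-order estimate of the change in the prediction is grad[l][b] - grad[l][b0]. A real sequence-to-function model would obtain the gradient by automatic differentiation rather than by hand.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows."""
    return [[1.0 if b == base else 0.0 for b in BASES] for base in seq]


def make_weights(length):
    """Hypothetical, deterministic toy weights (one per position and base)."""
    return [[0.05 * ((i * 4 + j) % 7 - 3) for j in range(4)] for i in range(length)]


def linear_score(x, w):
    return sum(w[i][j] * x[i][j] for i in range(len(x)) for j in range(4))


def model(x, w):
    """Toy nonlinear model: tanh of a linear score."""
    return math.tanh(linear_score(x, w))


def gradient(x, w):
    """Analytic gradient of the toy model with respect to the one-hot input."""
    scale = 1.0 - math.tanh(linear_score(x, w)) ** 2
    return [[scale * w[i][j] for j in range(4)] for i in range(len(x))]


def ism(seq, w):
    """Exact ISM: one model evaluation per single-base substitution."""
    f0 = model(one_hot(seq), w)
    scores = []
    for i in range(len(seq)):
        row = []
        for b in BASES:
            mutated = seq[:i] + b + seq[i + 1:]
            row.append(model(one_hot(mutated), w) - f0)
        scores.append(row)
    return scores


def tism(seq, w):
    """First-order Taylor estimate of ISM from one gradient evaluation."""
    g = gradient(one_hot(seq), w)
    return [
        [g[i][j] - g[i][BASES.index(seq[i])] for j in range(4)]
        for i in range(len(seq))
    ]


if __name__ == "__main__":
    seq = "ACGTAC"
    w = make_weights(len(seq))
    exact, approx = ism(seq, w), tism(seq, w)
    for i, base in enumerate(seq):
        for j, b in enumerate(BASES):
            print(f"{base}{i}->{b}: ISM={exact[i][j]:+.4f}  TISM={approx[i][j]:+.4f}")
```

In this sketch, exact ISM needs one model evaluation per substitution (three per position, plus the reference), whereas the Taylor estimate reuses a single gradient for all positions and bases. The two agree closely only when the model is nearly linear around the reference sequence, which is the case for this toy model by construction.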