Matrices inducing generalized metric on sequences

Authors: Eloi Araujo, Fábio V. Martinez, Carlos H.A. Higa, José Soares
Year of publication: 2023
Subject:
Source: Discrete Applied Mathematics. 332:135-154
ISSN: 0166-218X
DOI: 10.1016/j.dam.2023.02.011
Description: Sequence comparison is a basic task to capture similarities and differences between two or more sequences of symbols, with countless applications, for example in computational biology. An alignment is a way to compare sequences, where a given scoring function determines the degree of similarity between them. Many scoring functions are obtained from scoring matrices. However, not all scoring matrices induce scoring functions that are distances, since the scoring function is not necessarily a metric. In this work we establish necessary and sufficient conditions for scoring matrices to induce each one of the properties of a metric in weighted edit distances. For a subset of scoring matrices that induce normalized edit distances, we also characterize each class of scoring matrices inducing normalized edit distances. Furthermore, we define an extended edit distance, which takes into account a set of editing operations that transforms one sequence into another regardless of the existence of a usual corresponding alignment to represent them, describing a criterion to find a sequence of edit operations whose weight is minimum. Similarly, we determine the class of scoring matrices that induces extended edit distances for each of the properties of a metric.
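To make the question in the description concrete, the following is a minimal Python sketch, not the authors' method: it computes an alignment-based weighted edit distance from a scoring matrix with the standard dynamic program, then checks by brute force, on all short sequences over a small alphabet, which metric properties fail. The names `GAP`, `unit_cost`, `edit_distance` and `violated_properties` are hypothetical, and an empty result only means no violation was found up to `max_len`; it does not prove the matrix induces a metric, which is what the paper's necessary and sufficient conditions address.

```python
from itertools import product

GAP = "-"    # hypothetical symbol standing for a space (insertion or deletion)
EPS = 1e-12  # tolerance for floating-point comparisons


def unit_cost(alphabet):
    """Scoring matrix with cost 0 for a match and 1 for any other operation."""
    symbols = list(alphabet) + [GAP]
    return {(a, b): 0.0 if a == b else 1.0 for a in symbols for b in symbols}


def edit_distance(s, t, cost):
    """Minimum alignment weight of s and t under cost[(a, b)] (standard DP)."""
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + cost[(s[i - 1], GAP)]
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + cost[(GAP, t[j - 1])]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + cost[(s[i - 1], t[j - 1])],  # substitution or match
                d[i - 1][j] + cost[(s[i - 1], GAP)],           # deletion
                d[i][j - 1] + cost[(GAP, t[j - 1])],           # insertion
            )
    return d[m][n]


def violated_properties(cost, alphabet, max_len=3):
    """Return the metric properties violated on all sequences up to max_len."""
    seqs = ["".join(p) for k in range(max_len + 1)
            for p in product(alphabet, repeat=k)]
    dist = {(s, t): edit_distance(s, t, cost) for s in seqs for t in seqs}
    bad = set()
    for s, t in product(seqs, repeat=2):
        if dist[(s, t)] < -EPS:
            bad.add("non-negativity")
        if (abs(dist[(s, t)]) <= EPS) != (s == t):
            bad.add("identity")
        if abs(dist[(s, t)] - dist[(t, s)]) > EPS:
            bad.add("symmetry")
    for s, t, u in product(seqs, repeat=3):
        if dist[(s, u)] > dist[(s, t)] + dist[(t, u)] + EPS:
            bad.add("triangle inequality")
    return bad
```

For example, `violated_properties(unit_cost("ab"), "ab")` finds no violation for the unit-cost (Levenshtein) matrix, whereas raising only `cost[("b", "a")]` to 2 makes the induced function asymmetric.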
Comment: 40 pages, 2 figures
Database: OpenAIRE