You can't suggest that?! Comparisons and improvements of speller error models.

Authors: Kaalep, Heiki-Jaan; Pirinen, Flammie; Moshagen, Sjur Nørstebø
Subject:
Source: Nordlyd; 2022, Vol. 46, Issue 1, pp. 125-139 (15 pages)
Abstract: In this article, we study the correction of spelling errors, specifically how spelling errors are made and how we can model them computationally in order to fix them. The article describes two different approaches to generating spelling correction suggestions for three Uralic languages: Estonian, North Sámi and South Sámi. The first approach to modelling spelling errors is rule-based: experts write rules that describe the kinds of errors that are made, and these rules are compiled into a finite-state automaton that models the errors. The second is data-driven: we show a machine learning algorithm a list of errors that humans have made, and it creates a neural network that can model the errors. Both approaches require collections of misspellings and an understanding of their contents; therefore, we also describe in detail the actual errors we have seen. We find that while both approaches produce error correction systems, with current resources the expert-built systems are still more reliable. [ABSTRACT FROM AUTHOR]
Database: Complementary Index