Robust Word Vectors: Context-Informed Embeddings for Noisy Texts.

Authors: Malykh, V., Khakhulin, T., Logacheva, V.
Subject:
Source: Journal of Mathematical Sciences; Jul 2023, Vol. 273, Issue 4, pp. 614-627 (14 pages)
Abstract: We suggest a new language-independent architecture of robust word vectors (RoVe). It is designed to alleviate the issue of typos and misspellings, common in almost any user-generated content, which hinder automatic text processing. Our model is morphologically motivated, which allows it to deal with unseen word forms in morphologically rich languages. We present the results on a number of natural language processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is robust to typos. [ABSTRACT FROM AUTHOR]
Database: Complementary Index