Extracting named entities from Russian-language documents with different expressiveness of structure

Authors: Maria D. Averina, Olga A. Levanova
Language: English, Russian
Year of publication: 2023
Subject:
Source: Моделирование и анализ информационных систем, Vol 30, Iss 4, Pp 382-393 (2023)
Document type: article
ISSN: 1818-1015, 2313-5417
DOI: 10.18255/1818-1015-2023-4-382-393
Description: This work addresses named entity recognition in Russian-language texts using a CRF (conditional random field) model. Two datasets were considered: well-structured refinancing documents and semi-structured court records. The model was tested with various sets of text features and CRF parameters (optimization algorithms). Averaged over all entities, the best F-measure was 0.99 for the structured documents and 0.86 for the semi-structured ones.
Database: Directory of Open Access Journals
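
Note: the description mentions text features for a CRF model and an F-measure evaluation but does not list the features themselves. The following is only a minimal sketch of what per-token features and a token-level F-measure could look like. The feature names, the helper functions `token_features` and `f_measure`, and the sample sentence are all assumptions made for illustration and are not taken from the article.

```python
def token_features(tokens, i):
    """Build a feature dict for tokens[i].

    Illustrative feature set only (an assumption, not the article's):
    dicts of this shape are a common input format for CRF toolkits.
    """
    tok = tokens[i]
    feats = {
        "lower": tok.lower(),          # case-normalized word form
        "is_title": tok.istitle(),     # capitalized, e.g. a surname
        "is_upper": tok.isupper(),     # all caps, e.g. an abbreviation
        "is_digit": tok.isdigit(),     # numbers, e.g. amounts or dates
        "suffix3": tok[-3:].lower(),   # crude morphological cue
    }
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True            # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True            # end of sentence
    return feats


def f_measure(gold, pred, label):
    """Token-level F-measure (F1) for a single label."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample sentence and labels, for illustration only.
tokens = ["Иванов", "Иван", "обратился", "в", "суд"]
features = [token_features(tokens, i) for i in range(len(tokens))]
score = f_measure(["PER", "PER", "O", "O", "O"],
                  ["PER", "O", "O", "PER", "O"], "PER")
```

Averaging such per-label scores over all entity types gives a single figure comparable in kind to the averaged values quoted in the description, although the article's exact averaging scheme is not stated in this record.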