Data and Process Quality Evaluation in a Textual Big Data Archiving System

Authors: Mariagrazia Fugini, Jacopo Finocchi
Year of publication: 2022
Subject:
Source: Journal on Computing and Cultural Heritage. 15:1-19
ISSN: 1556-4711, 1556-4673
DOI: 10.1145/3461015
Description: The article presents a textual Big Data analytics solution developed in a real setting as part of a high-capacity document digitization and storage system. Software based on machine learning techniques performs automated extraction and processing of textual content. The work focuses on evaluating performance and data confidence, and describes an approach to computing a set of indicators for textual data quality. It then presents experimental results.
Database: OpenAIRE