The Text Fragment Extraction Module of the Hybrid Intelligent Information System for Analysis of Judicial Practice of Arbitration Courts

Authors: Maria O. Taran, Georgiy I. Revunkov, Yuriy E. Gapanyuk
Year of publication: 2020
Subject:
Source: Advances in Neural Computation, Machine Learning, and Cognitive Research IV ISBN: 9783030605766
DOI: 10.1007/978-3-030-60577-3_28
Description: The architecture of a hybrid intelligent information system for the analysis of the judicial practice of arbitration courts is discussed. The structure of the consciousness and subconsciousness subsystems in the architecture of the proposed system is considered in detail. The text fragment extraction module plays a crucial role in the subconsciousness subsystem, and its principles of operation are examined in detail. The architecture of the deep neural network on which the module is based is proposed, and aspects of training this network are considered. Variants of text vectorization based on the tf-idf and fastText approaches are investigated; the vectorized texts are the input data for the proposed neural network. Experiments were conducted to determine the quality metrics for the proposed vectorization options. The experimental results show that the vectorization option based on tf-idf alone is superior to the combined option based on tf-idf and fastText. The developed text fragment extraction module makes it possible to implement the proposed system successfully.
Database: OpenAIRE
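
Note: The description above names tf-idf as the better-performing vectorization option but gives no implementation details. As a rough illustration only, the following is a minimal sketch of tf-idf vectorization using the Python standard library. It is not the authors' pipeline: the function name `tfidf_vectors`, the whitespace tokenization, the smoothed idf formula, and the three sample sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) for a list of plain-text documents.

    Hypothetical helper for illustration; tokenization is a simple
    lowercase whitespace split, which a real system would likely replace.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))

    # Smoothed idf (an assumed variant): terms that occur in every
    # document still receive a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocabulary}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        # Term frequency (relative count) multiplied by idf.
        vectors.append([counts[t] / total * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Made-up sample texts, not data from the paper.
docs = [
    "the court dismissed the claim",
    "the court upheld the claim",
    "the appeal was dismissed",
]
vocab, vecs = tfidf_vectors(docs)
```

Each document becomes a fixed-length vector over the shared vocabulary, which is the kind of input a neural network such as the one described could consume. Terms that occur in fewer documents (here, "upheld") receive a higher weight than terms shared across documents (here, "court").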