Large-Scale Text Classification Using Scope-Based Convolutional Neural Network: A Deep Learning Approach

Author: Jiaying Wang, Yaxin Li, Jing Shan, Jinling Bao, Chuanyu Zong, Liang Zhao
Language: English
Year of publication: 2019
Subject:
Source: IEEE Access, Vol 7, Pp 171548-171558 (2019)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2955924
Description: Text classification is one of the most important and typical tasks in Natural Language Processing (NLP), with many applications. Recently, deep learning approaches have shown their advantages in solving text classification problems, and the Convolutional Neural Network (CNN) is one of the most successful models in the field. In this paper, we propose a novel deep learning approach for categorizing text documents using a scope-based convolutional neural network. Unlike a window-based CNN, a scope does not require the words that form a local feature to be contiguous, so it can represent deeper local information in text data. We propose a large-scale scope-based convolutional neural network (LSS-CNN), which is based on scope convolution, aggregation optimization, and a max pooling operation. With these techniques, we can gradually extract the most valuable local information from a text document. This paper also discusses how to effectively calculate the scope-based information and how to train in parallel on large-scale datasets. Extensive experiments have been conducted on real datasets to compare our model with several state-of-the-art approaches. The experimental results show that LSS-CNN achieves both effectiveness and good scalability on big text data.
Database: Directory of Open Access Journals
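
Note: The abstract contrasts window-based features (contiguous words) with scope-based features (words that need not be contiguous). The sketch below only illustrates that contrast on raw tokens; it is not the paper's LSS-CNN. The abstract does not define a scope precisely, so the definition used here (k words, in order, all falling within `scope` consecutive positions) is an assumption, and the function names `window_features` and `scope_features` are hypothetical.

```python
from itertools import combinations


def window_features(tokens, k):
    """Contiguous k-word windows, as a window-based CNN filter would see them."""
    return [tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def scope_features(tokens, scope, k):
    """Ordered k-word combinations whose words all lie within `scope`
    consecutive positions. Assumed definition, for illustration only."""
    features = []
    for start in range(len(tokens)):
        # Fix the first word at `start`, then pick the remaining k - 1 words
        # from the following positions that still fall inside the scope.
        rest = range(start + 1, min(start + scope, len(tokens)))
        for combo in combinations(rest, k - 1):
            features.append(tuple(tokens[i] for i in (start,) + combo))
    return features


tokens = "not a very good movie".split()
contiguous = window_features(tokens, 2)
scoped = scope_features(tokens, scope=4, k=2)
```

Under this assumed definition, the pair ("not", "good") appears in `scoped` but not in `contiguous`, which is the kind of non-contiguous local feature the abstract says a window cannot capture. The number of candidates grows quickly with `scope` and `k`, which suggests why the abstract mentions efficient calculation and aggregation.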