Improving convolutional neural network for text classification by recursive data pruning

Authors:	Edmond Y.M. Lo, Qi Li, Pengfei Li, Kezhi Mao
Year of publication:	2020
Subject:	0209 industrial biotechnology; 02 engineering and technology; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial neural network; Computer science; Cognitive Neuroscience; Pooling; Filter (signal processing); Machine learning; Convolutional neural network; Computer Science Applications; Discriminative model; Artificial Intelligence; Benchmark (computing); Feature (machine learning); Pruning (decision trees)
Source:	Neurocomputing. 414:143-152
ISSN:	0925-2312
DOI:	10.1016/j.neucom.2020.07.049
Description:	Despite the state-of-the-art performance of deep neural networks, shallow neural networks remain the choice in applications with limited computing and memory resources. The convolutional neural network (CNN), in particular the one-convolutional-layer CNN, is a widely used shallow neural network in natural language processing tasks such as text classification. However, it has been found that CNNs may misfit to task-irrelevant words in the dataset, which in turn leads to unsatisfactory performance. To alleviate this problem, an attention mechanism can be integrated into the CNN, but this consumes the already limited resources. In this paper, we propose to address the misfitting problem from a novel angle: pruning task-irrelevant words from the dataset. The proposed method evaluates the performance of each convolutional filter based on the discriminative power of the feature it generates at the pooling layer, and prunes the words captured by the poorly performing filters. Experimental results show that the proposed model significantly outperforms the CNN baseline model. Moreover, the proposed model achieves performance similar to or better than the benchmark models (attention-integrated CNNs) while requiring fewer parameters and FLOPs, and is therefore a suitable model for resource-limited scenarios such as mobile applications.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::fedad8e0aa1eefb98e1c0f94fe2ddbf0 https://doi.org/10.1016/j.neucom.2020.07.049
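
The pruning idea summarized in the description (score each convolutional filter by how discriminative its pooled feature is, then remove the words captured by the weakest filters) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the scoring criterion, so a Fisher-style ratio of between-class to within-class variance is assumed here, and the names `filter_scores`, `words_to_prune`, and `keep_ratio`, as well as the toy data, are hypothetical. In practice the pooled features and the words selected by max-pooling would come from a trained one-layer CNN, and the train, score, prune cycle would presumably be repeated.

```python
import numpy as np


def filter_scores(pooled, labels):
    """Assumed criterion: between-class over within-class variance per filter.

    pooled: array of shape (n_documents, n_filters) with max-pooled features.
    labels: array of shape (n_documents,) with class labels.
    """
    overall_mean = pooled.mean(axis=0)
    between = np.zeros(pooled.shape[1])
    within = np.zeros(pooled.shape[1])
    for c in np.unique(labels):
        rows = pooled[labels == c]
        class_mean = rows.mean(axis=0)
        between += len(rows) * (class_mean - overall_mean) ** 2
        within += ((rows - class_mean) ** 2).sum(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


def words_to_prune(argmax_words, scores, keep_ratio=0.5):
    """Return the words that max-pooling selected for the lowest-scoring filters.

    argmax_words[d][f] is the word (or n-gram) that filter f responded to
    most strongly in document d.
    """
    n_filters = len(scores)
    n_drop = n_filters - int(round(keep_ratio * n_filters))
    poor_filters = np.argsort(scores)[:n_drop]
    return {doc[f] for doc in argmax_words for f in poor_filters}


# Toy data (hypothetical): filter 0 separates the classes, filter 1 does not.
pooled = np.array([[0.9, 0.5], [0.8, 0.4], [0.1, 0.5], [0.2, 0.4]])
labels = np.array([1, 1, 0, 0])
argmax_words = [["great", "the"], ["excellent", "of"],
                ["awful", "the"], ["boring", "a"]]

scores = filter_scores(pooled, labels)
pruned = words_to_prune(argmax_words, scores, keep_ratio=0.5)

# Remove the pruned words from tokenized documents before retraining.
docs = [["the", "film", "was", "great"], ["a", "boring", "plot"]]
cleaned = [[w for w in doc if w not in pruned] for doc in docs]
```

In this toy case the weakly discriminative filter only ever fires on function words, so those are the words removed from the documents; the content words picked up by the discriminative filter are left intact.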