Cluster-Gated Convolutional Neural Network for Short Text Classification
Authors: | Ziqi Lin, Haidong Zhang, Wancheng Ni, Meijing Zhao |
---|---|
Publication year: | 2019 |
Subject: |
0209 industrial biotechnology; Fuzzy clustering; Computer science; business.industry; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Context (language use); 02 engineering and technology; Machine learning; computer.software_genre; Convolutional neural network; Range (mathematics); ComputingMethodologies_PATTERNRECOGNITION; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); 020201 artificial intelligence & image processing; Artificial intelligence; Cluster analysis; business; computer; Word (computer architecture); Natural language |
Source: | CoNLL |
Description: | Text classification plays a crucial role in understanding natural language across a wide range of applications. Most existing approaches focus on long text classification (e.g., blogs, documents, paragraphs) and cannot easily be applied to short text because of its sparsity and lack of context. In this paper, we propose a new model called the cluster-gated convolutional neural network (CGCNN), which jointly explores word-level clustering and text classification in an end-to-end manner. Specifically, the proposed model first uses a bi-directional long short-term memory network to learn word representations. It then applies a soft clustering method to explore the semantic relation between these representations and the cluster centers, and applies a linear transformation to the text representations. Finally, it uses a cluster-dependent gated convolutional layer to further control the cluster-dependent feature flows. Experimental results on five commonly used datasets show that our model outperforms state-of-the-art models. |
Database: | OpenAIRE |
External link: |
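
The description above names two mechanisms, soft clustering of word representations against cluster centers and a cluster-dependent gated convolution, without giving their formulas. The following is a minimal illustrative sketch of how such a pair could fit together, not the authors' implementation. Everything specific in it is an assumption: memberships are computed as a softmax over negative squared distances, the gate is a sigmoid applied to a linear function of the memberships in each window (in the style of a gated linear unit), and the names `soft_assign` and `cluster_gated_conv`, the toy vectors, and the weights are all hypothetical. The word vectors stand in for BiLSTM outputs, which are not modeled here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_assign(word_vec, centers):
    # Assumed form of soft clustering: closer centers get higher membership.
    neg_sq_dist = [-sum((w - c) ** 2 for w, c in zip(word_vec, center))
                   for center in centers]
    return softmax(neg_sq_dist)

def cluster_gated_conv(word_vecs, memberships, w_feat, w_gate, width=2):
    # One filter of a 1D convolution whose output is scaled by a gate.
    # The gate depends only on the cluster memberships in the window
    # (an assumed reading of "cluster-dependent gated convolution").
    outputs = []
    for i in range(len(word_vecs) - width + 1):
        window = [x for vec in word_vecs[i:i + width] for x in vec]
        gate_in = [x for mem in memberships[i:i + width] for x in mem]
        outputs.append(dot(w_feat, window) * sigmoid(dot(w_gate, gate_in)))
    return outputs

# Toy data: three 2-dimensional word vectors and two cluster centers.
word_vecs = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]]
centers = [[0.0, 0.0], [1.0, 1.0]]
memberships = [soft_assign(v, centers) for v in word_vecs]

w_feat = [0.5, -0.5, 0.25, 0.25]  # length = width * vector dimension
w_gate = [1.0, -1.0, 1.0, -1.0]   # length = width * number of clusters
features = cluster_gated_conv(word_vecs, memberships, w_feat, w_gate)
```

In a trained model the cluster centers and both weight vectors would be learned jointly with the classifier, which is what the description's "end-to-end" refers to; here they are fixed by hand only to keep the sketch self-contained.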