HGAT: Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification

Author:	Liqiang Nie, Tianchi Yang, Chuan Shi, Xiaoli Li, Houye Ji, Linmei Hu
Publication year:	2021
Subject:	Information propagation; Computer science; Graph neural networks; 02 engineering and technology; Semi-supervised learning; Machine learning; General Business, Management and Accounting; Computer Science Applications; ComputingMethodologies_PATTERNRECOGNITION; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Benchmark (computing); Graph (abstract data type); Labeled data; 020201 artificial intelligence & image processing; Heterogeneous information; Artificial intelligence; Information Systems
Source:	ACM Transactions on Information Systems. 39:1-29
ISSN:	1558-2868 1046-8188
DOI:	10.1145/3450352
Description:	Short text classification has been widely explored in news tagging to provide more efficient search strategies and more effective search results for information retrieval. However, most existing studies, concentrating on long text classification, deliver unsatisfactory performance on short texts due to the sparsity issue and the insufficiency of labeled data. In this article, we propose a novel heterogeneous graph neural network-based method for semi-supervised short text classification, taking full advantage of limited labeled data and large amounts of unlabeled data through information propagation along the graph. Specifically, we first present a flexible heterogeneous information network (HIN) framework for modeling short texts, which can integrate any type of additional information while capturing the relations among them to address semantic sparsity. Then, we propose Heterogeneous Graph Attention networks (HGAT) to embed the HIN for short text classification based on a dual-level attention mechanism, including node-level and type-level attentions. To efficiently classify newly arriving texts that do not previously exist in the HIN, we extend our model HGAT for inductive learning, avoiding re-training the model on the evolving HIN. Extensive experiments on single-/multi-label classification demonstrate that our proposed model HGAT significantly outperforms state-of-the-art methods across the benchmark datasets under both transductive and inductive learning.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::73ef2fa32a60418d952aa62682edb176 https://doi.org/10.1145/3450352