GenPR: Generative PageRank Framework for Semi-supervised Learning on Citation Graphs
Authors: | Mikhail Kamalov, Konstantin Avrachenkov |
---|---|
Contributors: | Network Engineering and Operations (NEO), Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria), Université Côte d'Azur (UCA) |
Language: | English |
Year of publication: | 2020 |
Subject: | Computer Science::Machine Learning; Theoretical computer science; Artificial neural network; Computer science; 02 engineering and technology; Semi-supervised learning; Computer Science::Digital Libraries; [INFO.INFO-SI] Computer Science [cs]/Social and Information Networks [cs.SI]; Generative model; PageRank; [STAT.ML] Statistics [stat]/Machine Learning [stat.ML]; 020204 information systems; [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; 0202 electrical engineering, electronic engineering, information engineering; Citation graph; Embedding; 020201 artificial intelligence & image processing; Adjacency matrix; Interpretability |
Source: | INL 2020 - 9th Conference on Artificial Intelligence and Natural Language, Oct 2020, Helsinki, Finland, pp. 158-165, ⟨10.1007/978-3-030-59082-6_12⟩. Communications in Computer and Information Science. ISBN: 9783030590819 |
DOI: | 10.1007/978-3-030-59082-6_12 |
Description: | Semi-Supervised Learning (SSL) on citation graph data sets is a rapidly growing area of research. However, recently proposed graph-based SSL algorithms use a default adjacency matrix with binary weights on edges (citations), which loses information about the similarity between nodes (papers). In this work, we therefore propose a framework that embeds PageRank SSL in a generative model. The framework jointly trains a latent-space representation of the nodes and performs label spreading through an adjacency matrix reweighted by node similarities in the latent space. We explain how a generative model can improve accuracy and reduce the number of iteration steps for PageRank SSL. Moreover, we show that our framework outperforms the best graph-based SSL algorithms on four public citation graph data sets and improves the interpretability of classification results. |
Database: | OpenAIRE |
External link: |
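
The description above mentions label spreading through an adjacency matrix reweighted by node similarities in a latent space. The following is a minimal sketch of that general idea, not the authors' implementation: the record does not give the reweighting formula or the generative model, so the Gaussian similarity kernel, the fixed embeddings `Z`, the function names `reweight_adjacency` and `pagerank_ssl`, and the parameters `scale` and `alpha` are all assumptions made for illustration. In the paper the latent representation is trained jointly with label spreading; here it is simply given as input.

```python
import numpy as np

def reweight_adjacency(A, Z, scale=1.0):
    """Scale each existing edge by a Gaussian similarity of its endpoints.

    Assumption: the kernel choice is illustrative, not taken from the paper.
    """
    sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return A * np.exp(-sq_dists / (2.0 * scale ** 2))

def pagerank_ssl(W, Y, alpha=0.85, n_iter=100, tol=1e-8):
    """Personalized-PageRank-style label spreading by power iteration.

    W: (n, n) non-negative edge weights; Y: (n, k) one-hot seed labels,
    with all-zero rows for unlabeled nodes.
    """
    deg = W.sum(axis=0)
    deg[deg == 0] = 1.0          # avoid division by zero for isolated nodes
    P = W / deg                  # column-normalised transition matrix
    F = Y.astype(float).copy()
    for _ in range(n_iter):
        F_next = alpha * (P @ F) + (1.0 - alpha) * Y
        converged = np.abs(F_next - F).max() < tol
        F = F_next
        if converged:
            break
    return F.argmax(axis=1), F

# Toy graph: two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

# Hypothetical 1-D embeddings: the two triangles sit far apart.
Z = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])

# One labeled node per class: node 0 is class 0, node 5 is class 1.
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

W = reweight_adjacency(A, Z, scale=0.5)
preds, F = pagerank_ssl(W, Y)
print(preds)
```

In this toy setting the reweighting shrinks the weight of the bridge edge 2-3 relative to the edges inside each triangle, so less label mass leaks between the two groups than with the binary matrix `A`.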