A Tensor-Based Markov Chain Model for Heterogeneous Information Network Collective Classification

Authors: Michael K. Ng, Qingyao Wu, Mingkui Tan, Chao Han, Jian Chen
Publication year: 2022
Subject:
Source: IEEE Transactions on Knowledge and Data Engineering. 34:4063-4076
ISSN: 2326-3865
1041-4347
Description: Heterogeneous information network (HIN) collective classification studies the problem of predicting labels for one type of node in a HIN that contains multiple types of nodes and multiple types of links among them. Previous studies have shown that exploiting the relative importance of links improves node classification performance, because connected nodes tend to have similar labels. Most existing approaches exploit the relative importance of links either by directly counting the number of connections among nodes or by learning the weight of each type of link from labeled data only. However, the former neglects how important each type of link is to the class labels, and the latter may lead to overfitting. We propose a tensor-based Markov chain (T-Mark) approach that automatically and simultaneously predicts the labels of unlabeled nodes and estimates the relative importance of the types of links that actually improve classification accuracy. Specifically, we build two tensor equations from the HIN and the features of nodes in both labeled and unlabeled data. We propose a Markov chain-based model and solve it with an iterative process to obtain the stationary distributions. Theoretical analyses of the existence and uniqueness of these probability distributions are given. Extensive experimental results demonstrate that T-Mark achieves superior performance in the comparisons and obtains reasonable relative importance of links.
Database: OpenAIRE