On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs

Authors:	Sang-Wook Kim, Masoud Reyhani Hamedani
Language:	English
Year of publication:	2020
Subject:	Theoretical computer science; Graph embedding; Computer science; feature representation learning; 02 engineering and technology; Similarity measure; lcsh:Technology; lcsh:Chemistry; Similarity (network science); node–pairs similarity; 020204 information systems; Node (computer science); 0202 electrical engineering, electronic engineering, information engineering; General Materials Science; Instrumentation; lcsh:QH301-705.5; Fluid Flow and Transfer Processes; graph embedding; lcsh:T; Process Chemistry and Technology; General Engineering; Link (geometry); lcsh:QC1-999; Computer Science Applications; Task (computing); Range (mathematics); lcsh:Biology (General); lcsh:QD1-999; lcsh:TA1-2040; Embedding; link-based similarity measures; 020201 artificial intelligence & image processing; lcsh:Engineering (General). Civil engineering (General); lcsh:Physics; MathematicsofComputing_DISCRETEMATHEMATICS
Source:	Applied Sciences, Volume 11, Issue 1, p. 162 (2021)
ISSN:	2076-3417
DOI:	10.3390/app11010162
Description:	One of the important tasks in a graph is to compute the similarity between two nodes; link-based similarity measures (in short, similarity measures) are well-known and conventional techniques for this task that exploit the relations between nodes (i.e., links) in the graph. Graph embedding methods (in short, embedding methods) convert nodes in a graph into vectors in a low-dimensional space while preserving social relations among nodes in the original graph. Instead of applying a similarity measure to the graph to compute the similarity between nodes a and b, we can take the proximity between the corresponding vectors of a and b, obtained by an embedding method, as the similarity between a and b. Although embedding methods have been analyzed in a wide range of machine learning tasks such as link prediction and node classification, they have not been investigated in terms of similarity computation of nodes. In this paper, we investigate both the effectiveness and the efficiency of embedding methods in the task of similarity computation of nodes by comparing them with those of similarity measures. To the best of our knowledge, this is the first work that examines the application of embedding methods in this special task. Based on the results of our extensive experiments with five well-known and publicly available datasets, we made the following observations about embedding methods: (1) they are less effective than similarity measures on all datasets but one, (2) they are less efficient than similarity measures on all datasets but one, (3) they have more parameters than similarity measures, which leads to a time-consuming parameter tuning process, and (4) increasing the number of dimensions does not necessarily improve their effectiveness in computing the similarity of nodes.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2f139ffb0e9a55c9a3d5e6c70fa24ce8
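
The description contrasts two ways of scoring a node pair (a, b): applying a link-based measure directly to the graph, or comparing the vectors an embedding method produced for a and b. The sketch below illustrates both on a toy example. It is not the paper's experimental setup: the Jaccard coefficient over neighbor sets and cosine proximity are stand-ins chosen here for illustration, and the graph, the vectors, and the function names are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Proximity of two embedding vectors (an assumed choice of proximity)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_u * norm_v)


def jaccard_similarity(graph, a, b):
    """A simple link-based measure: overlap of the two nodes' neighbor sets."""
    neighbors_a, neighbors_b = graph[a], graph[b]
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0  # two isolated nodes share no links
    return len(neighbors_a & neighbors_b) / len(union)


# Hypothetical undirected toy graph as an adjacency map.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

# Hypothetical 2-dimensional vectors, standing in for the output of an
# embedding method; a real method would learn these from the graph.
embeddings = {"a": [1.0, 0.0], "b": [1.0, 1.0]}

link_based_score = jaccard_similarity(graph, "a", "b")
embedding_score = cosine_similarity(embeddings["a"], embeddings["b"])
print(link_based_score, embedding_score)
```

The two scores are on different scales and need not agree; comparing how well each ranks node pairs, and at what cost, is the kind of question the paper examines.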