Author Name Disambiguation Using Graph Node Embedding Method
Authors: | Zhongmin Yan, Wenjing Zhang, Yongqing Zheng |
---|---|
Year of publication: | 2019 |
Subject: |
050101 languages & linguistics;
Information retrieval; Artificial neural network; Computer science; 05 social sciences; 02 engineering and technology; Partition (database); Graph; Email address; 0202 electrical engineering, electronic engineering, information engineering; Embedding; Graph (abstract data type); 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; Feature learning |
Source: | CSCWD |
DOI: | 10.1109/cscwd.2019.8791898 |
Description: | In the real world, name ambiguity arises mainly when many people share the same name or write their names in the same way, which often causes records of multiple persons with the same name to be aggregated erroneously. This name ambiguity problem degrades the performance of information retrieval in digital libraries, web search, etc. It is nontrivial to distinguish those name references, especially when very limited information about them is available. Most existing studies use features such as email addresses and frequent words. However, this information is not always available, either because of privacy constraints or because it is too expensive to obtain. In this paper, we use a graph node embedding approach to solve the author name disambiguation problem, where the graph is constructed using only collaborator relationships. Methodologically, the proposed method uses random walks and a graph node representation learning method to embed each node into a low-dimensional vector space. Finally, we solve the problem by partitioning the records associated with a name reference such that each partition contains the records pertaining to a unique real-world person. We evaluate our method on the real-world CiteSeerX dataset, and the experimental results demonstrate that the proposed method is significantly better than most existing name disambiguation methods working in a similar setting. |
Database: | OpenAIRE |
External link: |
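
The pipeline outlined in the description (collaborator graph, random walks, node vectors, partitioning of records) can be illustrated with a rough sketch. This is not the authors' implementation. The toy records, the record-to-coauthor graph layout, the function names, and the similarity threshold are all assumptions made for illustration. In particular, the representation learning step is replaced here by raw windowed co-occurrence counts over the walks, which are not low-dimensional; a real system would train a node representation model on the walks instead.

```python
import math
import random
from collections import Counter, defaultdict


def build_graph(records):
    """Link each record node to its coauthor nodes (undirected; assumed layout)."""
    graph = defaultdict(set)
    for record_id, coauthors in records.items():
        for name in coauthors:
            graph[record_id].add(name)
            graph[name].add(record_id)
    return graph


def random_walks(graph, num_walks=20, walk_length=10, seed=0):
    """Generate uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in sorted(graph):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(sorted(graph[walk[-1]])))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, window=2):
    """Stand-in for learned embeddings: count the nodes seen within a window."""
    vectors = defaultdict(Counter)
    for walk in walks:
        for i, node in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[node][walk[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def partition(records, vectors, threshold=0.3):
    """Merge records whose vectors are similar enough (union-find)."""
    parent = {r: r for r in records}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    ids = sorted(records)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)
    clusters = defaultdict(set)
    for r in ids:
        clusters[find(r)].add(r)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical records for one ambiguous name: record id -> coauthor names.
records = {
    "p1": ["A. Kumar", "B. Lee"],
    "p2": ["B. Lee", "C. Diaz"],
    "p3": ["X. Chen", "Y. Park"],
    "p4": ["Y. Park"],
}

graph = build_graph(records)
walks = random_walks(graph)
vectors = cooccurrence_vectors(walks)
clusters = partition(records, vectors)
print(clusters)
```

Records that share no collaborators, directly or through other records, end up with a cosine similarity of exactly zero and are never merged. The threshold only matters for records in the same connected part of the graph, and it would need tuning on real data.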