Learning of Multimodal Representations With Random Walks on the Click Graph

Authors: Jun Song, Shuicheng Yan, Fei Wu, Yueting Zhuang, Yong Rui, Xinyan Lu, Zhongfei Mark Zhang
Year of publication: 2015
Subject:
Source: IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society. 25(2)
ISSN: 1941-0042
Description: In multimedia information retrieval, most classic approaches tend to represent different modalities of media in the same feature space. Given click data collected from users' search behavior, existing approaches take either one-to-one paired data (text–image pairs) or ranking examples (text–query–image and/or image–query–text ranking lists) as training examples. Neither makes full use of the click data, particularly the implicit connections among the data objects. In this paper, we treat the click data as a large click graph, in which vertices are images or text queries and edges indicate clicks between an image and a query. We consider learning a multimodal representation from the perspective of encoding the explicit and implicit relevance relationships between the vertices in the click graph. By minimizing both the truncated random walk loss and the distance between the learned representation of vertices and their corresponding deep neural network output, the proposed model, named multimodal random walk neural network (MRW-NN), can not only learn robust representations of the existing multimodal data in the click graph but also handle unseen queries and images to support cross-modal retrieval. We evaluate the latent representation learned by MRW-NN on Clickture, a public large-scale click log data set, and further show that MRW-NN achieves much better cross-modal retrieval performance on unseen queries and images than other state-of-the-art methods.
Database: OpenAIRE
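
Note: the click graph and truncated random walks mentioned in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation. The record does not specify how walks are sampled or how the loss is defined, so the sketch assumes click-count-weighted transitions and shows only the walk sampling step. The function names `build_click_graph` and `truncated_walk` and the toy click log are hypothetical.

```python
import random
from collections import defaultdict


def build_click_graph(click_log):
    """Build an undirected bipartite graph from (query, image, clicks) triples.

    Vertices are tagged ("q", text) or ("i", image_id); edge weights are
    accumulated click counts.
    """
    graph = defaultdict(dict)
    for query, image, clicks in click_log:
        q, i = ("q", query), ("i", image)
        graph[q][i] = graph[q].get(i, 0) + clicks
        graph[i][q] = graph[i].get(q, 0) + clicks
    return graph


def truncated_walk(graph, start, length, rng):
    """Sample a walk of at most `length` vertices starting at `start`.

    Assumption: each step picks a neighbor with probability proportional
    to the click count on the connecting edge.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Toy click log (made-up data, for illustration only).
click_log = [
    ("red car", "img1", 3),
    ("sports car", "img1", 1),
    ("sports car", "img2", 5),
]
graph = build_click_graph(click_log)
walk = truncated_walk(graph, ("q", "red car"), 5, random.Random(0))
print(walk)
```

In this toy graph, "red car" was never clicked together with img2, yet a walk starting at "red car" can reach img2 through img1 and "sports car". That is the kind of implicit connection the description refers to.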