Multi-modal Associative Storage and Retrieval Using Hopfield Auto-associative Memory Network

Authors:	Prasun Joshi, Vandana M. Ladwani, V. Ramasubramanian, Rachna Shriwas
Year of publication:	2019
Subject:	Computer science; business.industry; Pattern recognition; 02 engineering and technology; Content-addressable memory; Autoassociative memory; Hopfield network; 03 medical and health sciences; 0302 clinical medicine; Modal; Robustness (computer science); Learning rule; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; 030217 neurology & neurosurgery
Source:	Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation ISBN: 9783030304867 ICANN (1)
DOI:	10.1007/978-3-030-30487-4_5
Description:	Recently we presented text storage and retrieval in an auto-associative memory framework using the Hopfield neural network. This realized the ideal functionality of the Hopfield network as a content-addressable information retrieval system. In this paper, we extend this result to multi-modal patterns, namely images with text captions, and show that the Hopfield network can indeed store and retrieve such multi-modal patterns even in an auto-associative setting. Within this framework, we examine two central issues: (i) performance characterization, to show that the O(N) capacity of a Hopfield network of N neurons under the pseudo-inverse learning rule is retained in the multi-modal case, and (ii) the retrieval dynamics of the multi-modal pattern (i.e., image and caption together) under various types of queries, such as image+caption, image only, and caption only, in line with a typical multi-modal retrieval system, where the entire multi-modal pattern is expected to be retrieved even from a partial query pattern in any one of the modalities. We present results on these two issues for a large database of 7000+ captioned images and establish the practical scalability of both the storage capacity and the retrieval robustness of the Hopfield network for content-addressable retrieval of multi-modal patterns. We point to the potential of this work to extend to a broader definition of multi-modality, as in multi-media content, with modalities such as video (image sequence) synchronized with sub-title text, speech, music, and non-speech.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1f36548bc7ccc426496b82f64404770f https://doi.org/10.1007/978-3-030-30487-4_5
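
The description above mentions storage under the pseudo-inverse learning rule and retrieval of a full image+caption pattern from an image-only query. The following is a minimal sketch of that idea, not the authors' implementation. It assumes bipolar (+1/-1) patterns, uses random vectors as stand-ins for encoded images and captions (the record does not say how either is encoded), and uses synchronous updates. The function names `train_pseudo_inverse` and `recall` and all sizes are hypothetical.

```python
import numpy as np


def train_pseudo_inverse(patterns):
    """Build Hopfield weights with the pseudo-inverse (projection) rule.

    patterns: (P, N) array of +1/-1 values, one stored pattern per row.
    Returns an (N, N) matrix that projects onto the span of the patterns.
    """
    X = patterns.T.astype(float)      # columns are the stored patterns
    return X @ np.linalg.pinv(X)      # W = X X^+


def recall(W, query, max_iters=50):
    """Iterate s <- sign(W s) synchronously until the state stops changing."""
    state = query.astype(float)
    for _ in range(max_iters):
        new_state = np.where(W @ state >= 0, 1.0, -1.0)
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state.astype(int)


# Assumed toy sizes: each stored pattern is [image bits | caption bits].
rng = np.random.default_rng(0)
n_image, n_caption, n_patterns = 192, 64, 8
patterns = rng.choice([-1, 1], size=(n_patterns, n_image + n_caption))
W = train_pseudo_inverse(patterns)

# Image-only query: the caption part is unknown, so it is set to 0.
query = patterns[0].copy()
query[n_image:] = 0
retrieved = recall(W, query)
print("full pattern recovered:", np.array_equal(retrieved, patterns[0]))
```

Because the weight matrix is a projection onto the span of the stored patterns, every stored pattern is a fixed point of the update. How well partial queries converge depends on the number of stored patterns and on how much of the pattern the query covers, which this toy setup does not attempt to characterize.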