Estimating the imageability of words by mining visual characteristics from crawled image data

Author:	Daisuke Deguchi, Takatsugu Hirayama, Yasutomo Kawanishi, Hiroshi Murase, Marc A. Kastner, Frank Nack, Ichiro Ide
Contributors:	Algorithmic Data Science (IVI, FNWI)
Publication year:	2020
Subject:	Computer Networks and Communications; Computer science; business.industry; media_common.quotation_subject; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 020207 software engineering; 02 engineering and technology; computer.software_genre; Variety (linguistics); Semantics; Hardware and Architecture; Perception; 0202 electrical engineering, electronic engineering, information engineering; Media Technology; Key (cryptography); Artificial intelligence; Set (psychology); business; computer; Software; Natural language processing; Semantic gap; media_common
Source:	Multimedia Tools and Applications, 79(25-26). Springer Netherlands
ISSN:	1573-7721 1380-7501
DOI:	10.1007/s11042-019-08571-4
Description:	Natural Language Processing and multi-modal analyses are key elements in many applications. However, the semantic gap is a persistent problem, leading to unnatural results disconnected from the user’s perception. To understand semantics in multimedia applications, human perception needs to be taken into consideration. Imageability is an approach originating from psycholinguistics to quantify the human perception of words. Research shows a relationship between language usage and the imageability of words, making it useful for multimodal applications. However, the creation of imageability datasets is often manual and labor-intensive. In this paper, we propose a method that mines a variety of visual features from image data to estimate the imageability of words. The main assumption is a relationship between the imageability of concepts, human perception, and the contents of Web-crawled images. Using a set of low- and high-level visual features from Web-crawled images, a model is trained to predict imageability. The evaluations show that imageability can be predicted with both a sufficiently low error and a high correlation to the ground-truth annotations. The proposed method can be used to increase the corpus of imageability dictionaries.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4e6768b449687df60aa843b670ca3c50 https://doi.org/10.1007/s11042-019-08571-4
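
The description above says that a model is trained on visual features of Web-crawled images and evaluated by its error and its correlation with ground-truth annotations. The following is only a minimal sketch of that evaluation idea, not the authors' method: it assumes a single hypothetical feature per word (called `visual_consistency` here), a plain least-squares line as the model, and an assumed 1 to 7 imageability scale. All numbers are synthetic and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_absolute_error(pred, truth):
    """Average absolute difference between predictions and annotations."""
    return mean(abs(p - t) for p, t in zip(pred, truth))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# SYNTHETIC data (assumption): one hypothetical feature per word, e.g. how
# visually similar the crawled images for that word are to each other,
# paired with a made-up imageability score on an assumed 1-7 scale.
train_visual_consistency = [0.1, 0.3, 0.5, 0.7, 0.9]
train_imageability = [1.5, 2.5, 4.0, 5.0, 6.5]
test_visual_consistency = [0.2, 0.6, 0.8]
test_imageability = [2.0, 4.5, 5.5]

slope, intercept = fit_line(train_visual_consistency, train_imageability)
predicted = [slope * x + intercept for x in test_visual_consistency]

mae = mean_absolute_error(predicted, test_imageability)
correlation = pearson(predicted, test_imageability)
print(f"MAE: {mae:.3f}, Pearson r: {correlation:.3f}")
```

The abstract mentions a set of low- and high-level features, so a realistic model would take a feature vector per word rather than a single scalar; the single-feature line is used here only to keep the two evaluation measures easy to follow.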