Computer Vision System for Expressing Texture Using Sound-Symbolic Words
Author: Jinhwan Kwon, Takuya Kawashima, Koichi Yamagata, Maki Sakamoto, Wataru Shimoda
Language: English
Year of publication: 2021
Subjects: texture; sound-symbolic words; onomatopoeia; tactile sensation; computer vision; convolutional neural network; image databases; visual information processing; artificial intelligence; psychology
Source: Frontiers in Psychology, Vol. 12 (2021)
ISSN: 1664-1078
DOI: 10.3389/fpsyg.2021.654779
Description: The major goals of texture research in computer vision are to understand, model, and process texture, and ultimately to simulate human visual information processing using computer technologies. The field has witnessed remarkable advances in material recognition using deep convolutional neural networks (DCNNs), enabling applications such as self-driving cars, facial and gesture recognition, and automatic number plate recognition. However, it remains difficult for computer vision systems to "express" texture as humans do, because texture recognition has no single correct answer and is inherently ambiguous. In this paper, we develop a DCNN-based computer vision method that expresses the texture of materials using texture terms. To achieve this goal, we use Japanese texture terms that are "sound-symbolic" words, which can describe differences in texture sensation at a fine resolution and are known to have strong, systematic sensory-sound associations. Because the phonemes of Japanese sound-symbolic words characterize categories of texture sensation, we develop a method that generates the phonemes and structure of sound-symbolic words probabilistically corresponding to the input image. In our evaluation, the sound-symbolic words output by the system achieved an accuracy of about 80%.
Database: OpenAIRE
External link:
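The description mentions a model that generates phonemes probabilistically corresponding to an input image. The following is only a minimal toy sketch of that idea, not the authors' system: a linear "head" on stand-in DCNN features scores candidate morae (the two-character phoneme inventory here is hypothetical), and the top-scoring morae are assembled into a reduplicated Japanese-style sound-symbolic word (e.g. "sara-sara").

```python
import numpy as np

# Hypothetical inventory of morae (NOT taken from the paper).
PHONEMES = ["sa", "ra", "za", "fu", "mo", "ko"]

def softmax(x):
    # Numerically stable softmax over phoneme scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_word(features, W1, W2):
    """Score morae for two slots and emit a reduplicated word."""
    p1 = softmax(W1 @ features)            # distribution over first mora
    p2 = softmax(W2 @ features)            # distribution over second mora
    stem = PHONEMES[int(p1.argmax())] + PHONEMES[int(p2.argmax())]
    return stem + "-" + stem               # reduplicated form, e.g. "sara-sara"

rng = np.random.default_rng(0)
features = rng.normal(size=128)            # stand-in for DCNN image features
W1 = rng.normal(size=(len(PHONEMES), 128))  # untrained weights, illustration only
W2 = rng.normal(size=(len(PHONEMES), 128))
word = predict_word(features, W1, W2)
print(word)
```

In the actual system the weights would be learned from image-word data and the word structure would come from the sound-symbolic lexicon; this sketch only illustrates the "phoneme probabilities → assembled word" shape of the pipeline.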