Be Different to Be Better!

Author:	Pezzelle, S., Greco, C., Gandolfi, G., Gualdoni, E., Bernardi, R., Cohn, T., He, Y., Liu, Y.
Contributors:	Language and Computation (ILLC, FNWI/FGw), ILLC (FNWI)
Language:	English
Publication year:	2020
Subject:	business.industry; Computer science; 02 engineering and technology; 010501 environmental sciences; Machine learning; computer.software_genre; 01 natural sciences; Complementarity (physics); 0202 electrical engineering, electronic engineering, information engineering; Leverage (statistics); 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; 0105 earth and related environmental sciences
Source:	Findings of the Association for Computational Linguistics: EMNLP 2020, 16-20 November 2020, pp. 2751-2767
Description:	This paper introduces BD2BB, a novel language and vision benchmark that requires multimodal models to combine complementary information from the two modalities. Recently, impressive progress has been made in developing universal multimodal encoders suitable for virtually any language and vision task. However, current approaches often require them to combine redundant information provided by language and vision. Inspired by real-life communicative contexts, we propose a novel task where either modality is necessary but not sufficient to make a correct prediction. To do so, we first build a dataset of images and corresponding sentences provided by human participants. Second, we evaluate state-of-the-art models and compare their performance against human speakers. We show that, while the task is relatively easy for humans, the best-performing models struggle to achieve similar results.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::849395fbcc81a7abaaa24425f2265fea https://doi.org/10.18653/v1/2020.findings-emnlp.248