Showing 1 - 10 of 46 results for the search: '"KEMENTCHEDJHIEVA, YOVA"'
Text-to-image retrieval is a critical task for managing diverse visual content, but common benchmarks for the task rely on small, single-domain datasets that fail to capture real-world complexity. Pre-trained vision-language models tend to perform…
External link:
http://arxiv.org/abs/2412.00139
Author:
Romero, David, Lyu, Chenyang, Wibowo, Haryo Akbarianto, Lynn, Teresa, Hamed, Injy, Kishore, Aditya Nanda, Mandal, Aishik, Dragonetti, Alina, Abzaliev, Artem, Tonja, Atnafu Lambebo, Balcha, Bontu Fufa, Whitehouse, Chenxi, Salamea, Christian, Velasco, Dan John, Adelani, David Ifeoluwa, Meur, David Le, Villa-Cueva, Emilio, Koto, Fajri, Farooqui, Fauzan, Belcavello, Frederico, Batnasan, Ganzorig, Vallejo, Gisela, Caulfield, Grainne, Ivetta, Guido, Song, Haiyue, Ademtew, Henok Biadglign, Maina, Hernán, Lovenia, Holy, Azime, Israel Abebe, Cruz, Jan Christian Blaise, Gala, Jay, Geng, Jiahui, Ortiz-Barajas, Jesus-German, Baek, Jinheon, Dunstan, Jocelyn, Alemany, Laura Alonso, Nagasinghe, Kumaranage Ravindu Yasas, Benotti, Luciana, D'Haro, Luis Fernando, Viridiano, Marcelo, Estecha-Garitagoitia, Marcos, Cabrera, Maria Camila Buitrago, Rodríguez-Cantelar, Mario, Jouitteau, Mélanie, Mihaylov, Mihail, Imam, Mohamed Fazli Mohamed, Adilazuarda, Muhammad Farid, Gochoo, Munkhjargal, Otgonbold, Munkh-Erdene, Etori, Naome, Niyomugisha, Olivier, Silva, Paula Mónica, Chitale, Pranjal, Dabre, Raj, Chevi, Rendi, Zhang, Ruochen, Diandaru, Ryandito, Cahyawijaya, Samuel, Góngora, Santiago, Jeong, Soyeong, Purkayastha, Sukannya, Kuribayashi, Tatsuki, Clifford, Teresa, Jayakumar, Thanmay, Torrent, Tiago Timponi, Ehsan, Toqeer, Araujo, Vladimir, Kementchedjhieva, Yova, Burzo, Zara, Lim, Zheng Wei, Yong, Zheng Xin, Ignat, Oana, Nwatu, Joan, Mihalcea, Rada, Solorio, Thamar, Aji, Alham Fikri
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason about knowledge present in both visual and textual data. However, most of the current VQA…
External link:
http://arxiv.org/abs/2406.05967
Author:
Fierro, Constanza, Garneau, Nicolas, Bugliarello, Emanuele, Kementchedjhieva, Yova, Søgaard, Anders
Facts are subject to contingencies and can be true or false in different circumstances. One such contingency is time, wherein some facts mutate over a given period, e.g., the president of a country or the winner of a championship. Trustworthy language…
External link:
http://arxiv.org/abs/2404.03036
Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied…
External link:
http://arxiv.org/abs/2403.03627
The performance of Open-Domain Question Answering (ODQA) retrieval systems can exhibit sub-optimal behavior, providing text excerpts with varying degrees of irrelevance. Unfortunately, many existing ODQA datasets lack examples specifically targeting…
External link:
http://arxiv.org/abs/2403.01461
Author:
Cao, Yong, Kementchedjhieva, Yova, Cui, Ruixiang, Karamolegkou, Antonia, Zhou, Li, Dare, Megan, Donatelli, Lucia, Hershcovich, Daniel
Building upon the considerable advances in Large Language Models (LLMs), we are now equipped to address more sophisticated tasks demanding a nuanced understanding of cross-cultural contexts. A key example is recipe adaptation, which goes beyond…
External link:
http://arxiv.org/abs/2310.17353
Author:
Li, Jiaang, Karamolegkou, Antonia, Kementchedjhieva, Yova, Abdou, Mostafa, Lehmann, Sune, Søgaard, Anders
Large language models (LLMs) have complicated internal dynamics, but induce representations of words and phrases whose geometry we can study. Human language processing is also opaque, but neural response measurements can provide (noisy) recordings of…
External link:
http://arxiv.org/abs/2306.01930
Multi-label text classification (MLC) is a challenging task in settings of large label sets, where label support follows a Zipfian distribution. In this paper, we address this problem through retrieval augmentation, aiming to improve the sample efficiency…
External link:
http://arxiv.org/abs/2305.13058
Standard methods for multi-label text classification largely rely on encoder-only pre-trained language models, whereas encoder-decoder models have proven more effective in other classification tasks. In this study, we compare four methods for multi-label…
External link:
http://arxiv.org/abs/2305.05627
Large-scale pretrained language models (LMs) are said to "lack the ability to connect utterances to the world" (Bender and Koller, 2020), because they do not have "mental models of the world" (Mitchell and Krakauer, 2023). If so, one would expect…
External link:
http://arxiv.org/abs/2302.06555