Showing 1 - 10 of 90 for search: '"Yogatama, Dani"'
Modern language models can process inputs across diverse languages and modalities. We hypothesize that models acquire this capability through learning a shared representation space across heterogeneous data types (e.g., different languages and modalities)…
External link:
http://arxiv.org/abs/2411.04986
While interpretability research has shed light on some internal algorithms used by transformer-based LLMs, reasoning in natural language, with its deep contextuality and ambiguity, defies easy categorization. As a result, formulating clear and…
External link:
http://arxiv.org/abs/2410.21353
The ability to locate an object in an image according to natural language instructions is crucial for many real-world applications. In this work we propose LocateBench, a high-quality benchmark dedicated to evaluating this ability. We experiment with…
External link:
http://arxiv.org/abs/2410.19808
Author:
Padlewski, Piotr, Bain, Max, Henderson, Matthew, Zhu, Zhongkai, Relan, Nishant, Pham, Hai, Ong, Donovan, Aleksiev, Kaloyan, Ormazabal, Aitor, Phua, Samuel, Yeo, Ethan, Lamprecht, Eugenie, Liu, Qi, Wang, Yuqi, Chen, Eric, Fu, Deyu, Li, Lei, Zheng, Che, d'Autume, Cyprien de Masson, Yogatama, Dani, Artetxe, Mikel, Tay, Yi
We introduce Vibe-Eval: a new open benchmark and framework for evaluating multimodal chat models. Vibe-Eval consists of 269 visual understanding prompts, including 100 of hard difficulty, complete with gold-standard responses authored by experts. Vibe-Eval…
External link:
http://arxiv.org/abs/2405.02287
Author:
Reka Team, Ormazabal, Aitor, Zheng, Che, d'Autume, Cyprien de Masson, Yogatama, Dani, Fu, Deyu, Ong, Donovan, Chen, Eric, Lamprecht, Eugenie, Pham, Hai, Ong, Isaac, Aleksiev, Kaloyan, Li, Lei, Henderson, Matthew, Bain, Max, Artetxe, Mikel, Relan, Nishant, Padlewski, Piotr, Liu, Qi, Chen, Ren, Phua, Samuel, Yang, Yazheng, Tay, Yi, Wang, Yuqi, Zhu, Zhongkai, Xie, Zhihui
We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka. Reka models are able to process and reason with text, images, video, and audio inputs. This technical report discusses details of…
External link:
http://arxiv.org/abs/2404.12387
Author:
Fu, Deqing, Guo, Ruohao, Khalighinejad, Ghazal, Liu, Ollie, Dhingra, Bhuwan, Yogatama, Dani, Jia, Robin, Neiswanger, Willie
Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities change depending on the input modality? In this work, we propose $\textbf{IsoBench}$, a benchmark…
External link:
http://arxiv.org/abs/2404.01266
Author:
Chiang, Ting-Rui, Yogatama, Dani
Many existing theoretical analyses of in-context learning for natural language processing are based on latent variable models that leave gaps between theory and practice. We aim to close these gaps by proposing a theoretical framework, the Pelican Soup…
External link:
http://arxiv.org/abs/2402.10424
The potential of large language models (LLMs) as decision support tools is increasingly being explored in fields such as business, engineering, and medicine, which often face challenging tasks of decision-making under uncertainty. In this paper, we…
External link:
http://arxiv.org/abs/2402.02392
Author:
Chiang, Ting-Rui, Yu, Xinyan Velocity, Robinson, Joshua, Liu, Ollie, Lee, Isabelle, Yogatama, Dani
Augmenting a language model (LM) with $k$-nearest neighbors ($k$NN) retrieval on its training data alone can decrease its perplexity, though the underlying reasons for this remain elusive. In this work, we rule out one previously posited possibility…
External link:
http://arxiv.org/abs/2311.09615
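For context, the retrieval setup this abstract refers to is the $k$NN-LM of Khandelwal et al. (2020): the base model's next-token distribution is interpolated with a distribution built from the $k$ nearest stored training contexts. Below is a minimal Python sketch of that interpolation; the function name, array shapes, and the fixed interpolation weight `lam` are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of kNN-LM interpolation (Khandelwal et al., 2020).
import numpy as np

def knn_lm_probs(lm_probs, query, keys, values, vocab_size,
                 k=8, temp=1.0, lam=0.25):
    """Interpolate base LM probabilities with a kNN distribution.

    lm_probs: (vocab_size,) next-token distribution from the base LM.
    query:    (d,) hidden state for the current context.
    keys:     (n, d) datastore of context representations from training data.
    values:   (n,) token id that followed each stored context.
    """
    # Squared L2 distance from the query to every stored context.
    dists = np.sum((keys - query) ** 2, axis=-1)
    # Keep the k nearest neighbors.
    idx = np.argsort(dists)[:k]
    # Turn negative distances into a probability over the k neighbors.
    w = np.exp(-dists[idx] / temp)
    w /= w.sum()
    # Scatter neighbor mass onto the tokens they predict.
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, values[idx], w)
    # Fixed-weight mixture of the two distributions.
    return lam * knn_probs + (1.0 - lam) * lm_probs
```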
Author:
Chiang, Ting-Rui, Yogatama, Dani
We analyze the masked language modeling pretraining objective function from the perspective of the distributional hypothesis. We investigate whether the better sample efficiency and the better generalization capability of models pretrained with masked language modeling…
External link:
http://arxiv.org/abs/2310.16261
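As a quick illustration of the objective under analysis, here is a toy Python sketch of masked language modeling: mask a fraction of input tokens and train the model to recover them from the surrounding context. It assumes a hypothetical `model(inputs)` that returns per-position vocabulary logits; the full BERT recipe (which also replaces some selected positions with random or unchanged tokens) is omitted.

```python
# Toy sketch of the masked language modeling (MLM) objective.
import torch
import torch.nn.functional as F

def mlm_loss(model, token_ids, mask_token_id, mask_prob=0.15):
    """token_ids: (batch, seq_len) integer tensor of input tokens."""
    # Choose which positions to mask.
    mask = torch.rand(token_ids.shape) < mask_prob
    inputs = token_ids.clone()
    inputs[mask] = mask_token_id
    # The model predicts a vocabulary distribution at every position.
    logits = model(inputs)  # (batch, seq_len, vocab_size)
    # The loss is computed only at the masked positions, so the model
    # must reconstruct each hidden token from its context alone.
    return F.cross_entropy(logits[mask], token_ids[mask])
```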