Showing 1 - 10 of 177
for search: '"Jaitly, Navdeep"'
Author:
Sampaio, Georgia Gabriela, Zhang, Ruixiang, Zhai, Shuangfei, Gu, Jiatao, Susskind, Josh, Jaitly, Navdeep, Zhang, Yizhe
Evaluating text-to-image generative models remains a challenge despite the remarkable progress in their overall performance. While existing metrics like CLIPScore work for coarse evaluations, they lack the sensitivity to distinguish fine…
External link:
http://arxiv.org/abs/2411.02437
Large pretrained vision-language models like CLIP have shown promising generalization capability, but may struggle in specialized domains (e.g., satellite imagery) or fine-grained classification (e.g., car models) where the visual concepts are unseen…
External link:
http://arxiv.org/abs/2410.23698
Author:
Gu, Jiatao, Wang, Yuyang, Zhang, Yizhe, Zhang, Qihang, Zhang, Dinghuai, Jaitly, Navdeep, Susskind, Josh, Zhai, Shuangfei
Diffusion models have become the dominant approach for visual generation. They are trained by denoising a Markovian process that gradually adds noise to the input. We argue that the Markovian property limits the model's ability to fully utilize the ge…
External link:
http://arxiv.org/abs/2410.08159
Author:
D'Ambrosio, David B., Abeyruwan, Saminda, Graesser, Laura, Iscen, Atil, Amor, Heni Ben, Bewley, Alex, Reed, Barney J., Reymann, Krista, Takayama, Leila, Tassa, Yuval, Choromanski, Krzysztof, Coumans, Erwin, Jain, Deepali, Jaitly, Navdeep, Jaques, Natasha, Kataoka, Satoshi, Kuang, Yuheng, Lazic, Nevena, Mahjourian, Reza, Moore, Sherry, Oslund, Kenneth, Shankar, Anish, Sindhwani, Vikas, Vanhoucke, Vincent, Vesom, Grace, Xu, Peng, Sanketi, Pannag R.
Achieving human-level speed and performance on real-world tasks is a north star for the robotics research community. This work takes a step towards that goal and presents the first learned robot agent that reaches amateur human-level performance in c…
External link:
http://arxiv.org/abs/2408.03906
Author:
Bai, He, Likhomanenko, Tatiana, Zhang, Ruixiang, Gu, Zijin, Aldeneh, Zakaria, Jaitly, Navdeep
Large language models have revolutionized natural language processing by leveraging self-supervised pretraining on vast textual data. Inspired by this success, researchers have investigated complicated speech tokenization methods to discretize contin…
External link:
http://arxiv.org/abs/2407.15835
Author:
Zhang, Dinghuai, Zhang, Yizhe, Gu, Jiatao, Zhang, Ruixiang, Susskind, Josh, Jaitly, Navdeep, Zhai, Shuangfei
Diffusion models have become the de facto approach for generating visual data and are trained to match the distribution of the training dataset. In addition, we want to control generation to fulfill desired properties such as alignment to a t…
External link:
http://arxiv.org/abs/2406.00633
Diffusion models have emerged as a powerful tool for generating high-quality images from textual descriptions. Despite their successes, these models often exhibit limited diversity in the sampled images, particularly when sampling with a high classif…
External link:
http://arxiv.org/abs/2405.21048
Author:
Gu, Zijin, Likhomanenko, Tatiana, Bai, He, McDermott, Erik, Collobert, Ronan, Jaitly, Navdeep
Language models (LMs) have long been used to improve the results of automatic speech recognition (ASR) systems, but they are unaware of the errors that ASR systems make. Error correction models are designed to fix ASR errors; however, they have shown little…
External link:
http://arxiv.org/abs/2405.15216
Author:
Zhang, Yizhe, Bai, He, Zhang, Ruixiang, Gu, Jiatao, Zhai, Shuangfei, Susskind, Josh, Jaitly, Navdeep
Vision-Language Models (VLMs) have recently demonstrated incredible strides on diverse vision-language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blind spots in the…
External link:
http://arxiv.org/abs/2403.04732
Author:
Wu, Zhuofeng, Bai, He, Zhang, Aonan, Gu, Jiatao, Vydiswaran, VG Vinod, Jaitly, Navdeep, Zhang, Yizhe
Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first. In this paper, we devise a similar strategy that breaks down reasoning tasks into a…
External link:
http://arxiv.org/abs/2402.15000