Showing 1 - 10 of 58 for search: '"Oguz, Barlas"'
Author:
Lin, Sheng-Chieh, Gao, Luyu, Oguz, Barlas, Xiong, Wenhan, Lin, Jimmy, Yih, Wen-tau, Chen, Xilun
Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the …
External link:
http://arxiv.org/abs/2405.01525
The study explores the effectiveness of the Chain-of-Thought approach, known for its proficiency in language tasks by breaking them down into sub-tasks and intermediate steps, in improving vision-language tasks that demand sophisticated perception and …
External link:
http://arxiv.org/abs/2311.09193
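The entry above describes Chain-of-Thought prompting, which elicits intermediate reasoning steps before the final answer. A generic illustration of the prompt format (an assumption for clarity, not the paper's actual prompt):

    # A generic Chain-of-Thought prompt: ask for intermediate steps
    # before the final answer (illustrative, not taken from the paper).
    prompt = (
        "Q: A tray holds 3 rows of 4 cookies. Two cookies are eaten. "
        "How many remain?\n"
        "A: Let's think step by step. 3 rows x 4 cookies = 12 cookies. "
        "12 - 2 = 10. The answer is 10."
    )
    print(prompt)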
In recent years, advances in the large-scale pretraining of language and text-to-image models have revolutionized the field of machine learning. Yet, integrating these two modalities into a single, robust model capable of generating seamless multimodal …
External link:
http://arxiv.org/abs/2309.15564
Author:
Xiong, Wenhan, Liu, Jingyu, Molybog, Igor, Zhang, Hejia, Bhargava, Prajjwal, Hou, Rui, Martin, Louis, Rungta, Rashi, Sankararaman, Karthik Abinav, Oguz, Barlas, Khabsa, Madian, Fang, Han, Mehdad, Yashar, Narang, Sharan, Malik, Kshitiz, Fan, Angela, Bhosale, Shruti, Edunov, Sergey, Lewis, Mike, Wang, Sinong, Ma, Hao
We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled …
External link:
http://arxiv.org/abs/2309.16039
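The abstract above mentions continual pretraining on a dataset where long texts are upsampled. A minimal sketch of what length-based upsampling of a corpus could look like (the threshold and boost factor here are illustrative assumptions, not the paper's actual data mixture):

    import random

    def upsample_long_documents(docs, length_threshold=8192, boost=4):
        """Return a sampling pool where documents longer than
        `length_threshold` tokens appear `boost` times as often.
        Illustrative only; the paper's mixture may differ."""
        pool = []
        for doc in docs:
            copies = boost if len(doc["tokens"]) >= length_threshold else 1
            pool.extend([doc] * copies)
        return pool

    corpus = [
        {"id": "short", "tokens": list(range(1_000))},
        {"id": "long", "tokens": list(range(32_768))},
    ]
    pool = upsample_long_documents(corpus)
    print(random.choice(pool)["id"])  # "long" is drawn 4x as often as "short"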
Author:
Jawahar, Ganesh, Yang, Haichuan, Xiong, Yunyang, Liu, Zechun, Wang, Dilin, Sun, Fei, Li, Meng, Pappu, Aasish, Oguz, Barlas, Abdul-Mageed, Muhammad, Lakshmanan, Laks V. S., Krishnamoorthi, Raghuraman, Chandra, Vikas
Weight-sharing supernets are crucial for performance estimation in cutting-edge neural architecture search (NAS) frameworks. Despite their ability to generate diverse subnetworks without retraining, the quality of these subnetworks is not guaranteed …
External link:
http://arxiv.org/abs/2306.04845
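Weight-sharing supernets typically realize a subnetwork by slicing the largest layer's weights, so every candidate architecture reuses the same parameters. A minimal sketch of that idea for a single linear layer (class name and dimensions are assumptions for illustration, not the paper's design):

    import torch
    import torch.nn as nn

    class SuperLinear(nn.Module):
        """A linear layer whose smaller subnetworks reuse (share) a slice
        of one large weight matrix, as in weight-sharing supernets."""
        def __init__(self, in_features, max_out_features):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(max_out_features, in_features))
            self.bias = nn.Parameter(torch.zeros(max_out_features))

        def forward(self, x, out_features):
            # Subnetwork = the first `out_features` rows of the shared weight.
            w = self.weight[:out_features]
            b = self.bias[:out_features]
            return torch.nn.functional.linear(x, w, b)

    layer = SuperLinear(in_features=16, max_out_features=64)
    x = torch.randn(2, 16)
    print(layer(x, out_features=32).shape)  # torch.Size([2, 32])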
Ternary and binary neural networks enable multiplication-free computation and promise multiple orders of magnitude efficiency gains over full-precision networks if implemented on specialized hardware. However, since both the parameter and the output …
External link:
http://arxiv.org/abs/2306.01841
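To see why ternary weights enable multiplication-free computation: each weight is constrained to {-1, 0, +1} times a shared scale, so a dot product reduces to additions and subtractions plus one final scalar multiply. A sketch of a common generic ternarization scheme (the threshold and scaling choices are assumptions, not necessarily the paper's method):

    import torch

    def ternarize(w, threshold_ratio=0.7):
        """Map full-precision weights to {-1, 0, +1} times a scalar scale.
        Generic scheme: zero out small weights, keep signs of the rest."""
        threshold = threshold_ratio * w.abs().mean()
        mask = (w.abs() > threshold).float()
        ternary = torch.sign(w) * mask                       # values in {-1, 0, +1}
        scale = (w.abs() * mask).sum() / mask.sum().clamp(min=1)
        return ternary, scale

    w = torch.randn(4, 4)
    t, s = ternarize(w)
    x = torch.randn(4)
    # The matmul with `t` needs only adds/subtracts on specialized hardware;
    # one multiply by the scalar `s` recovers the magnitude.
    print(t @ x * s)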
Author:
Liu, Zechun, Oguz, Barlas, Zhao, Changsheng, Chang, Ernie, Stock, Pierre, Mehdad, Yashar, Shi, Yangyang, Krishnamoorthi, Raghuraman, Chandra, Vikas
Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits. We find that these methods break down at lower bit precision, and investigate quantization aware training …
External link:
http://arxiv.org/abs/2305.17888
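Quantization-aware training generally simulates low-bit arithmetic in the forward pass while letting gradients flow through the rounding via a straight-through estimator. A minimal sketch of that core trick (bit width and scaling are illustrative; this is not the paper's full method):

    import torch

    def fake_quantize(w, bits=4):
        """Simulate symmetric `bits`-bit quantization in the forward pass.
        The (w_q - w).detach() + w trick makes the backward pass treat
        the rounding as identity (straight-through estimator)."""
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
        return (w_q - w).detach() + w

    w = torch.randn(3, 3, requires_grad=True)
    loss = fake_quantize(w, bits=4).pow(2).sum()
    loss.backward()
    print(w.grad.shape)  # gradients flow despite the non-differentiable rounding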
3D human modeling has been widely used for engaging interaction in gaming, film, and animation. The customization of these characters is crucial for creativity and scalability, which highlights the importance of controllability. In this work, we introduce …
External link:
http://arxiv.org/abs/2305.14312
We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental …
External link:
http://arxiv.org/abs/2305.03204
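The two-stage idea above: an encoder-decoder is first pre-trained on image-text pairs, then adapted to video by treating sampled frames as a sequence of images whose tokens the text decoder attends to. A toy sketch of that frame-level adaptation (the class, encoder, and dimensions are assumptions for illustration, not the paper's architecture):

    import torch
    import torch.nn as nn

    class TinyVideoCaptioner(nn.Module):
        """Illustrative only: encode each sampled frame with an image
        encoder (pre-trained on image-text data in stage one), then
        decode text autoregressively against the frame tokens."""
        def __init__(self, d_model=64, vocab=1000):
            super().__init__()
            self.frame_encoder = nn.Linear(3 * 32 * 32, d_model)  # stand-in encoder
            self.decoder = nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True),
                num_layers=2,
            )
            self.embed = nn.Embedding(vocab, d_model)
            self.lm_head = nn.Linear(d_model, vocab)

        def forward(self, frames, text_ids):
            # frames: (batch, num_frames, 3*32*32); text_ids: (batch, seq)
            memory = self.frame_encoder(frames)        # one token per frame
            h = self.decoder(self.embed(text_ids), memory)
            return self.lm_head(h)

    model = TinyVideoCaptioner()
    logits = model(torch.randn(2, 8, 3 * 32 * 32), torch.randint(0, 1000, (2, 5)))
    print(logits.shape)  # torch.Size([2, 5, 1000])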
Author:
Zala, Abhay, Cho, Jaemin, Kottur, Satwik, Chen, Xilun, Oğuz, Barlas, Mehdad, Yashar, Bansal, Mohit
There is growing interest in searching for information from large video corpora. Prior works have studied relevant tasks, such as text-based video retrieval, moment retrieval, video summarization, and video captioning in isolation, without an end-to-end …
External link:
http://arxiv.org/abs/2303.16406