Showing 1 - 10 of 856 for search: '"Tau, P."'
Author:
Liang, Weixin, Yu, Lili, Luo, Liang, Iyer, Srinivasan, Dong, Ning, Zhou, Chunting, Ghosh, Gargi, Lewis, Mike, Yih, Wen-tau, Zettlemoyer, Luke, Lin, Xi Victoria
The development of large language models (LLMs) has expanded to multi-modal systems capable of processing text, images, and speech within a unified framework. Training these models demands significantly larger datasets and computational resources compared to text-only LLMs…
External link:
http://arxiv.org/abs/2411.04996
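A minimal sketch of the core idea named in the snippet above: modality-specific parameters inside an otherwise shared transformer block, so that a mixed text/image/speech sequence shares global self-attention but each modality gets its own feed-forward weights. The class, module names, and per-token routing scheme here are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class ModalityRoutedBlock(nn.Module):
    """Sketch of a sparse multi-modal block: shared attention, per-modality FFNs."""

    def __init__(self, d_model: int, n_heads: int,
                 modalities=("text", "image", "speech")):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # One feed-forward network per modality (names are illustrative).
        self.ffn = nn.ModuleDict({
            m: nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                             nn.Linear(4 * d_model, d_model))
            for m in modalities
        })

    def forward(self, x: torch.Tensor, modality_ids: list) -> torch.Tensor:
        # Global self-attention over the full mixed-modality sequence.
        h, _ = self.attn(x, x, x)
        x = x + h
        # Route each position through the feed-forward net of its modality.
        out = x.clone()
        for i, m in enumerate(modality_ids):
            out[:, i] = x[:, i] + self.ffn[m](x[:, i])
        return out
```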
Author:
Xu, Hu, Huang, Po-Yao, Tan, Xiaoqing Ellen, Yeh, Ching-Feng, Kahn, Jacob, Jou, Christine, Ghosh, Gargi, Levy, Omer, Zettlemoyer, Luke, Yih, Wen-tau, Li, Shang-Wen, Xie, Saining, Feichtenhofer, Christoph
This paper focuses on creating synthetic data to improve the quality of image captions. Existing works typically have two shortcomings. First, they caption images from scratch, ignoring existing alt-text metadata; second, they lack transparency if the captioners' training data (e.g. GPT) is unknown…
External link:
http://arxiv.org/abs/2410.17251
Author:
Yang, Xiao, Sun, Kai, Xin, Hao, Sun, Yushi, Bhalla, Nikita, Chen, Xiangsen, Choudhary, Sajal, Gui, Rongze Daniel, Jiang, Ziran Will, Jiang, Ziyu, Kong, Lingkun, Moran, Brian, Wang, Jiaqi, Xu, Yifan Ethan, Yan, An, Yang, Chenyu, Yuan, Eting, Zha, Hanwen, Tang, Nan, Chen, Lei, Scheffer, Nicolas, Liu, Yue, Shah, Nirav, Wanga, Rakesh, Kumar, Anuj, Yih, Wen-tau, Dong, Xin Luna
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate large language models' (LLMs') lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks…
External link:
http://arxiv.org/abs/2406.04744
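For readers unfamiliar with the retrieve-then-generate pattern this benchmark evaluates, a minimal loop looks roughly like the sketch below; `embed` and `generate` stand in for an embedding model and an LLM, and all names are placeholders rather than anything from the paper:

```python
import numpy as np

def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray,
             docs: list, k: int = 3) -> list:
    # Cosine similarity between the query and every document embedding.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return [docs[i] for i in np.argsort(-sims)[:k]]

def rag_answer(question: str, embed, generate,
               docs: list, doc_vecs: np.ndarray) -> str:
    # Fetch the top-k passages, then condition the LLM's answer on them.
    context = "\n".join(retrieve(embed(question), doc_vecs, docs))
    return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```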
Author:
Li, Minghan, Chen, Xilun, Holtzman, Ari, Chen, Beidi, Lin, Jimmy, Yih, Wen-tau, Lin, Xi Victoria
Large language models (LLMs) often hallucinate and lack the ability to provide attribution for their generations. Semi-parametric LMs, such as kNN-LM, approach these limitations by refining the output of an LM for a given prompt using its nearest neighbor matches in a non-parametric data store…
External link:
http://arxiv.org/abs/2405.19325
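The kNN-LM baseline named in the snippet refines an LM's next-token distribution by interpolating it with a distribution built from nearest-neighbor lookups in a datastore of (context vector, next token) pairs. A rough sketch of that interpolation, with illustrative hyperparameters (`k`, `lam`, `temp`) not taken from this paper:

```python
import numpy as np

def knn_lm_probs(p_lm: np.ndarray, query: np.ndarray,
                 keys: np.ndarray, values: np.ndarray,
                 k: int = 8, lam: float = 0.25, temp: float = 1.0) -> np.ndarray:
    # keys: (N, d) stored context vectors; values: (N,) their next-token ids.
    dists = np.linalg.norm(keys - query, axis=1)   # L2 distance to each key
    nn = np.argsort(dists)[:k]                     # indices of k nearest neighbors
    w = np.exp(-dists[nn] / temp)
    w /= w.sum()                                   # softmax over negative distance
    p_knn = np.zeros_like(p_lm)
    np.add.at(p_knn, values[nn], w)                # aggregate neighbor votes per token
    return lam * p_knn + (1.0 - lam) * p_lm        # interpolate with the LM's softmax
```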
Author:
Lin, Sheng-Chieh, Gao, Luyu, Oguz, Barlas, Xiong, Wenhan, Lin, Jimmy, Yih, Wen-tau, Chen, Xilun
Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs…
External link:
http://arxiv.org/abs/2405.01525
Author:
Ma, Jiawei, Huang, Po-Yao, Xie, Saining, Li, Shang-Wen, Zettlemoyer, Luke, Chang, Shih-Fu, Yih, Wen-Tau, Xu, Hu
The success of contrastive language-image pretraining (CLIP) relies on the supervision from the pairing between images and captions, which tends to be noisy in web-crawled data. We present Mixture of Data Experts (MoDE) and learn a system of CLIP data experts via clustering…
External link:
http://arxiv.org/abs/2404.16030
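An illustrative sketch of the idea as the snippet describes it: cluster the noisy web captions, train one CLIP "data expert" per cluster, and at inference weight the experts by how close the task is to each cluster center. `caption_vecs`, `train_clip_expert`, and `task_vec` are placeholders, not the paper's API:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_experts(caption_vecs: np.ndarray, n_experts: int, train_clip_expert):
    # Cluster caption embeddings; each cluster defines one expert's training set.
    km = KMeans(n_clusters=n_experts, n_init=10).fit(caption_vecs)
    experts = [train_clip_expert(np.where(km.labels_ == c)[0])
               for c in range(n_experts)]
    return experts, km.cluster_centers_

def route_logits(task_vec: np.ndarray, centers: np.ndarray,
                 expert_logits: np.ndarray) -> np.ndarray:
    # Softmax over task-to-center similarity gives the ensembling weights.
    sims = centers @ task_vec
    w = np.exp(sims - sims.max())
    w /= w.sum()
    return (w[:, None] * expert_logits).sum(axis=0)
```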
Author:
Sukhbaatar, Sainbayar, Golovneva, Olga, Sharma, Vasu, Xu, Hu, Lin, Xi Victoria, Rozière, Baptiste, Kahn, Jacob, Li, Daniel, Yih, Wen-tau, Weston, Jason, Li, Xian
We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning, and world knowledge. Our method, named Branch-Train-MiX (BTX), starts from a seed model, which is branched to train experts in embarrassingly parallel fashion…
External link:
http://arxiv.org/abs/2403.07816
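A rough sketch of the branch-and-mix recipe as the snippet describes it: copy a seed model, train each copy on one domain in parallel, then merge the copies back into a single mixture-of-experts model, with feed-forward weights kept separate as experts and the remaining weights averaged. Shapes, the "ffn" naming convention, and the merge details are illustrative assumptions, not the paper's code:

```python
import copy
import torch

def branch(seed_model, n_domains: int):
    # Each domain gets an identical copy of the seed to train independently.
    return [copy.deepcopy(seed_model) for _ in range(n_domains)]

def mix(trained_branches):
    merged_state = {}
    names = trained_branches[0].state_dict().keys()
    for name in names:
        tensors = torch.stack(
            [b.state_dict()[name] for b in trained_branches])
        if "ffn" in name:
            # Feed-forward weights stay separate as MoE experts
            # (stacked here; a learned router would pick among them).
            merged_state[name] = tensors
        else:
            # Attention/embedding weights are averaged across branches.
            merged_state[name] = tensors.mean(0)
    return merged_state  # router training over the stacked experts would follow
```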
Author:
Asai, Akari, Zhong, Zexuan, Chen, Danqi, Koh, Pang Wei, Zettlemoyer, Luke, Hajishirzi, Hannaneh, Yih, Wen-tau
Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability. However, they still face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability…
External link:
http://arxiv.org/abs/2403.03187
Author:
Jiang, Zhengbao, Sun, Zhiqing, Shi, Weijia, Rodriguez, Pedro, Zhou, Chunting, Neubig, Graham, Lin, Xi Victoria, Yih, Wen-tau, Iyer, Srinivasan
In order for large language model (LLM)-based assistants to effectively adapt to evolving information needs, it must be possible to update their factual knowledge through continued training on new data. The standard recipe for doing so involves continued pre-training on new documents, followed by instruction-tuning on question-answer pairs…
External link:
http://arxiv.org/abs/2402.12847
For the stochastic wave equation, when the dissipative damping is a non-globally Lipschitz function of the velocity, there are few results on the long-time dynamics, in particular the exponential ergodicity and the strong law of large numbers, for the equations…
External link:
http://arxiv.org/abs/2402.01137
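As a hedged illustration of the setting described in the snippet above, a stochastic wave equation with nonlinear velocity damping can be written as follows; the notation (domain D, damping g, noise W) is assumed for exposition and is not taken from the paper:

```latex
% Stochastic wave equation with dissipative damping g acting on the
% velocity \partial_t u; g non-globally Lipschitz, e.g. g(v) = v^3.
\begin{equation*}
  \partial_t^2 u \;=\; \Delta u \;-\; g(\partial_t u) \;+\; \dot{W}(t,x),
  \qquad u\big|_{\partial D} = 0 .
\end{equation*}
% Exponential ergodicity: the Markov process (u, \partial_t u) admits a
% unique invariant measure to which its law converges at an exponential
% rate; the strong law of large numbers concerns time averages along
% trajectories converging to the corresponding ensemble averages.
```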