Showing 1 - 10 of 8,287 for search: '"A. Catanzaro"'
Author:
Messina, S., Catanzaro, G., Lanza, A. F., Gandolfi, D., Serrano, M. M., Deeg, H. J., Garcia-Alvarez, D.
RACE-OC (Rotation and ACtivity Evolution in Open Clusters) is a project aimed at characterising the rotational and magnetic activity properties of the late-type members of open clusters, stellar associations, and moving groups of different ages. As p…
External link:
http://arxiv.org/abs/2408.16328
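Rotation studies like the one above typically derive stellar rotation periods from photometric time series with a periodogram. Below is a minimal, self-contained sketch using a Lomb-Scargle periodogram on synthetic data; it illustrates the general technique only and is not RACE-OC's actual pipeline.

import numpy as np
from astropy.timeseries import LombScargle

# Synthetic light curve: a 4.3-day spot-modulation signal plus noise.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 60, 300))   # observation times in days
true_period = 4.3
y = 0.02 * np.sin(2 * np.pi * t / true_period)
y += 0.005 * rng.standard_normal(t.size)

# Lomb-Scargle periodogram over an automatically chosen frequency grid.
frequency, power = LombScargle(t, y).autopower()
best_period = 1 / frequency[np.argmax(power)]
print(f"recovered rotation period: {best_period:.2f} d")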
Author:
Shi, Min, Liu, Fuxiao, Wang, Shihao, Liao, Shijia, Radhakrishnan, Subhashree, Huang, De-An, Yin, Hongxu, Sapra, Karan, Yacoob, Yaser, Shi, Humphrey, Catanzaro, Bryan, Tao, Andrew, Kautz, Jan, Yu, Zhiding, Liu, Guilin
The ability to accurately interpret complex visual information is a crucial topic of multimodal large language models (MLLMs). Recent work indicates that enhanced visual perception significantly reduces hallucinations and improves performance on reso…
External link:
http://arxiv.org/abs/2408.15998
Author:
Sreenivas, Sharath Turuvekere, Muralidharan, Saurav, Joshi, Raviraj, Chochowski, Marcin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan, Kautz, Jan, Molchanov, Pavlo
We present a comprehensive report on compressing the Llama 3.1 8B and Mistral NeMo 12B models to 4B and 8B parameters, respectively, using pruning and distillation. We explore two distinct pruning strategies: (1) depth pruning and (2) joint hidden/at…
External link:
http://arxiv.org/abs/2408.11796
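Of the two strategies named in the abstract above, depth pruning is the simpler to picture: whole Transformer layers are removed, and the shallower model is then recovered with further training. A minimal sketch on a toy PyTorch encoder follows; the every-other-layer choice is an illustrative assumption, not the paper's layer-selection criterion.

import torch
import torch.nn as nn

# Toy 8-layer Transformer encoder standing in for a real LLM.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=8,
)

# Depth pruning: keep a subset of layers (here simply every other one).
kept = nn.ModuleList(layer for i, layer in enumerate(encoder.layers) if i % 2 == 0)
encoder.layers = kept
encoder.num_layers = len(kept)

x = torch.randn(2, 16, 64)      # (batch, sequence, d_model)
print(encoder(x).shape)         # same output shape, half the depth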
Large Language Models (LLMs) show promise in code generation tasks. However, their code-writing abilities are often limited in scope: while they can successfully implement simple functions, they struggle with more complex tasks. A fundamental differe…
External link:
http://arxiv.org/abs/2407.19055
Author:
Muralidharan, Saurav, Sreenivas, Sharath Turuvekere, Joshi, Raviraj, Chochowski, Marcin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan, Kautz, Jan, Molchanov, Pavlo
Large language models (LLMs) targeting different deployment scales and sizes are currently produced by training each variant from scratch; this is extremely compute-intensive. In this paper, we investigate if pruning an existing LLM and then re-train…
External link:
http://arxiv.org/abs/2407.14679
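Retraining a pruned model is commonly done with knowledge distillation from the original network. The sketch below shows a plain temperature-scaled KL loss between teacher and student logits; this is the generic recipe, given here as an assumption rather than the paper's exact objective.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target KL divergence with temperature T (scaled by T^2)."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

# Dummy logits: batch of 4 positions over a 32k-token vocabulary.
student = torch.randn(4, 32000)
teacher = torch.randn(4, 32000)
print(distillation_loss(student, teacher).item())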
In this work, we introduce ChatQA 2, a Llama3-based model designed to bridge the gap between open-access LLMs and leading proprietary models (e.g., GPT-4-Turbo) in long-context understanding and retrieval-augmented generation (RAG) capabilities. Thes…
External link:
http://arxiv.org/abs/2407.14482
As language models have scaled both their number of parameters and pretraining dataset sizes, the computational cost for pretraining has become intractable except for the most well-resourced teams. This increasing cost makes it ever more important to…
External link:
http://arxiv.org/abs/2407.07263
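The scale of the cost this abstract refers to is easy to estimate with the standard C ≈ 6·N·D FLOPs rule of thumb (N parameters, D training tokens). The numbers below are illustrative and not taken from the paper.

def pretraining_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute via the common 6*N*D estimate."""
    return 6.0 * n_params * n_tokens

n_params = 8e9     # an 8B-parameter model
n_tokens = 1e13    # trained on 10T tokens
print(f"~{pretraining_flops(n_params, n_tokens):.1e} FLOPs")  # ~4.8e+23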
Author:
Parmar, Jupinder, Prabhumoye, Shrimai, Jennings, Joseph, Liu, Bo, Jhunjhunwala, Aastha, Wang, Zhilin, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan
The impressive capabilities of recent language models can be largely attributed to the multi-trillion token pretraining datasets that they are trained on. However, model developers fail to disclose their construction methodology, which has led to a l…
External link:
http://arxiv.org/abs/2407.06380
Author:
Yu, Yue, Ping, Wei, Liu, Zihan, Wang, Boxin, You, Jiaxuan, Zhang, Chao, Shoeybi, Mohammad, Catanzaro, Bryan
Large language models (LLMs) typically utilize the top-k contexts from a retriever in retrieval-augmented generation (RAG). In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual…
External link:
http://arxiv.org/abs/2407.02485
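For context, the baseline setup RankRAG builds on is: retrieve the top-k passages for a query, then hand them to the LLM as context. The toy sketch below scores passages by token overlap purely for illustration; a real system would use a dense retriever, and RankRAG's contribution is having the same instruction-tuned LLM also rerank the candidates.

def top_k_contexts(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive token overlap with the query (toy retriever)."""
    q_tokens = set(query.lower().split())
    return sorted(passages, key=lambda p: -len(q_tokens & set(p.lower().split())))[:k]

passages = [
    "Retrieval-augmented generation feeds retrieved passages to an LLM.",
    "Open clusters are groups of stars with a common age.",
    "Reranking reorders retrieved candidates before generation.",
]
query = "How does retrieval-augmented generation use retrieved passages?"
context = "\n".join(top_k_contexts(query, passages))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)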
Author:
Kong, Zhifeng, Lee, Sang-gil, Ghosal, Deepanway, Majumder, Navonil, Mehrish, Ambuj, Valle, Rafael, Poria, Soujanya, Catanzaro, Bryan
It is an open challenge to obtain high-quality training data, especially captions, for text-to-audio models. Although prior methods have leveraged text-only language models to augment and improve captions, such methods have limitations relat…
External link:
http://arxiv.org/abs/2406.15487
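The caption-augmentation idea this last abstract mentions amounts to passing a raw, terse audio caption through a text-only LM with a rewriting instruction. The template below is a hypothetical illustration of such a prompt, not the paper's method; it only builds the prompt string and leaves the choice of LM open.

def caption_rewrite_prompt(raw_caption: str) -> str:
    """Build a rewriting prompt for a text-only LM (hypothetical template)."""
    return (
        "Rewrite the following audio caption into one fluent sentence. "
        "Do not invent sounds that are not listed.\n"
        f"Caption: {raw_caption}\nRewritten:"
    )

print(caption_rewrite_prompt("dog barking, distant thunder, rain on window"))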