Showing 1 - 10 of 2,172 results for the search '"Caccia, P."'
Author:
Islah, Nizar, Gehring, Justine, Misra, Diganta, Muller, Eilif, Rish, Irina, Zhuo, Terry Yue, Caccia, Massimo
The rapid evolution of software libraries presents a significant challenge for code generation models, which must adapt to frequent version updates while maintaining compatibility with previous versions. Existing code completion benchmarks often over…
External link:
http://arxiv.org/abs/2411.05830
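The version-compatibility problem this abstract describes can be made concrete with a small sketch. The function below is invented for illustration and is not from the paper or its benchmark, but the underlying API change is real: the np.float alias was deprecated in NumPy 1.20 and removed in 1.24, so code generated against the old API must be version-gated to keep working.

from importlib.metadata import version
import numpy as np

def to_float_array(xs):
    # Version-gated conversion: np.float was removed in NumPy 1.24,
    # so generated code must branch on the installed version.
    major, minor = (int(p) for p in version("numpy").split(".")[:2])
    if (major, minor) >= (1, 24):
        return np.asarray(xs, dtype=float)    # post-removal spelling
    return np.asarray(xs, dtype=np.float)     # legacy alias, pre-1.24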
Generating random bit streams is required in various applications, most notably cyber-security. Ensuring high-quality and robust randomness is crucial to mitigate risks associated with predictability and system compromise. True random numbers provide…
External link:
http://arxiv.org/abs/2409.05543
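As a minimal illustration of the distinction the abstract draws (generic Python, not the paper's method), the sketch below contrasts cryptographically strong randomness from the operating system's CSPRNG with a seeded pseudo-random generator, whose predictability is exactly the risk being described.

import random
import secrets

def random_bits(n_bits: int) -> str:
    # Cryptographically strong bits drawn from the OS CSPRNG via `secrets`.
    n_bytes = (n_bits + 7) // 8
    value = int.from_bytes(secrets.token_bytes(n_bytes), "big")
    return format(value, f"0{n_bytes * 8}b")[:n_bits]

print(random_bits(32))          # unpredictable across runs
random.seed(0)
print(random.getrandbits(32))   # identical on every run: fine for simulation,
                                # unsuitable for keys or nonces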
Author:
Yadav, Prateek, Raffel, Colin, Muqeeth, Mohammed, Caccia, Lucas, Liu, Haokun, Chen, Tianlong, Bansal, Mohit, Choshen, Leshem, Sordoni, Alessandro
The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task. Model MoErging methods aim to recycle expert models to create an aggregate system with impro…
External link:
http://arxiv.org/abs/2408.07057
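To make the idea of recycling experts concrete, here is a deliberately naive baseline: uniform parameter averaging of same-architecture experts. The MoErging methods this survey covers instead route inputs among experts, so the sketch only conveys the setup, not the surveyed techniques; the checkpoint names are hypothetical.

import torch

def average_experts(expert_state_dicts):
    # Average each parameter tensor across same-architecture experts.
    return {
        name: torch.stack(
            [sd[name].float() for sd in expert_state_dicts]
        ).mean(dim=0)
        for name in expert_state_dicts[0]
    }

# Hypothetical usage: merge two expert checkpoints into the base model.
# merged = average_experts([torch.load("math.pt"), torch.load("code.pt")])
# base_model.load_state_dict(merged)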
Author:
Boisvert, Léo, Thakkar, Megh, Gasse, Maxime, Caccia, Massimo, De Chezelles, Thibault Le Sellier, Cappart, Quentin, Chapados, Nicolas, Lacoste, Alexandre, Drouin, Alexandre
The ability of large language models (LLMs) to mimic human-like intelligence has led to a surge in LLM-based autonomous agents. Though recent LLMs seem capable of planning and reasoning given user instructions, their effectiveness in applying these c…
External link:
http://arxiv.org/abs/2407.05291
Author:
Ostapenko, Oleksiy, Su, Zhan, Ponti, Edoardo Maria, Charlin, Laurent, Roux, Nicolas Le, Pereira, Matheus, Caccia, Lucas, Sordoni, Alessandro
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given mult…
External link:
http://arxiv.org/abs/2405.11157
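A simplified sketch of the adapter-library setup: choose the trained adapter whose stored task embedding is closest to an embedding of the new task. The paper studies more refined routing and composition; the nearest-neighbour lookup, the embedding dimension, and the adapter names below are all illustrative assumptions.

import torch
import torch.nn.functional as F

def route_to_adapter(task_embedding, library):
    # library: adapter name -> stored task embedding (same dimension).
    names = list(library)
    sims = torch.stack(
        [F.cosine_similarity(task_embedding, library[n], dim=0) for n in names]
    )
    return names[int(sims.argmax())]

# Hypothetical usage: embeddings might be averaged hidden states of
# each task's training examples.
library = {"sentiment": torch.randn(768), "ner": torch.randn(768)}
print(route_to_adapter(torch.randn(768), library))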
Author:
Drouin, Alexandre, Gasse, Maxime, Caccia, Massimo, Laradji, Issam H., Del Verme, Manuel, Marty, Tom, Boisvert, Léo, Thakkar, Megh, Cappart, Quentin, Vazquez, David, Chapados, Nicolas, Lacoste, Alexandre
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuring the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterp…
External link:
http://arxiv.org/abs/2403.07718
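The evaluation setting can be pictured as an observe-act loop between an LLM policy and a browser environment. Every name in this sketch (env, llm_policy, the action strings) is a hypothetical stand-in, not the benchmark's actual API.

def run_episode(env, llm_policy, max_steps=20):
    # obs: a text rendering of the page (e.g. an accessibility tree)
    # plus the task goal; action: a command string the env can parse.
    obs = env.reset()
    for _ in range(max_steps):
        action = llm_policy(obs)      # e.g. 'click(id=42)' or 'type(id=7, "x")'
        obs, done = env.step(action)
        if done:
            return True               # task completed within the step budget
    return False                      # ran out of steps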
Author:
Wang, Xinyi, Caccia, Lucas, Ostapenko, Oleksiy, Yuan, Xingdi, Wang, William Yang, Sordoni, Alessandro
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely heavily o…
External link:
http://arxiv.org/abs/2310.05707
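A minimal sketch of chain-of-thought prompting as commonly practiced: the prompt includes a worked exemplar with intermediate reasoning before the answer. The exemplar is invented for illustration, not drawn from the paper.

def cot_prompt(question: str) -> str:
    # One worked exemplar showing intermediate reasoning, then the query.
    exemplar = (
        "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
        "A: 12 pens is 4 groups of 3. Each group costs $2, so 4 * 2 = $8. "
        "The answer is $8.\n"
    )
    return exemplar + f"Q: {question}\nA: Let's think step by step."

print(cot_prompt("A train travels 60 km in 45 minutes. What is its speed in km/h?"))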
Federated Learning (FL) is an emerging paradigm that allows a model to be trained across a number of participants without sharing data. Recent works have begun to consider the effects of using pre-trained models as an initialization point for existin…
External link:
http://arxiv.org/abs/2306.03937
Author:
Caccia, Massimo, Galashov, Alexandre, Douillard, Arthur, Rannen-Triki, Amal, Rao, Dushyant, Paganini, Michela, Charlin, Laurent, Ranzato, Marc'Aurelio, Pascanu, Razvan
The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to…
External link:
http://arxiv.org/abs/2304.13164
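One widely used way to reduce the computational and memory cost of adapting a large pretrained model is low-rank adaptation (LoRA), sketched generically below; this is a common technique in the area, not claimed to be this paper's specific method. The frozen base weight W gets a learned low-rank update B @ A, so only a small fraction of parameters is trained.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # freeze pretrained W
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # delta starts at 0

    def forward(self, x):
        return self.base(x) + x @ (self.B @ self.A).T

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12,288 trainable vs. ~590k frozen parameters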
In Federated Learning, a global model is learned by aggregating model updates computed at a set of independent client nodes; to reduce communication costs, multiple gradient steps are performed at each node prior to aggregation. A key challenge in thi…
External link:
http://arxiv.org/abs/2304.05260
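The procedure this abstract describes, several local gradient steps per client followed by server-side averaging, corresponds to FedAvg-style training. Below is a minimal sketch under placeholder models and data loaders; it illustrates the setup rather than this paper's contribution.

import copy
import torch
import torch.nn.functional as F

def local_update(model, loader, steps, lr=0.01):
    # Several local SGD steps per round to save communication.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    data = iter(loader)
    for _ in range(steps):
        x, y = next(data)             # assumes the loader yields enough batches
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_loaders, local_steps=5):
    states = [
        local_update(copy.deepcopy(global_model), loader, local_steps)
        for loader in client_loaders  # each client starts from the global model
    ]
    avg = {k: torch.stack([s[k].float() for s in states]).mean(dim=0)
           for k in states[0]}
    global_model.load_state_dict(avg)  # server-side aggregation
    return global_model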