Showing 1 - 10 of 2,172 results for the search '"Caccia, P."'
Author:
Islah, Nizar, Gehring, Justine, Misra, Diganta, Muller, Eilif, Rish, Irina, Zhuo, Terry Yue, Caccia, Massimo
The rapid evolution of software libraries presents a significant challenge for code generation models, which must adapt to frequent version updates while maintaining compatibility with previous versions. Existing code completion benchmarks often over…
External link:
http://arxiv.org/abs/2411.05830
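The version-compatibility problem this abstract describes can be made concrete with a small sketch. The function below is invented for illustration and is not from the paper or its benchmark, but the underlying API change is real: the np.float alias was deprecated in NumPy 1.20 and removed in 1.24, so code generated against the old API must be version-gated to keep working.

from importlib.metadata import version
import numpy as np

def to_float_array(xs):
    # Version-gated conversion: np.float was removed in NumPy 1.24,
    # so generated code must branch on the installed version.
    major, minor = (int(p) for p in version("numpy").split(".")[:2])
    if (major, minor) >= (1, 24):
        return np.asarray(xs, dtype=float)    # post-removal spelling
    return np.asarray(xs, dtype=np.float)     # legacy alias, pre-1.24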
Generating random bit streams is required in various applications, most notably cyber-security. Ensuring high-quality and robust randomness is crucial to mitigate risks associated with predictability and system compromise. True random numbers provide…
External link:
http://arxiv.org/abs/2409.05543
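As a minimal illustration of the distinction the abstract draws (generic Python, not the paper's method), the sketch below contrasts cryptographically strong randomness from the operating system's CSPRNG with a seeded pseudo-random generator, whose predictability is exactly the risk being described.

import random
import secrets

def random_bits(n_bits: int) -> str:
    # Cryptographically strong bits drawn from the OS CSPRNG via `secrets`.
    n_bytes = (n_bits + 7) // 8
    value = int.from_bytes(secrets.token_bytes(n_bytes), "big")
    return format(value, f"0{n_bytes * 8}b")[:n_bits]

print(random_bits(32))          # unpredictable across runs
random.seed(0)
print(random.getrandbits(32))   # identical on every run: fine for simulation,
                                # unsuitable for keys or nonces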
Author:
Yadav, Prateek, Raffel, Colin, Muqeeth, Mohammed, Caccia, Lucas, Liu, Haokun, Chen, Tianlong, Bansal, Mohit, Choshen, Leshem, Sordoni, Alessandro
The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task. Model MoErging methods aim to recycle expert models to create an aggregate system with impro…
External link:
http://arxiv.org/abs/2408.07057
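To make the idea of recycling experts concrete, here is a deliberately naive baseline: uniform parameter averaging of same-architecture experts. The MoErging methods this survey covers instead route inputs among experts, so the sketch only conveys the setup, not the surveyed techniques; the checkpoint names are hypothetical.

import torch

def average_experts(expert_state_dicts):
    # Average each parameter tensor across same-architecture experts.
    return {
        name: torch.stack(
            [sd[name].float() for sd in expert_state_dicts]
        ).mean(dim=0)
        for name in expert_state_dicts[0]
    }

# Hypothetical usage: merge two expert checkpoints into the base model.
# merged = average_experts([torch.load("math.pt"), torch.load("code.pt")])
# base_model.load_state_dict(merged)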
Author:
Boisvert, Léo, Thakkar, Megh, Gasse, Maxime, Caccia, Massimo, De Chezelles, Thibault Le Sellier, Cappart, Quentin, Chapados, Nicolas, Lacoste, Alexandre, Drouin, Alexandre
The ability of large language models (LLMs) to mimic human-like intelligence has led to a surge in LLM-based autonomous agents. Though recent LLMs seem capable of planning and reasoning given user instructions, their effectiveness in applying these c…
External link:
http://arxiv.org/abs/2407.05291
Author:
Ostapenko, Oleksiy, Su, Zhan, Ponti, Edoardo Maria, Charlin, Laurent, Roux, Nicolas Le, Pereira, Matheus, Caccia, Lucas, Sordoni, Alessandro
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given mult…
External link:
http://arxiv.org/abs/2405.11157
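A simplified sketch of the adapter-library setup: choose the trained adapter whose stored task embedding is closest to an embedding of the new task. The paper studies more refined routing and composition; the nearest-neighbour lookup, the embedding dimension, and the adapter names below are all illustrative assumptions.

import torch
import torch.nn.functional as F

def route_to_adapter(task_embedding, library):
    # library: adapter name -> stored task embedding (same dimension).
    names = list(library)
    sims = torch.stack(
        [F.cosine_similarity(task_embedding, library[n], dim=0) for n in names]
    )
    return names[int(sims.argmax())]

# Hypothetical usage: embeddings might be averaged hidden states of
# each task's training examples.
library = {"sentiment": torch.randn(768), "ner": torch.randn(768)}
print(route_to_adapter(torch.randn(768), library))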
Author:
Drouin, Alexandre, Gasse, Maxime, Caccia, Massimo, Laradji, Issam H., Del Verme, Manuel, Marty, Tom, Boisvert, Léo, Thakkar, Megh, Cappart, Quentin, Vazquez, David, Chapados, Nicolas, Lacoste, Alexandre
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuring the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterp…
External link:
http://arxiv.org/abs/2403.07718
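The evaluation setting can be pictured as an observe-act loop between an LLM policy and a browser environment. Every name in this sketch (env, llm_policy, the action strings) is a hypothetical stand-in, not the benchmark's actual API.

def run_episode(env, llm_policy, max_steps=20):
    # obs: a text rendering of the page (e.g. an accessibility tree)
    # plus the task goal; action: a command string the env can parse.
    obs = env.reset()
    for _ in range(max_steps):
        action = llm_policy(obs)      # e.g. 'click(id=42)' or 'type(id=7, "x")'
        obs, done = env.step(action)
        if done:
            return True               # task completed within the step budget
    return False                      # ran out of steps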
Author:
Wang, Xinyi, Caccia, Lucas, Ostapenko, Oleksiy, Yuan, Xingdi, Wang, William Yang, Sordoni, Alessandro
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely heavily o…
External link:
http://arxiv.org/abs/2310.05707
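A minimal sketch of chain-of-thought prompting as commonly practiced: the prompt includes a worked exemplar with intermediate reasoning before the answer. The exemplar is invented for illustration, not drawn from the paper.

def cot_prompt(question: str) -> str:
    # One worked exemplar showing intermediate reasoning, then the query.
    exemplar = (
        "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
        "A: 12 pens is 4 groups of 3. Each group costs $2, so 4 * 2 = $8. "
        "The answer is $8.\n"
    )
    return exemplar + f"Q: {question}\nA: Let's think step by step."

print(cot_prompt("A train travels 60 km in 45 minutes. What is its speed in km/h?"))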
Federated Learning (FL) is an emerging paradigm that allows a model to be trained across a number of participants without sharing data. Recent works have begun to consider the effects of using pre-trained models as an initialization point for existin…
External link:
http://arxiv.org/abs/2306.03937
Author:
Caccia, Massimo, Galashov, Alexandre, Douillard, Arthur, Rannen-Triki, Amal, Rao, Dushyant, Paganini, Michela, Charlin, Laurent, Ranzato, Marc'Aurelio, Pascanu, Razvan
The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models, which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to…
External link:
http://arxiv.org/abs/2304.13164
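One widely used way to reduce the computational and memory cost of adapting a large pretrained model is low-rank adaptation (LoRA), sketched generically below; this is a common technique in the area, not claimed to be this paper's specific method. The frozen base weight W gets a learned low-rank update B @ A, so only a small fraction of parameters is trained.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # freeze pretrained W
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # delta starts at 0

    def forward(self, x):
        return self.base(x) + x @ (self.B @ self.A).T

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 12,288 trainable vs. ~590k frozen parameters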
In Federated Learning, a global model is learned by aggregating model updates computed at a set of independent client nodes; to reduce communication costs, multiple gradient steps are performed at each node prior to aggregation. A key challenge in thi…
External link:
http://arxiv.org/abs/2304.05260
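The procedure this abstract describes, several local gradient steps per client followed by server-side averaging, corresponds to FedAvg-style training. Below is a minimal sketch under placeholder models and data loaders; it illustrates the setup rather than this paper's contribution.

import copy
import torch
import torch.nn.functional as F

def local_update(model, loader, steps, lr=0.01):
    # Several local SGD steps per round to save communication.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    data = iter(loader)
    for _ in range(steps):
        x, y = next(data)             # assumes the loader yields enough batches
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()

def fedavg_round(global_model, client_loaders, local_steps=5):
    states = [
        local_update(copy.deepcopy(global_model), loader, local_steps)
        for loader in client_loaders  # each client starts from the global model
    ]
    avg = {k: torch.stack([s[k].float() for s in states]).mean(dim=0)
           for k in states[0]}
    global_model.load_state_dict(avg)  # server-side aggregation
    return global_model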