Showing 1 - 10 of 34 for search: '"Caccia, Lucas"'
Author:
Yadav, Prateek, Raffel, Colin, Muqeeth, Mohammed, Caccia, Lucas, Liu, Haokun, Chen, Tianlong, Bansal, Mohit, Choshen, Leshem, Sordoni, Alessandro
The availability of performant pre-trained models has led to a proliferation of fine-tuned expert models that are specialized to a particular domain or task. Model MoErging methods aim to recycle expert models to create an aggregate system with improved …
External link:
http://arxiv.org/abs/2408.07057
Author:
Ostapenko, Oleksiy, Su, Zhan, Ponti, Edoardo Maria, Charlin, Laurent, Roux, Nicolas Le, Pereira, Matheus, Caccia, Lucas, Sordoni, Alessandro
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given …
External link:
http://arxiv.org/abs/2405.11157
Author:
Wang, Xinyi, Caccia, Lucas, Ostapenko, Oleksiy, Yuan, Xingdi, Wang, William Yang, Sordoni, Alessandro
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought (CoT) reasoning. However, most of the existing approaches to enhance this ability rely heavily on …
External link:
http://arxiv.org/abs/2310.05707
Federated Learning (FL) is an emerging paradigm that allows a model to be trained across a number of participants without sharing data. Recent works have begun to consider the effects of using pre-trained models as an initialization point for existing …
External link:
http://arxiv.org/abs/2306.03937
In Federated Learning, a global model is learned by aggregating model updates computed at a set of independent client nodes. To reduce communication costs, multiple gradient steps are performed at each node prior to aggregation. A key challenge in this … (see the sketch after this entry)
External link:
http://arxiv.org/abs/2304.05260
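The local-update-then-aggregate scheme described in the abstract above can be sketched as follows. This is a minimal, generic FedAvg-style illustration under assumed names (local_update, aggregate) and hyperparameters, not the paper's actual algorithm:

    # Each client performs several local gradient steps on its own data;
    # the server then averages the resulting parameters into the global model.
    import copy, itertools
    import torch

    def local_update(global_model, loader, steps=5, lr=0.01):
        model = copy.deepcopy(global_model)
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        batches = itertools.cycle(loader)
        for _ in range(steps):  # multiple local steps before aggregation
            x, y = next(batches)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
        return model.state_dict()

    def aggregate(global_model, client_states):
        # Unweighted parameter average across clients (assumes float parameters only).
        avg = {k: torch.stack([s[k] for s in client_states]).mean(dim=0)
               for k in client_states[0]}
        global_model.load_state_dict(avg)
        return global_model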
Author:
Gaya, Jean-Baptiste, Doan, Thang, Caccia, Lucas, Soulier, Laure, Denoyer, Ludovic, Raileanu, Roberta
The ability to continuously acquire new knowledge and skills is crucial for autonomous agents. Existing methods are typically based on either fixed-size models that struggle to learn a large number of diverse behaviors, or growing-size models that …
External link:
http://arxiv.org/abs/2211.10445
Author:
Caccia, Lucas, Ponti, Edoardo, Su, Zhan, Pereira, Matheus, Roux, Nicolas Le, Sordoni, Alessandro
Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists of pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ($\texttt{Poly}$) jointly learns an inventory … (see the sketch after this entry)
External link:
http://arxiv.org/abs/2211.03831
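As I read the abstract above, the core idea is an inventory of adapters shared across tasks together with a learned routing over that inventory. The sketch below shows one plausible realization with a bank of low-rank adapters and a softmax routing vector per task; the exact routing parameterization used by Poly is an assumption here, not taken from the paper:

    import torch
    import torch.nn as nn

    class AdapterInventoryLinear(nn.Module):
        # A frozen base linear layer plus a bank of low-rank adapters that a
        # per-task routing vector combines into a single task-specific adapter.
        def __init__(self, base: nn.Linear, n_skills=8, n_tasks=16, rank=4):
            super().__init__()
            self.base = base.requires_grad_(False)
            d_in, d_out = base.in_features, base.out_features
            self.A = nn.Parameter(torch.randn(n_skills, d_in, rank) * 0.02)
            self.B = nn.Parameter(torch.zeros(n_skills, rank, d_out))
            self.routing = nn.Parameter(torch.zeros(n_tasks, n_skills))  # routing logits

        def forward(self, x, task_id):
            w = torch.softmax(self.routing[task_id], dim=-1)   # weights over the inventory
            A = torch.einsum("s,sir->ir", w, self.A)           # mix adapter factors
            B = torch.einsum("s,sro->ro", w, self.B)
            return self.base(x) + x @ A @ B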
Author:
Caccia, Lucas, Aljundi, Rahaf, Asadi, Nader, Tuytelaars, Tinne, Pineau, Joelle, Belilovsky, Eugene
In the online continual learning paradigm, agents must learn from a changing distribution while respecting memory and compute constraints. Experience Replay (ER), where a small subset of past data is stored and replayed alongside new data, has emerged … (see the sketch after this entry)
External link:
http://arxiv.org/abs/2203.03798
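The replay mechanism summarized above can be written down in a few lines. The sketch below uses reservoir sampling for the small memory and adds a replay term to each update; buffer size, function names, and hyperparameters are illustrative, not the paper's configuration:

    import random
    import torch
    import torch.nn.functional as F

    class ReplayBuffer:
        # Keeps a small, uniformly sampled subset of past examples (reservoir sampling).
        def __init__(self, capacity=1000):
            self.capacity, self.seen, self.data = capacity, 0, []

        def add(self, x, y):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((x, y))
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (x, y)

        def sample(self, k):
            batch = random.sample(self.data, min(k, len(self.data)))
            xs, ys = zip(*batch)
            return torch.stack(xs), torch.stack(ys)

    def er_step(model, opt, x_new, y_new, buffer, replay_k=32):
        # Train on the incoming batch and, if available, a replayed batch of past data.
        loss = F.cross_entropy(model(x_new), y_new)
        if buffer.data:
            x_old, y_old = buffer.sample(replay_k)
            loss = loss + F.cross_entropy(model(x_old), y_old)
        opt.zero_grad(); loss.backward(); opt.step()
        for x, y in zip(x_new, y_new):
            buffer.add(x, y)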
In many practical applications of machine learning, data arrives sequentially over time in large chunks. Practitioners then have to decide how to allocate their computational budget in order to obtain the best performance at any point in time. Online …
External link:
http://arxiv.org/abs/2106.09563
Author:
Caccia, Lucas, Pineau, Joelle
This paper presents SPeCiaL: a method for unsupervised pretraining of representations tailored for continual learning. Our approach devises a meta-learning objective that differentiates through a sequential learning process. Specifically, we train a … (see the sketch after this entry)
External link:
http://arxiv.org/abs/2106.09065
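"Differentiating through a sequential learning process" can be illustrated with a generic unrolled inner loop: a fast classifier head is updated over a sequence of batches with the computation graph kept, so the meta-loss gradient flows back into the representation. This is only a toy MAML-style sketch of that general idea, not the SPeCiaL objective itself; all names and shapes are assumptions:

    import torch
    import torch.nn.functional as F

    def unrolled_meta_loss(encoder, head, seq_batches, meta_batch, inner_lr=0.1):
        # `head` is a (feat_dim, n_classes) tensor of fast weights; `encoder` is any
        # feature extractor. Inner SGD steps are taken sequentially over `seq_batches`
        # with create_graph=True so the outer loss can backpropagate through them.
        w = head.clone()
        for x, y in seq_batches:
            loss = F.cross_entropy(encoder(x) @ w, y)
            (grad,) = torch.autograd.grad(loss, w, create_graph=True)
            w = w - inner_lr * grad                 # differentiable inner update
        x_m, y_m = meta_batch
        # Meta-loss evaluated after the sequential adaptation; its gradient reaches
        # the encoder parameters through the unrolled updates.
        return F.cross_entropy(encoder(x_m) @ w, y_m)

Calling unrolled_meta_loss(...).backward() then updates the encoder (and the head initialization) so that the resulting representations are easy to learn from sequentially.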