Showing 1 - 10 of 31 results for search: '"Poli, Iacopo"'
Optical training of large-scale Transformers and deep neural networks with direct feedback alignment
Author:
Wang, Ziao, Müller, Kilian, Filipovich, Matthew, Launay, Julien, Ohana, Ruben, Pariente, Gustave, Mokaadi, Safa, Brossollet, Charles, Moreau, Fabien, Cappelli, Alessandro, Poli, Iacopo, Carron, Igor, Daudet, Laurent, Krzakala, Florent, Gigan, Sylvain
Modern machine learning relies nearly exclusively on dedicated electronic hardware accelerators. Photonic approaches, with low consumption and high operation speed, are increasingly considered for inference but, to date, remain mostly limited to rela…
External link:
http://arxiv.org/abs/2409.12965
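The core idea of Direct Feedback Alignment (DFA), which this paper implements optically, can be sketched in a few lines of NumPy: instead of backpropagating the error through the transposed forward weights, each hidden layer receives the output error through a fixed random feedback matrix. The network sizes, learning rate, and single-sample setup below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-layer network trained with Direct Feedback Alignment (DFA):
# the output error is delivered to the hidden layer through a fixed
# random matrix B1 instead of the transposed forward weights W2.T.
n_in, n_hid, n_out = 8, 16, 4
W1 = rng.normal(0, 0.1, (n_hid, n_in))
W2 = rng.normal(0, 0.1, (n_out, n_hid))
B1 = rng.normal(0, 0.1, (n_hid, n_out))  # fixed random feedback matrix

x = rng.normal(size=(n_in,))
y = np.zeros(n_out)
y[1] = 1.0  # one-hot target for this toy example

lr = 0.1
for _ in range(200):
    h = np.tanh(W1 @ x)          # hidden activations
    out = W2 @ h                 # linear output layer
    e = out - y                  # output error
    dh = (B1 @ e) * (1 - h**2)   # DFA: random projection of the error
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(dh, x)

loss = float(np.sum((W2 @ np.tanh(W1 @ x) - y) ** 2))
print(loss)
```

Because the feedback matrix is fixed and random, the backward pass reduces to a random projection of the error, which is exactly the operation a photonic co-processor can perform at scale.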
In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database. Such generative models hold the…
External link:
http://arxiv.org/abs/2205.05789
Author:
Launay, Julien, Tommasone, Elena, Pannier, Baptiste, Boniface, François, Chatelain, Amélie, Cappelli, Alessandro, Poli, Iacopo, Seddah, Djamé
Access to large pre-trained models of varied architectures, in many different languages, is central to the democratization of NLP. We introduce PAGnol, a collection of French GPT models. Using scaling laws, we efficiently train PAGnol-XL (1.5B parameters)…
External link:
http://arxiv.org/abs/2110.08554
Recent work has identified simple empirical scaling laws for language models, linking compute budget, dataset size, model size, and autoregressive modeling loss. The validity of these simple power laws across orders of magnitude in model scale provides…
External link:
http://arxiv.org/abs/2109.11928
Author:
Brossollet, Charles, Cappelli, Alessandro, Carron, Igor, Chaintoutis, Charidimos, Chatelain, Amélie, Daudet, Laurent, Gigan, Sylvain, Hesslow, Daniel, Krzakala, Florent, Launay, Julien, Mokaadi, Safa, Moreau, Fabien, Müller, Kilian, Ohana, Ruben, Pariente, Gustave, Poli, Iacopo, Tommasone, Elena
We introduce LightOn's Optical Processing Unit (OPU), the first photonic AI accelerator chip available on the market for at-scale Non von Neumann computations, reaching 1500 TeraOPS. It relies on a combination of free-space optics with off-the-shelf…
External link:
http://arxiv.org/abs/2107.11814
Robustness to adversarial attacks is typically obtained through expensive adversarial training with Projected Gradient Descent. Here we introduce ROPUST, a remarkably simple and efficient method to leverage robust pre-trained models and further increase…
External link:
http://arxiv.org/abs/2108.04217
Author:
Ohana, Ruben, Ruiz, Hamlet J. Medina, Launay, Julien, Cappelli, Alessandro, Poli, Iacopo, Ralaivola, Liva, Rakotomamonjy, Alain
Published in:
NeurIPS 2021
Optical Processing Units (OPUs) -- low-power photonic chips dedicated to large scale random projections -- have been used in previous work to train deep neural networks using Direct Feedback Alignment (DFA), an effective alternative to backpropagation…
External link:
http://arxiv.org/abs/2106.03645
Author:
Hesslow, Daniel, Cappelli, Alessandro, Carron, Igor, Daudet, Laurent, Lafargue, Raphaël, Müller, Kilian, Ohana, Ruben, Pariente, Gustave, Poli, Iacopo
Randomized Numerical Linear Algebra (RandNLA) is a powerful class of methods, widely used in High Performance Computing (HPC). RandNLA provides approximate solutions to linear algebra functions applied to large signals, at reduced computational costs…
External link:
http://arxiv.org/abs/2104.14429
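The building block of RandNLA methods like those in this paper is the random sketch: compress a large matrix with a random projection, then run the linear algebra on the much smaller sketch. A minimal NumPy illustration (the dimensions and sketch size are arbitrary, and this runs electronically rather than on an OPU):

```python
import numpy as np

rng = np.random.default_rng(42)

# RandNLA-style sketching: compress wide data with a random Gaussian
# matrix S. Inner products and norms are approximately preserved
# (Johnson-Lindenstrauss), so the Gram matrix of the sketch is a
# cheap approximation of the exact one.
n, d, k = 3, 10_000, 400                   # 3 signals of dimension 10k, sketch size 400
A = rng.normal(size=(n, d))
S = rng.normal(size=(d, k)) / np.sqrt(k)   # scaling makes E[||A @ S||^2] = ||A||^2

G_exact = A @ A.T                 # exact Gram matrix, cost O(n^2 * d)
A_sketch = A @ S                  # one-time sketching cost O(n * d * k)
G_sketch = A_sketch @ A_sketch.T  # approximate Gram matrix, cost O(n^2 * k)

rel_err = np.abs(G_sketch - G_exact).max() / np.abs(G_exact).max()
print(rel_err)
```

The approximation error shrinks as roughly 1/sqrt(k), so the sketch size trades accuracy against compute; an optical co-processor makes the `A @ S` step nearly free at very large `d`.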
Author:
Hesslow, Daniel, Poli, Iacopo
The performance of algorithms for neural architecture search strongly depends on the parametrization of the search space. We use contrastive learning to identify networks across different initializations based on their data Jacobians, and automatically…
External link:
http://arxiv.org/abs/2102.04208
Author:
Cappelli, Alessandro, Ohana, Ruben, Launay, Julien, Meunier, Laurent, Poli, Iacopo, Krzakala, Florent
Published in:
ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing
We propose a new defense mechanism against adversarial attacks inspired by an optical co-processor, providing robustness without compromising natural accuracy in both white-box and black-box settings. This hardware co-processor performs a nonlinear f…
External link:
http://arxiv.org/abs/2101.02115