Showing 1 - 10 of 129 for search: '"Abbé, Emmanuel"'
In 1948, Shannon used a probabilistic argument to show the existence of codes achieving a maximal rate defined by the channel capacity. In 1954, Muller and Reed introduced a simple deterministic code construction, based on polynomial evaluations…
External link:
http://arxiv.org/abs/2411.13493
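For context on the construction the abstract mentions: a Reed-Muller codeword is the evaluation of a Boolean polynomial of degree at most r at every point of {0,1}^m. A minimal Python sketch of that evaluation map (the helper name rm_generator_matrix is ours, not from the paper):

from itertools import combinations, product
import numpy as np

def rm_generator_matrix(r, m):
    # Rows = evaluations of all monomials of degree <= r in m Boolean
    # variables at every point of {0,1}^m (the Muller/Reed construction).
    points = list(product([0, 1], repeat=m))
    rows = []
    for deg in range(r + 1):
        for S in combinations(range(m), deg):
            rows.append([int(all(p[i] for i in S)) for p in points])
    return np.array(rows, dtype=np.uint8)

G = rm_generator_matrix(1, 3)
print(G.shape)                      # (4, 8): dimension 1 + 3, length 2^3
msg = np.array([1, 0, 1, 1], dtype=np.uint8)
print(msg @ G % 2)                  # encode a 4-bit message over F_2

Here RM(1, 3) has block length 2^3 = 8 and dimension 1 + 3 = 4; whether such deterministic codes achieve capacity is the question the line of work above addresses.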
Learning with identical train and test distributions has been extensively investigated both practically and theoretically. Much remains to be understood, however, in statistical learning under distribution shifts. This paper focuses on a distribution…
External link:
http://arxiv.org/abs/2410.23461
Modern vision models have achieved remarkable success in benchmarks where local features provide critical information about the target. There is now a growing interest in solving tasks that require more global reasoning, where local features offer no…
External link:
http://arxiv.org/abs/2410.08165
Can Transformers predict new syllogisms by composing established ones? More generally, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity, but this does…
External link:
http://arxiv.org/abs/2406.06467
We investigate the out-of-domain generalization of random feature (RF) models and Transformers. We first prove that in the 'generalization on the unseen (GOTU)' setting, where training data is fully seen in some part of the domain but testing is made…
External link:
http://arxiv.org/abs/2406.06354
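A toy rendering of the GOTU setting described in the abstract (our construction, not the paper's experiments): fit a random feature model on the half of the Boolean cube where x3 = +1, then evaluate it on the fully unseen half where x3 = -1.

from itertools import product
import numpy as np

rng = np.random.default_rng(0)

def target(X):
    return X[:, 0] * X[:, 1]            # does not depend on x3

cube = np.array(list(product([-1.0, 1.0], repeat=3)))
seen = cube[cube[:, 2] == 1.0]          # training: only the x3 = +1 half
unseen = cube[cube[:, 2] == -1.0]       # testing: the fully unseen half

W = rng.standard_normal((64, 3))        # random ReLU features
phi = lambda X: np.maximum(X @ W.T, 0.0)
theta, *_ = np.linalg.lstsq(phi(seen), target(seen), rcond=None)

print("seen MSE:  ", np.mean((phi(seen) @ theta - target(seen)) ** 2))
print("unseen MSE:", np.mean((phi(unseen) @ theta - target(unseen)) ** 2))

With more features than seen points, the seen half is interpolated exactly; the interesting quantity is the error on the unseen half.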
Author:
Abbe, Emmanuel, Sandon, Colin
This paper shows that a class of codes such as Reed-Muller (RM) codes has vanishing bit-error probability below capacity on symmetric channels. The proof relies on the notion of 'camellia codes': a class of symmetric codes decomposable into 'camellia…
External link:
http://arxiv.org/abs/2312.04329
Author:
Boix-Adsera, Enric, Saremi, Omid, Abbe, Emmanuel, Bengio, Samy, Littwin, Etai, Susskind, Joshua
We investigate the capabilities of transformer models on relational reasoning tasks. In these tasks, models are trained on a set of strings encoding abstract relations, and are then tested out-of-distribution on data that contains symbols that did not…
External link:
http://arxiv.org/abs/2310.09753
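A sketch of the train/test protocol the abstract describes, under our own toy task (a "same symbol?" relation): the relation is unchanged at test time, but the symbols themselves never appeared in training.

import random

random.seed(0)

train_symbols = list("abcdefgh")     # symbols seen during training
test_symbols = list("uvwxyz")        # disjoint symbols, never seen in training

def sample(symbols):
    x, y = random.choice(symbols), random.choice(symbols)
    return (x + y, int(x == y))      # string and its "same symbol?" label

train_set = [sample(train_symbols) for _ in range(1000)]
test_set = [sample(test_symbols) for _ in range(200)]   # out-of-distribution
print(train_set[:3], test_set[:3])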
In this work, we introduce Boolformer, the first Transformer architecture trained to perform end-to-end symbolic regression of Boolean functions. First, we show that it can predict compact formulas for complex functions which were not seen during training…
External link:
http://arxiv.org/abs/2309.12207
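To make the task concrete: symbolic regression of a Boolean function maps input/output examples to a compact formula. A brute-force stand-in for the Transformer (the tiny grammar and names here are ours):

from itertools import product

VARS = ("a", "b")

def formulas(depth):
    # All formulas over VARS built from not/and/or up to a nesting depth.
    if depth == 0:
        return list(VARS)
    smaller = formulas(depth - 1)
    out = list(smaller)
    out += [f"(not {f})" for f in smaller]
    out += [f"({f} {op} {g})"
            for f, g in product(smaller, repeat=2) for op in ("and", "or")]
    return out

def truth_table(expr):
    return tuple(bool(eval(expr, {"a": a, "b": b}))
                 for a, b in product([False, True], repeat=2))

target = (True, True, False, True)   # truth table of "a implies b"
best = min((f for f in formulas(2) if truth_table(f) == target), key=len)
print(best)                          # e.g. ((not a) or b)

Boolformer replaces this exhaustive search with a learned sequence-to-sequence model; the toy only illustrates the input/output specification of the task.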
Experimental results have shown that curriculum learning, i.e., presenting simpler examples before more complex ones, can improve the efficiency of learning. Some recent theoretical results also showed that changing the sampling distribution can help…
External link:
http://arxiv.org/abs/2306.16921
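A minimal illustration of "simpler examples first", taking margin as the difficulty measure (our choice, not necessarily the paper's): a single perceptron pass over linearly separable data, easy-first versus random order.

import numpy as np

rng = np.random.default_rng(0)

w_true = np.array([1.0, -1.0])
X = rng.standard_normal((200, 2))
y = np.sign(X @ w_true)
order = np.argsort(-np.abs(X @ w_true))   # largest margin (easiest) first

def perceptron_pass(X, y, lr=0.1):
    # One pass of perceptron updates in the given presentation order.
    w, mistakes = np.zeros(2), 0
    for x, t in zip(X, y):
        if t * (x @ w) <= 0:              # misclassified: update weights
            w += lr * t * x
            mistakes += 1
    return mistakes

print("updates, easy-first:", perceptron_pass(X[order], y[order]))
print("updates, random:    ", perceptron_pass(X, y))

In this toy the easy-first ordering typically triggers fewer updates; it illustrates the setup only, not the theoretical results.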
We identify incremental learning dynamics in transformers, where the difference between trained and initial weights progressively increases in rank. We rigorously prove this occurs under the simplifying assumptions of diagonal weight matrices and small…
External link:
http://arxiv.org/abs/2306.07042
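The rank statistic from the abstract is easy to measure in a toy model. A sketch, assuming a two-layer linear network trained on a low-rank target (not a transformer; with small initialization the rank of the weight change tends to grow one step at a time):

import numpy as np

rng = np.random.default_rng(0)

# Rank-3 target with separated singular values 10, 3, 1.
d = 20
Ut, _ = np.linalg.qr(rng.standard_normal((d, d)))
Vt, _ = np.linalg.qr(rng.standard_normal((d, d)))
W_star = Ut[:, :3] @ np.diag([10.0, 3.0, 1.0]) @ Vt[:, :3].T

# Two-layer linear net W = A @ B, started from a small random initialization.
A0 = 1e-3 * rng.standard_normal((d, d))
B0 = 1e-3 * rng.standard_normal((d, d))
A, B = A0.copy(), B0.copy()

def num_rank(M, tol=0.05):
    # Number of singular values above tol times the largest one.
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > tol * s[0]).sum())

lr = 0.01
for step in range(1, 751):
    E = A @ B - W_star                   # residual of 0.5 * ||AB - W_star||_F^2
    A, B = A - lr * E @ B.T, B - lr * A.T @ E   # gradient step on both factors
    if step % 150 == 0:
        print(step, num_rank(A @ B - A0 @ B0))

Printing the numerical rank of A @ B - A0 @ B0 over training typically shows a 1, 2, 3 staircase as the target's singular values are picked up in order of magnitude.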