Zobrazeno 1 - 10
of 20
pro vyhledávání: '"Troiani, Emanuele"'
Current progress in artificial intelligence is centered around so-called large language models that consist of neural networks processing long sequences of high-dimensional vectors called tokens. Statistical physics provides powerful tools to study t
Externí odkaz:
http://arxiv.org/abs/2410.18858
Autor:
Borges, Beatriz, Foroutan, Negar, Bayazit, Deniz, Sotnikova, Anna, Montariol, Syrielle, Nazaretzky, Tanya, Banaei, Mohammadreza, Sakhaeirad, Alireza, Servant, Philippe, Neshaei, Seyed Parsa, Frej, Jibril, Romanou, Angelika, Weiss, Gail, Mamooler, Sepideh, Chen, Zeming, Fan, Simin, Gao, Silin, Ismayilzada, Mete, Paul, Debjit, Schöpfer, Alexandre, Janchevski, Andrej, Tiede, Anja, Linden, Clarence, Troiani, Emanuele, Salvi, Francesco, Behrens, Freya, Orsi, Giacomo, Piccioli, Giovanni, Sevel, Hadrien, Coulon, Louis, Pineros-Rodriguez, Manuela, Bonnassies, Marin, Hellich, Pierre, van Gerwen, Puck, Gambhir, Sankalp, Pirelli, Solal, Blanchard, Thomas, Callens, Timothée, Aoun, Toni Abi, Alonso, Yannick Calvino, Cho, Yuri, Chiappa, Alberto, Sclocchi, Antonio, Bruno, Étienne, Hofhammer, Florian, Pescia, Gabriel, Rizk, Geovani, Dadi, Leello, Stoffl, Lucas, Ribeiro, Manoel Horta, Bovel, Matthieu, Pan, Yueyang, Radenovic, Aleksandra, Alahi, Alexandre, Mathis, Alexander, Bitbol, Anne-Florence, Faltings, Boi, Hébert, Cécile, Tuia, Devis, Maréchal, François, Candea, George, Carleo, Giuseppe, Chappelier, Jean-Cédric, Flammarion, Nicolas, Fürbringer, Jean-Marie, Pellet, Jean-Philippe, Aberer, Karl, Zdeborová, Lenka, Salathé, Marcel, Jaggi, Martin, Rajman, Martin, Payer, Mathias, Wyart, Matthieu, Gastpar, Michael, Ceriotti, Michele, Svensson, Ola, Lévêque, Olivier, Ienne, Paolo, Guerraoui, Rachid, West, Robert, Kashyap, Sanidhya, Piazza, Valerio, Simanis, Viesturs, Kuncak, Viktor, Cevher, Volkan, Schwaller, Philippe, Friedli, Sacha, Jermann, Patrick, Kaser, Tanja, Bosselut, Antoine
AI assistants are being increasingly used by students enrolled in higher education institutions. While these tools provide opportunities for improved teaching and education, they also pose significant challenges for assessment and learning outcomes.
Externí odkaz:
http://arxiv.org/abs/2408.11841
We consider the problem of learning a target function corresponding to a single hidden layer neural network, with a quadratic activation function after the first layer, and random weights. We consider the asymptotic limit where the input dimension an
Externí odkaz:
http://arxiv.org/abs/2408.03733
Autor:
Troiani, Emanuele, Dandi, Yatin, Defilippis, Leonardo, Zdeborová, Lenka, Loureiro, Bruno, Krzakala, Florent
Multi-index models - functions which only depend on the covariates through a non-linear transformation of their projection on a subspace - are a useful benchmark for investigating feature learning with neural nets. This paper examines the theoretical
Externí odkaz:
http://arxiv.org/abs/2405.15480
Autor:
Dandi, Yatin, Troiani, Emanuele, Arnaboldi, Luca, Pesce, Luca, Zdeborová, Lenka, Krzakala, Florent
Publikováno v:
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:9991-10016, 2024
We investigate the training dynamics of two-layer neural networks when learning multi-index target functions. We focus on multi-pass gradient descent (GD) that reuses the batches multiple times and show that it significantly changes the conclusion ab
Externí odkaz:
http://arxiv.org/abs/2402.03220
In recent years statistical physics has proven to be a valuable tool to probe into large dimensional inference problems such as the ones occurring in machine learning. Statistical physics provides analytical tools to study fundamental limitations in
Externí odkaz:
http://arxiv.org/abs/2306.16097
Publikováno v:
Journal of Physics A: Mathematical and Theoretical 57.12 (2024): 125002
In this paper, we study sampling from a posterior derived from a neural network. We propose a new probabilistic model consisting of adding noise at every pre- and post-activation in the network, arguing that the resulting posterior can be sampled usi
Externí odkaz:
http://arxiv.org/abs/2306.02729
Publikováno v:
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:811-819, 2024
We study robust linear regression in high-dimension, when both the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha=n/d$, and study a data model that includes outliers. We provide exact asymptotics for the performanc
Externí odkaz:
http://arxiv.org/abs/2305.18974
Autor:
Gerbelot, Cedric, Troiani, Emanuele, Mignacco, Francesca, Krzakala, Florent, Zdeborova, Lenka
Publikováno v:
SIAM Journal on Mathematics of Data Science 6.2 (2024): 400-427
We prove closed-form equations for the exact high-dimensional asymptotics of a family of first order gradient-based methods, learning an estimator (e.g. M-estimator, shallow neural network, ...) from observations on Gaussian data with empirical risk
Externí odkaz:
http://arxiv.org/abs/2210.06591
Publikováno v:
Proceedings of Mathematical and Scientific Machine Learning (MSML), PMLR 190:97-112, 2022
In this manuscript we consider denoising of large rectangular matrices: given a noisy observation of a signal matrix, what is the best way of recovering the signal matrix itself? For Gaussian noise and rotationally-invariant signal priors, we complet
Externí odkaz:
http://arxiv.org/abs/2203.07752