Showing 1 - 10 of 28 for the search: '"Cui, Hugo"'
A key property of neural networks is their capacity to adapt to data during training. Yet, our current mathematical understanding of feature learning and its relationship to generalization remains limited. In this work, we provide a random matrix a…
External link:
http://arxiv.org/abs/2410.18938
Author:
Cui, Hugo
Recent years have been marked by the fast-paced diversification and increasing ubiquity of machine learning applications. Yet, a firm theoretical understanding of the surprising efficiency with which neural networks learn from high-dimensional data still…
External link:
http://arxiv.org/abs/2409.13904
For a large class of feature maps, we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training sample…
External link:
http://arxiv.org/abs/2402.13999
Author:
Cui, Hugo, Pesce, Luca, Dandi, Yatin, Krzakala, Florent, Lu, Yue M., Zdeborová, Lenka, Loureiro, Bruno
Published in:
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:9662-9695, 2024
In this manuscript, we investigate how two-layer neural networks learn features from data, and improve over the kernel regime, after being trained with a single gradient descent step. Leveraging the insight from (Ba et al., 2022), we m…
External link:
http://arxiv.org/abs/2402.04980
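As a rough illustration of the setting in the abstract above (not the authors' code), one gradient-descent step on the first-layer weights of a two-layer network, followed by fitting the readout by ridge regression, might be sketched as follows; the dimensions, learning rate, ridge penalty, and target function are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n = 50, 100, 200        # input dim, hidden width, samples (illustrative)
eta, lam = 1.0, 1e-3          # learning rate and ridge penalty (assumptions)

X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.tanh(X @ rng.standard_normal(d))   # placeholder target function

W = rng.standard_normal((p, d))           # first-layer weights
a = rng.standard_normal(p) / np.sqrt(p)   # readout used during the gradient step

def act(z):
    return np.tanh(z)

# one gradient-descent step on W for the squared loss (readout a held fixed)
pre = X @ W.T                  # (n, p) pre-activations
err = act(pre) @ a - y         # residuals
grad_W = ((err[:, None] * (1 - act(pre) ** 2)) * a).T @ X / n
W -= eta * grad_W

# after the step, train the readout layer by ridge regression
Phi = act(X @ W.T)
a_hat = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)
train_mse = np.mean((Phi @ a_hat - y) ** 2)
```

Varying `eta` in such a sketch is one way to probe, numerically, when the single step moves the features away from the kernel (random-features) regime.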
Many empirical studies have provided evidence for the emergence of algorithmic mechanisms (abilities) during the training of language models, leading to qualitative improvements in model capabilities. Yet, a theoretical characterization of how such…
External link:
http://arxiv.org/abs/2402.03902
Published in:
ICLR 2024
We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture. We provide a sharp end-to-end analysis of the problem. First, we provide a tight closed-form…
External link:
http://arxiv.org/abs/2310.03575
Author:
Cui, Hugo, Zdeborová, Lenka
Published in:
Advances in Neural Information Processing Systems 36 (2023)
We address the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection. We consider the high-dimensional limit where the number of training samples and the input dimension joi…
External link:
http://arxiv.org/abs/2305.11041
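A minimal sketch of the architecture named in the abstract above, assuming a tied-weight map of the form f(x) = b·x + Wᵀσ(Wx)/√d; the specific parametrization, dimensions, mixture, and noise level are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, n = 100, 20, 500          # input dim, hidden width, samples (assumptions)
sigma_noise = 0.5               # illustrative noise level

# Gaussian-mixture data: two symmetric clusters along a random direction
mu = rng.standard_normal(d) / np.sqrt(d)
labels = rng.choice([-1.0, 1.0], size=n)
clean = labels[:, None] * mu
noisy = clean + sigma_noise * rng.standard_normal((n, d))

W = rng.standard_normal((p, d)) / np.sqrt(d)   # tied encoder/decoder weights
b = 0.5                                        # skip-connection strength

def denoise(x, W, b):
    # skip connection plus tied-weight two-layer non-linear path
    return b * x + np.tanh(x @ W.T) @ W / np.sqrt(d)

out = denoise(noisy, W, b)
mse = np.mean((out - clean) ** 2)
```

The tied weights mean the same matrix `W` appears in both the encoder (`x @ W.T`) and the decoder (`@ W`), and the skip term `b * x` passes the input through directly.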
This manuscript considers the problem of learning a random Gaussian network function using a fully connected network with frozen intermediate layers and a trainable readout layer. This problem can be seen as a natural generalization of the widely studi…
External link:
http://arxiv.org/abs/2302.00401
Published in:
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:6468-6521, 2023
We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We work in the asymptotic limit where the number of samples, the input dimension, and the network…
External link:
http://arxiv.org/abs/2302.00375
Published in:
Mach. Learn.: Sci. Technol. (2023) 4 035033
We consider the problem of kernel classification. While worst-case bounds on the decay rate of the prediction error with the number of samples are known for some classifiers, they often fail to accurately describe the learning curves of real data set…
External link:
http://arxiv.org/abs/2201.12655
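A generic sketch of how one might measure such a learning curve empirically (test error versus number of samples) for a kernel classifier; the RBF kernel, ridge regularization, and synthetic linear-teacher data below are all assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between rows of A and rows of B
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def error_rate(n_train, d=10, n_test=500, lam=1e-3):
    # kernel ridge regression on +-1 labels, sign readout for classification
    Xtr = rng.standard_normal((n_train, d))
    Xte = rng.standard_normal((n_test, d))
    w = rng.standard_normal(d)                 # linear teacher (assumption)
    ytr, yte = np.sign(Xtr @ w), np.sign(Xte @ w)
    K = rbf(Xtr, Xtr)
    alpha = np.linalg.solve(K + lam * np.eye(n_train), ytr)
    pred = np.sign(rbf(Xte, Xtr) @ alpha)
    return np.mean(pred != yte)

# empirical learning curve: test error as a function of the sample size
curve = {n: error_rate(n) for n in (25, 50, 100, 200)}
```

Fitting a power law to such a curve is the usual way of extracting the empirical decay rate that worst-case bounds often fail to match.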