Showing 1 - 10 of 41 for the search: '"Misiakiewicz, Theodor"'
The goal of this paper is to investigate the complexity of gradient algorithms when learning sparse functions (juntas). We introduce a type of Statistical Queries ($\mathsf{SQ}$), which we call Differentiable Learning Queries ($\mathsf{DLQ}$), to model …
External link:
http://arxiv.org/abs/2407.05622
In this work we investigate the generalization performance of random feature ridge regression (RFRR). Our main contribution is a general deterministic equivalent for the test error of RFRR. Specifically, under a certain concentration property, we show …
External link:
http://arxiv.org/abs/2405.15699
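The random feature ridge regression model described in the abstract above can be sketched in a few lines of numpy. Everything concrete below (the ReLU nonlinearity, the linear target function, the dimensions, the ridge level) is a hypothetical choice for illustration, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: n samples in dimension d, p random features,
# target f_*(x) = x_1 observed with small label noise.
n, d, p, lam = 300, 10, 500, 1e-3
X = rng.standard_normal((n, d))
y = X[:, 0] + 0.1 * rng.standard_normal(n)

# Random feature map: phi(x) = ReLU(W x / sqrt(d)) with fixed random W.
W = rng.standard_normal((p, d))
def features(X):
    return np.maximum(X @ W.T / np.sqrt(d), 0.0)

# Ridge regression in feature space: a = (Phi^T Phi + lam*n*I)^{-1} Phi^T y.
Phi = features(X)
a = np.linalg.solve(Phi.T @ Phi + lam * n * np.eye(p), Phi.T @ y)

# Empirical test error against the clean target.
X_test = rng.standard_normal((2000, d))
pred = features(X_test) @ a
test_err = np.mean((pred - X_test[:, 0]) ** 2)
```

Sweeping the number of features `p` (or the ratio `p/n`) while tracking `test_err` is the kind of experiment the paper's deterministic equivalents aim to predict.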
Author:
Misiakiewicz, Theodor, Saeed, Basil
We consider learning an unknown target function $f_*$ using kernel ridge regression (KRR) given i.i.d. data $(u_i,y_i)$, $i\leq n$, where $u_i \in U$ is a covariate vector and $y_i = f_* (u_i) +\varepsilon_i \in \mathbb{R}$. A recent string of work …
External link:
http://arxiv.org/abs/2403.08938
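The KRR setting in the abstract above admits a short closed-form sketch: fit $\alpha = (K + \lambda n I)^{-1} y$ on the training kernel matrix, then predict with cross-kernel evaluations. The target function, Gaussian kernel, and bandwidth below are hypothetical choices, not the paper's assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance of the abstract's setup:
# covariates u_i ~ N(0, I_d), labels y_i = f_*(u_i) + eps_i with f_* = sin of
# the first coordinate.
n, d, noise, lam = 200, 5, 0.1, 1e-3
U = rng.standard_normal((n, d))
y = np.sin(U[:, 0]) + noise * rng.standard_normal(n)

def rbf_kernel(A, B, gamma=0.1):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# KRR closed form: alpha = (K + lam*n*I)^{-1} y.
K = rbf_kernel(U, U)
alpha = np.linalg.solve(K + lam * n * np.eye(n), y)

# Predict on fresh covariates and measure error against the clean target.
U_test = rng.standard_normal((1000, d))
pred = rbf_kernel(U_test, U) @ alpha
test_err = np.mean((pred - np.sin(U_test[:, 0])) ** 2)
```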
Recent advances in machine learning have been achieved by using overparametrized models trained until near interpolation of the training data. It was shown, e.g., through the double descent phenomenon, that the number of parameters is a poor proxy for …
External link:
http://arxiv.org/abs/2403.08160
In these six lectures, we examine what can be learnt about the behavior of multi-layer neural networks from the analysis of linear models. We first recall the correspondence between neural networks and linear models via the so-called lazy regime. …
External link:
http://arxiv.org/abs/2308.13431
We investigate the time complexity of SGD learning on fully-connected neural networks with isotropic data. We put forward a complexity measure -- the leap -- which measures how "hierarchical" target functions are. For $d$-dimensional uniform Boolean …
External link:
http://arxiv.org/abs/2302.11055
As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. …
External link:
http://arxiv.org/abs/2205.14846
Author:
Misiakiewicz, Theodor
We study the spectrum of inner-product kernel matrices, i.e., $n \times n$ matrices with entries $h (\langle \textbf{x}_i ,\textbf{x}_j \rangle/d)$ where the $( \textbf{x}_i)_{i \leq n}$ are i.i.d.~random covariates in $\mathbb{R}^d$. In the linear …
External link:
http://arxiv.org/abs/2204.10425
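The matrix studied in the abstract above is easy to construct and inspect numerically: draw i.i.d. Gaussian covariates, form $K_{ij} = h(\langle \textbf{x}_i, \textbf{x}_j \rangle / d)$, and compute its eigenvalues. The particular nonlinearity $h$ and the proportional sizes $n = d$ below are hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Proportional regime n ~ d with i.i.d. standard Gaussian covariates.
n, d = 400, 400
X = rng.standard_normal((n, d))

def h(t):
    # Hypothetical nonlinearity with both a linear and a quadratic component.
    return t + 0.5 * t ** 2

# Inner-product kernel matrix K_ij = h(<x_i, x_j>/d), applied entrywise.
K = h(X @ X.T / d)

# Eigenvalues (ascending); the low-degree part of h produces the largest ones.
eigs = np.linalg.eigvalsh(K)
```

Plotting a histogram of `eigs` shows the bulk-plus-outliers structure that spectral analyses of such matrices aim to characterize.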
It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parameterizations: neural networks in the linear regime, and neural networks with no structural constraints. However, for the main parametrization …
External link:
http://arxiv.org/abs/2202.08658
Author:
Misiakiewicz, Theodor, Mei, Song
Recent empirical work has shown that hierarchical convolutional kernels inspired by convolutional neural networks (CNNs) significantly improve the performance of kernel methods in image classification tasks. A widely accepted explanation for their success …
External link:
http://arxiv.org/abs/2111.08308