Showing 1 - 8 of 8 for search: '"Pesme, Scott"'
We examine the continuous-time counterpart of mirror descent, namely mirror flow, on classification problems which are linearly separable. Such problems are minimised 'at infinity' and have many possible solutions; we study which solution is preferred…
External link:
http://arxiv.org/abs/2406.12763
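As a rough discrete-time illustration of the mirror flow described above, here is mirror descent on a toy separable logistic problem. The hyperbolic-entropy potential and all parameter values below are my own illustrative choices, not taken from the paper:

```python
import numpy as np

def mirror_descent_hypentropy(X, y, b=1.0, lr=0.1, n_steps=2000):
    """Mirror descent on the logistic loss with the hyperbolic-entropy
    potential: the mirror map is grad(phi_b)(w) = arcsinh(w / b),
    inverted by w = b * sinh(theta)."""
    d = X.shape[1]
    theta = np.zeros(d)                  # dual iterate, arcsinh(0 / b) = 0
    for _ in range(n_steps):
        w = b * np.sinh(theta)           # map back to primal coordinates
        margins = y * (X @ w)
        # gradient of (1/n) * sum_i log(1 + exp(-margin_i)) wrt w
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        theta -= lr * grad               # gradient step in the dual space
    return b * np.sinh(theta)

# Toy linearly separable data (hypothetical): the loss is only minimised
# 'at infinity', so the iterates grow and the direction w / ||w|| is the
# quantity that converges -- the sense in which a solution is 'preferred'.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = np.sign(X @ np.array([1.0, -1.0, 0.0, 0.0, 0.0]) + 1e-12)
w = mirror_descent_hypentropy(X, y)
```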
In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size $\gamma$ and momentum parameter $\beta$ that allows…
External link:
http://arxiv.org/abs/2403.05293
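The step size $\gamma$ and momentum parameter $\beta$ mentioned above correspond to the classical heavy-ball recursion. A minimal sketch, with an illustrative quadratic objective that is not from the paper:

```python
import numpy as np

def momentum_gd(grad_f, w0, gamma=0.01, beta=0.9, n_steps=500):
    """Heavy-ball momentum gradient descent:
    w_{t+1} = w_t - gamma * grad_f(w_t) + beta * (w_t - w_{t-1})."""
    w_prev, w = w0.copy(), w0.copy()
    for _ in range(n_steps):
        w, w_prev = w - gamma * grad_f(w) + beta * (w - w_prev), w
    return w

# Toy quadratic f(w) = 0.5 * ||A w - b||^2, so grad f(w) = A.T (A w - b).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
w = momentum_gd(lambda w: A.T @ (A @ w - b), np.zeros(2))
```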
Authors:
Pesme, Scott; Flammarion, Nicolas
In this paper we fully describe the trajectory of gradient flow over diagonal linear networks in the limit of vanishing initialisation. We show that the limiting flow successively jumps from a saddle of the training loss to another until reaching the…
External link:
http://arxiv.org/abs/2304.00488
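A forward-Euler sketch of the gradient flow in question, assuming the standard two-layer diagonal parameterisation w = u * v (my assumption for this illustration). With a vanishingly small initialisation alpha, the effective weights linger near sparse saddles and activate roughly one coordinate at a time, matching the saddle-to-saddle picture in the abstract:

```python
import numpy as np

def diagonal_net_gradient_flow(X, y, alpha=1e-4, dt=1e-2, n_steps=300_000):
    """Forward-Euler discretisation of gradient flow on the diagonal
    linear network w = u * v, loss L(u, v) = (1/2n) * ||X (u*v) - y||^2,
    started from a small initialisation u = v = alpha."""
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(n_steps):
        g = X.T @ (X @ (u * v) - y) / n            # gradient wrt w = u * v
        u, v = u - dt * g * v, v - dt * g * u      # chain rule through w = u*v
    return u * v                                   # effective linear predictor
```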
In this paper, we investigate the impact of stochasticity and large stepsizes on the implicit regularisation of gradient descent (GD) and stochastic gradient descent (SGD) over diagonal linear networks. We prove the convergence of GD and SGD with macroscopic…
External link:
http://arxiv.org/abs/2302.08982
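Reusing the diagonal-network model from the previous sketch, a discrete (S)GD loop makes the GD-versus-SGD comparison in the abstract concrete; the batch size and step size below are illustrative guesses, and stability at large steps is not guaranteed here:

```python
import numpy as np

def diag_net_sgd(X, y, lr=0.1, batch=1, alpha=0.1, n_steps=20_000, seed=0):
    """(S)GD on the diagonal linear network w = u * v:
    batch = n gives full-batch GD, batch = 1 single-sample SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(n_steps):
        idx = rng.choice(n, size=batch, replace=False)
        g = X[idx].T @ (X[idx] @ (u * v) - y[idx]) / batch
        u, v = u - lr * g * v, v - lr * g * u
    return u * v
```

Comparing np.abs(w).sum() for the GD and SGD outputs on the same overparametrised problem is one way to see the sparsity effect the abstract attributes to stochasticity and step size.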
Understanding the implicit bias of training algorithms is of crucial importance in order to explain the success of overparametrised neural networks. In this paper, we study the dynamics of stochastic gradient descent over diagonal linear networks through…
External link:
http://arxiv.org/abs/2106.09524
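The continuous-time object usually studied in this setting is a stochastic gradient flow. A heavily simplified Euler-Maruyama sketch, again on the diagonal network, where the noise term is a resampled per-example gradient so that its covariance matches single-sample SGD noise; the scaling by gamma is an illustrative modelling choice of mine, not the paper's construction:

```python
import numpy as np

def stochastic_gradient_flow(X, y, gamma=0.5, dt=1e-3, alpha=0.1,
                             n_steps=100_000, seed=0):
    """Euler-Maruyama sketch: gradient-flow drift plus zero-mean noise
    with the covariance of single-sample SGD, scaled by gamma."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(n_steps):
        per_ex = X * (X @ (u * v) - y)[:, None]    # per-example gradients wrt w
        drift = per_ex.mean(axis=0)                # full-batch gradient
        noise = per_ex[rng.integers(n)] - drift    # zero mean across samples
        step = dt * drift + np.sqrt(gamma * dt) * noise
        u, v = u - step * v, v - step * u
    return u * v
```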
Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper…
External link:
http://arxiv.org/abs/2007.00534
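The two phases described above are easy to reproduce on noisy least squares. A sketch that records the distance to the optimum (the diagnostic-based step-size scheme the paper actually studies is not shown here):

```python
import numpy as np

def sgd_distance_trace(X, y, w_star, gamma=0.1, n_steps=5_000, seed=0):
    """Constant step-size SGD on least squares, recording ||w_t - w*||:
    fast decay (transient phase), then oscillation around a noise floor
    set by the step size (stationary phase)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    trace = np.empty(n_steps)
    for t in range(n_steps):
        i = rng.integers(len(X))
        w -= gamma * (X[i] @ w - y[i]) * X[i]     # single-sample gradient step
        trace[t] = np.linalg.norm(w - w_star)
    return trace

# With y = X @ w_star + label noise, shrinking gamma lowers the
# stationary floor but slows down the transient phase.
```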
Authors:
Pesme, Scott; Flammarion, Nicolas
We consider the robust linear regression problem in the online setting where we have access to the data in a streaming manner, one data point after the other. More specifically, for a true parameter $\theta^*$, we consider the corrupted Gaussian linear…
External link:
http://arxiv.org/abs/2007.00399
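The abstract is cut off before stating the method. One natural streaming estimator for this corrupted setting, shown purely as a sketch under my own assumptions, is one-pass SGD on the absolute (l1) loss, whose sign(.) subgradient caps the influence of any single corrupted response:

```python
import numpy as np

def online_l1_regression(stream, d, c=0.5):
    """One-pass SGD on |y - <x, theta>|: the subgradient is
    -sign(y - <x, theta>) * x, so each point moves theta by at most
    (c / sqrt(t)) * ||x||, corrupted or not."""
    theta = np.zeros(d)
    for t, (x, y) in enumerate(stream, start=1):
        theta += (c / np.sqrt(t)) * np.sign(y - x @ theta) * x
    return theta

def corrupted_gaussian_stream(theta_star, n, eps=0.1, seed=0):
    """Toy corrupted Gaussian linear model (hypothetical parameters):
    a fraction eps of responses is replaced by gross outliers."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        x = rng.normal(size=theta_star.size)
        y = x @ theta_star + rng.normal()
        if rng.random() < eps:
            y = 100.0 * rng.normal()              # corrupted response
        yield x, y
```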
Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8067bee9a7ad2eadc7345c3403e22504
https://infoscience.epfl.ch/record/278802