Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers.

Authors: Paquin AL; Department of Computer Science and Software Engineering, Laval University, Pavillon Adrien-Pouliot 1065, av. de la Médecine, Quebec, G1V0A6, Quebec, Canada. Electronic address: alexandre.lemire-paquin.1@ulaval.ca., Chaib-Draa B; Department of Computer Science and Software Engineering, Laval University, Pavillon Adrien-Pouliot 1065, av. de la Médecine, Quebec, G1V0A6, Quebec, Canada. Electronic address: brahim.chaib-draa@ift.ulaval.ca., Giguère P; Department of Computer Science and Software Engineering, Laval University, Pavillon Adrien-Pouliot 1065, av. de la Médecine, Quebec, G1V0A6, Quebec, Canada. Electronic address: philippe.giguere@ift.ulaval.ca.
Language: English
Source: Neural Networks: the official journal of the International Neural Network Society [Neural Netw] 2023 Jul; Vol. 164, pp. 382-394. Date of Electronic Publication: 2023 Apr 25.
DOI: 10.1016/j.neunet.2023.04.028
Abstract: We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training. This leads to investigating a form of angle-wise stability instead of Euclidean stability in weights. For neural networks, the measure of distance we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability in order to obtain a data-dependent quantity in the bound. This data-dependent quantity is seen to be more favorable when training with larger learning rates in our numerical experiments. This might help to shed some light on why larger learning rates can lead to better generalization in some practical scenarios.
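The invariance mentioned in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a bias-free ReLU network as the example of a homogeneous network, and it uses the angle between corresponding layers (flattened to vectors) as one possible rescaling-invariant distance. The helper names `relu_net`, `normalize_layers` and `layerwise_angles` are hypothetical, and the paper's actual normalized loss and distance may be defined differently.

```python
import numpy as np

def relu_net(weights, x):
    """Bias-free ReLU network (assumed example of a homogeneous network)."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h

def normalize_layers(weights):
    """Rescale every layer to unit Frobenius norm."""
    return [W / np.linalg.norm(W) for W in weights]

def layerwise_angles(weights_a, weights_b):
    """Angle in radians between corresponding layers, viewed as flat vectors."""
    angles = []
    for A, B in zip(weights_a, weights_b):
        cos = np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B))
        angles.append(float(np.arccos(np.clip(cos, -1.0, 1.0))))
    return angles

rng = np.random.default_rng(0)
w1 = [rng.standard_normal((5, 4)), rng.standard_normal((3, 5))]
w2 = [rng.standard_normal((5, 4)), rng.standard_normal((3, 5))]
x = rng.standard_normal(4)

# Rescale each layer of w1 by a different positive factor.
scales = [3.0, 0.5]
w1_scaled = [c * W for c, W in zip(scales, w1)]

out = relu_net(w1, x)
out_scaled = relu_net(w1_scaled, x)

# The output is multiplied by the product of the scales (3.0 * 0.5 = 1.5),
# so the predicted class (argmax) does not change.
same_prediction = int(np.argmax(out)) == int(np.argmax(out_scaled))

# Euclidean distance between weights changes under rescaling,
# while the layer-wise angles do not.
angles_before = layerwise_angles(w1, w2)
angles_after = layerwise_angles(w1_scaled, w2)
```

In this sketch the classifier's decision is unchanged by per-layer rescaling, which is why a distance that is also unchanged by it (here, the layer-wise angle) is a natural object to study in place of the Euclidean distance between weights.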
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2023 Elsevier Ltd. All rights reserved.)
Database: MEDLINE