Showing 1 - 10 of 179 for search: '"Bolte, Jérôme"'
Motivated by the widespread use of approximate derivatives in machine learning and optimization, we study inexact subgradient methods with non-vanishing additive errors and step sizes. In the nonconvex semialgebraic setting, under boundedness assumptions…
External link:
http://arxiv.org/abs/2404.19517
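A minimal sketch of an inexact subgradient step with a non-vanishing additive error and a constant step size, on a toy l1 objective; the error level eps, step size alpha, and objective are illustrative choices, not the paper's setting:

import numpy as np

# Inexact subgradient method on f(x) = |x_1| + |x_2| (nonsmooth,
# semialgebraic). The oracle returns a subgradient corrupted by a
# bounded additive error of magnitude eps; the step size is constant.
rng = np.random.default_rng(0)
x = np.array([3.0, -2.0])
alpha, eps = 0.05, 0.1

for _ in range(500):
    g = np.sign(x) + eps * rng.uniform(-1.0, 1.0, size=x.shape)
    x = x - alpha * g

print(x)  # iterates stabilize in a small neighborhood of the minimizer 0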
In appropriate frameworks, automatic differentiation is transparent to the user at the cost of being a significant computational burden when the number of operations is large. For iterative algorithms, implicit differentiation alleviates this issue…
External link:
http://arxiv.org/abs/2305.13768
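To make the contrast concrete, here is a minimal sketch of implicit differentiation through a fixed point, with a toy contraction T(x, theta) = 0.5*x + theta standing in for an iterative solver (all names and constants are illustrative):

# Forward pass: iterate x <- T(x, theta) to an approximate fixed point.
def T(x, theta):
    return 0.5 * x + theta

theta, x = 1.0, 0.0
for _ in range(100):
    x = T(x, theta)

# Implicit differentiation of the fixed-point equation x* = T(x*, theta):
#   dx*/dtheta = (1 - dT/dx)^(-1) * dT/dtheta,
# one linear solve instead of backpropagating through all iterations.
dT_dx, dT_dtheta = 0.5, 1.0
dx_dtheta = dT_dtheta / (1.0 - dT_dx)
print(x, dx_dtheta)  # x* = 2.0 and dx*/dtheta = 2.0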
We leverage path differentiability and a recent result on nonsmooth implicit differentiation calculus to give sufficient conditions ensuring that the solution to a monotone inclusion problem will be path differentiable, with formulas for computing it…
External link:
http://arxiv.org/abs/2212.07844
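As a one-dimensional illustration (not the paper's general setting), the inclusion 0 ∈ ∂|x| + (x - theta) is solved by soft-thresholding, a path differentiable map; here is a sketch of solving it by forward-backward iterations and reading a derivative off the implicit formula:

import numpy as np

def solve(theta, gamma=0.5, iters=200):
    # Forward-backward iterations x <- prox_{gamma|.|}(x - gamma*(x - theta)).
    x = 0.0
    for _ in range(iters):
        y = x - gamma * (x - theta)
        x = np.sign(y) * max(abs(y) - gamma, 0.0)  # prox of gamma*|.|
    return x

theta = 2.0
x = solve(theta)
# Path derivative of the solution map (soft-thresholding at level 1):
# dx*/dtheta = 1 if |theta| > 1, and 0 if |theta| < 1.
dx = 1.0 if abs(theta) > 1 else 0.0
print(x, dx)  # x* = theta - 1 = 1.0, derivative 1.0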
Using the notion of conservative gradient, we provide a simple model to estimate the computational costs of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs. The overhead complexity of the backward mode…
External link:
http://arxiv.org/abs/2206.01730
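A back-of-the-envelope version of the cost comparison, using the standard operation-count model for algorithmic differentiation (the paper's model covers nonsmooth programs via conservative gradients; all constants here are illustrative):

# For a program F: R^n -> R evaluated at unit cost:
#   forward mode needs one pass per input direction,
#   backward mode needs one reverse sweep per output, at a small
#   constant overhead (the "cheap gradient" principle, roughly 3-5x).
n = 10**6                    # number of inputs (illustrative)
cost_F = 1.0                 # normalized cost of one evaluation of F
cost_forward = n * cost_F    # n directional-derivative passes
cost_backward = 4 * cost_F   # one reverse sweep, constant overhead
print(cost_forward, cost_backward)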
Differentiation along algorithms, i.e., piggyback propagation of derivatives, is now routinely used to differentiate iterative solvers in differentiable programming. Asymptotics is well understood for many smooth problems but the nondifferentiable case…
External link:
http://arxiv.org/abs/2206.00457
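A minimal sketch of piggyback propagation: the derivative is iterated alongside the solver's state, here for the toy contraction T(x, theta) = 0.5*x + theta (for nonsmooth T one would propagate conservative Jacobians instead; all constants are illustrative):

theta = 1.0
x, J = 0.0, 0.0  # solver state and its derivative w.r.t. theta
for _ in range(100):
    dT_dx, dT_dtheta = 0.5, 1.0
    # Piggyback recursion: x_{k+1} = T(x_k), J_{k+1} = dT/dx * J_k + dT/dtheta.
    x, J = 0.5 * x + theta, dT_dx * J + dT_dtheta

print(x, J)  # both converge: x* = 2.0 and dx*/dtheta = 2.0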
Using jointly geometric and stochastic reformulations of nonconvex problems and exploiting a Monge-Kantorovich gradient system formulation with vanishing forces, we formally extend the simulated annealing method to a wide class of global optimization…
External link:
http://arxiv.org/abs/2204.01306
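For flavor, a classical Langevin-type discretization of simulated annealing with a slowly vanishing temperature, on a toy double-well objective; this is in the spirit of, not identical to, the Monge-Kantorovich dynamics described above:

import numpy as np

# f(x) = x^4 - 2x^2 + 0.5x has two wells; the deeper one is near x = -1.
def grad_f(x):
    return 4 * x**3 - 4 * x + 0.5

rng = np.random.default_rng(0)
x, alpha = 1.0, 0.01            # start near the shallower well
for k in range(1, 20000):
    temp = 1.0 / np.log(k + 2)  # slowly vanishing temperature
    x -= alpha * grad_f(x) + np.sqrt(2 * alpha * temp) * rng.normal()

print(x)  # typically escapes to the deeper well near x = -1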
Risk minimization for nonsmooth nonconvex problems naturally leads to first-order sampling or, by an abuse of terminology, to stochastic subgradient descent. We establish the convergence of this method in the path-differentiable case, and describe more…
External link:
http://arxiv.org/abs/2202.13744
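A minimal sketch of stochastic subgradient descent on a toy nonsmooth risk F(x) = E|x - xi|, whose minimizer is the median of xi (the distribution and the step-size schedule are illustrative):

import numpy as np

rng = np.random.default_rng(0)
x = 5.0
for k in range(1, 50000):
    xi = rng.normal(1.0, 1.0)
    g = np.sign(x - xi)   # subgradient of the sample loss |x - xi|
    x -= g / np.sqrt(k)   # vanishing step size

print(x)  # approaches the median of xi, i.e. 1.0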
The Frank-Wolfe algorithm is a popular method for minimizing a smooth convex function $f$ over a compact convex set $\mathcal{C}$. While many convergence results have been derived in terms of function values, hardly anything is known about the convergence…
External link:
http://arxiv.org/abs/2202.08711
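A minimal sketch of the Frank-Wolfe iteration on the l1 ball with the classical 2/(k+2) step size; the quadratic objective and its data are illustrative choices:

import numpy as np

# min f(x) = 0.5*||x - b||^2 over C = {x : ||x||_1 <= 1}.
b = np.array([0.8, 0.6, 0.0])
x = np.zeros(3)
for k in range(200):
    grad = x - b
    # Linear minimization oracle over the l1 ball: a signed vertex.
    i = np.argmax(np.abs(grad))
    s = np.zeros(3)
    s[i] = -np.sign(grad[i])
    x += 2.0 / (k + 2) * (s - x)  # convex combination keeps x in C

print(x)  # function values converge at rate O(1/k); the paper studies
          # the convergence of the iterates x_k themselves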
Published in:
Advances in Neural Information Processing Systems, Dec 2021, Paris, France
In theory, the choice of ReLU(0) in [0, 1] for a neural network has a negligible influence both on backpropagation and training. Yet, in the real world, 32 bits default precision combined with the size of deep learning problems makes it a hyperparameter…
External link:
http://arxiv.org/abs/2106.12915
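A minimal sketch of the hyperparameter in question: the value assigned to ReLU'(0) during backpropagation, which only matters when a pre-activation is exactly zero, as does happen in 32-bit arithmetic (the toy computation is illustrative):

import numpy as np

def relu_grad(x, relu_prime_at_0):
    # Derivative of ReLU, with the value at the kink left as a choice.
    return np.where(x > 0, 1.0, np.where(x < 0, 0.0, relu_prime_at_0))

x = np.float32(0.0)      # a pre-activation that is exactly zero
upstream = np.float32(1.0)
for choice in (0.0, 0.5, 1.0):
    print(choice, upstream * relu_grad(x, choice))  # 0.0, 0.5, 1.0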
Published in:
Advances in Neural Information Processing Systems, Dec 2021, Online, France
In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus. Our result applies to most practical problems (i.e., definable problems) provided that a nonsmooth form…
External link:
http://arxiv.org/abs/2106.04350
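A one-dimensional illustration of the flavor of such a theorem, not the paper's statement: for the definable equation 2x + |x| = theta, the implicit derivative formula holds with the classical derivative replaced by a conservative one:

# Solve G(x, theta) = 2x + |x| - theta = 0 and differentiate the
# solution map via dx/dtheta = (2 + d|x|)^{-1}, with d|x| a
# conservative derivative of |.| (any value in [-1, 1] at x = 0).
def solve(theta):
    # Explicit solution of 2x + |x| = theta.
    return theta / 3.0 if theta >= 0 else theta

def dx_dtheta(theta):
    x = solve(theta)
    s = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)  # selection of d|x|
    return 1.0 / (2.0 + s)

for theta in (-3.0, 0.0, 3.0):
    print(theta, solve(theta), dx_dtheta(theta))  # derivatives 1, 1/2, 1/3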