A Coordinate Descent Primal-Dual Algorithm with Large Step Size and Possibly Non Separable Functions
Autor: | Pascal Bianchi, Olivier Fercoq |
---|---|
Přispěvatelé: | Laboratoire Traitement et Communication de l'Information (LTCI), Télécom ParisTech-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS), Signal, Statistique et Apprentissage (S2A), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Orange/Telecom ParisTech think tank Phi-TAB |
Jazyk: | angličtina |
Rok vydání: | 2019 |
Předmět: |
Primal dual algorithm
021103 operations research MathematicsofComputing_NUMERICALANALYSIS 0211 other engineering and technologies 010103 numerical & computational mathematics 02 engineering and technology 01 natural sciences Theoretical Computer Science Dual (category theory) ComputingMethodologies_PATTERNRECOGNITION Optimization and Control (math.OC) Iterated function Convex optimization FOS: Mathematics Applied mathematics [MATH.MATH-OC]Mathematics [math]/Optimization and Control [math.OC] 0101 mathematics Coordinate descent Mathematics - Optimization and Control Software Mathematics |
Zdroj: | SIAM Journal on Optimization SIAM Journal on Optimization, Society for Industrial and Applied Mathematics, 2019, 29 (1), pp.100-134 |
ISSN: | 1052-6234 |
Popis: | This paper introduces a coordinate descent version of the V\~u-Condat algorithm. By coordinate descent, we mean that only a subset of the coordinates of the primal and dual iterates is updated at each iteration, the other coordinates being maintained to their past value. Our method allows us to solve optimization problems with a combination of differentiable functions, constraints as well as non-separable and non-differentiable regularizers. We show that the sequences generated by our algorithm converge to a saddle point of the problem at stake, for a wider range of parameter values than previous methods. In particular, the condition on the step-sizes depends on the coordinate-wise Lipschitz constant of the differentiable function's gradient, which is a major feature allowing classical coordinate descent to perform so well when it is applicable. We then prove a sublinear rate of convergence in general and a linear rate of convergence if the objective enjoys strong convexity properties. We illustrate the performances of the algorithm on a total-variation regularized least squares regression problem and on large scale support vector machine problems. Comment: 32 pages |
Databáze: | OpenAIRE |
Externí odkaz: |