A Coordinate Descent Primal-Dual Algorithm with Large Step Size and Possibly Non Separable Functions

Autor:	Pascal Bianchi, Olivier Fercoq
Přispěvatelé:	Laboratoire Traitement et Communication de l'Information (LTCI), Télécom ParisTech-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS), Signal, Statistique et Apprentissage (S2A), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Orange/Telecom ParisTech think tank Phi-TAB
Jazyk:	angličtina
Rok vydání:	2019
Předmět:	Primal dual algorithm 021103 operations research MathematicsofComputing_NUMERICALANALYSIS 0211 other engineering and technologies 010103 numerical & computational mathematics 02 engineering and technology 01 natural sciences Theoretical Computer Science Dual (category theory) ComputingMethodologies_PATTERNRECOGNITION Optimization and Control (math.OC) Iterated function Convex optimization FOS: Mathematics Applied mathematics [MATH.MATH-OC]Mathematics [math]/Optimization and Control [math.OC] 0101 mathematics Coordinate descent Mathematics - Optimization and Control Software Mathematics
Zdroj:	SIAM Journal on Optimization SIAM Journal on Optimization, Society for Industrial and Applied Mathematics, 2019, 29 (1), pp.100-134
ISSN:	1052-6234
Popis:	This paper introduces a coordinate descent version of the V\~u-Condat algorithm. By coordinate descent, we mean that only a subset of the coordinates of the primal and dual iterates is updated at each iteration, the other coordinates being maintained to their past value. Our method allows us to solve optimization problems with a combination of differentiable functions, constraints as well as non-separable and non-differentiable regularizers. We show that the sequences generated by our algorithm converge to a saddle point of the problem at stake, for a wider range of parameter values than previous methods. In particular, the condition on the step-sizes depends on the coordinate-wise Lipschitz constant of the differentiable function's gradient, which is a major feature allowing classical coordinate descent to perform so well when it is applicable. We then prove a sublinear rate of convergence in general and a linear rate of convergence if the objective enjoys strong convexity properties. We illustrate the performances of the algorithm on a total-variation regularized least squares regression problem and on large scale support vector machine problems. Comment: 32 pages
Databáze:	OpenAIRE
Externí odkaz:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b5a944b87e6c977b50a4b16275c0a7f8 https://hal.archives-ouvertes.fr/hal-01497104 Zobrazit plný text záznamu