Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem

Author:	Li, B.; Wang, L.; Georgios Giannakis; Zhao, Z.
Year of publication:	2021
Subject:	FOS: Computer and information sciences; Computer Science - Machine Learning; Optimization and Control (math.OC); FOS: Mathematics; General Medicine; Mathematics - Optimization and Control; Machine Learning (cs.LG)
Source:	Scopus-Elsevier
ISSN:	2374-3468 2159-5399
DOI:	10.1609/aaai.v35i9.17012
Description:	Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. Relying on no problem-dependent parameters in the step sizes, the convergence rate of ExtraFW for general convex problems is shown to be ${\cal O}\big(\frac{1}{k}\big)$, which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. The main merit of ExtraFW, however, is its faster rate ${\cal O}\big(\frac{1}{k^2}\big)$ on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and a fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than that of FW, and even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW enjoys a smaller optimality gap and lower rank than FW. Comment: To appear in AAAI 2021. arXiv admin note: text overlap with arXiv:2006.11116
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::79315d52b4b3fb02f129c4ef42939535 https://doi.org/10.1609/aaai.v35i9.17012
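
Note on the description: the abstract refers to "FW subproblems" and to step sizes that rely on no problem-dependent parameters. As a point of reference, the following is a minimal sketch of the vanilla Frank Wolfe baseline only, not of ExtraFW, whose prediction-correction update is not specified in this record. The toy objective, the l1-ball constraint, and all function and variable names (`grad`, `lmo_l1`, `frank_wolfe`, `b`, `radius`) are illustrative assumptions, not taken from the paper.

```python
# Vanilla Frank Wolfe (FW) baseline, NOT the ExtraFW method of the paper.
# Hypothetical toy problem: minimize f(x) = 0.5 * ||x - b||^2
# subject to the sparsity-promoting constraint ||x||_1 <= radius.

def objective(x, b):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))

def grad(x, b):
    # Gradient of the toy objective above.
    return [xi - bi for xi, bi in zip(x, b)]

def lmo_l1(g, radius):
    """FW subproblem on the l1 ball: argmin over ||v||_1 <= radius of <g, v>.

    The minimizer is a signed, scaled coordinate vector placed at the
    largest-magnitude gradient entry.
    """
    i = max(range(len(g)), key=lambda j: abs(g[j]))
    v = [0.0] * len(g)
    if g[i] != 0.0:
        v[i] = -radius if g[i] > 0 else radius
    return v

def frank_wolfe(b, radius, num_iters):
    x = [0.0] * len(b)  # feasible starting point
    for k in range(num_iters):
        v = lmo_l1(grad(x, b), radius)  # one FW subproblem per iteration
        delta = 2.0 / (k + 2)           # step size with no problem-dependent parameter
        # Convex combination keeps the iterate inside the l1 ball.
        x = [(1 - delta) * xi + delta * vi for xi, vi in zip(x, v)]
    return x

b = [0.6, 0.5, 0.0]  # illustrative data only
x = frank_wolfe(b, radius=1.0, num_iters=200)
print("l1 norm:", sum(abs(xi) for xi in x), "objective:", objective(x, b))
```

Each iteration of this baseline uses one gradient and one subproblem; according to the description, ExtraFW instead leverages a pair of gradients per iteration.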