Showing 1 - 10 of 223 for search: '"Goldfarb, Donald"'
Author:
Bahamou, Achraf, Goldfarb, Donald
We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization methods for minimizing empirical loss functions in deep learning, eliminating the need for the user to tune the learning rate (LR). The proposed approach ...
External link:
http://arxiv.org/abs/2305.13664
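As a purely illustrative sketch of the per-layer idea (the norm-ratio step-size heuristic below is an assumption for the example, not the procedure proposed in the paper), each layer gets its own step size computed from its own parameters and gradient, so no global learning rate needs tuning:

    import numpy as np

    def layer_step_size(w, g, scale=1e-3, eps=1e-8):
        # Illustrative heuristic only: step size proportional to ||w|| / ||g||,
        # computed separately for each layer.
        return scale * np.linalg.norm(w) / (np.linalg.norm(g) + eps)

    def per_layer_sgd_step(params, grads):
        # params, grads: lists of per-layer arrays; each layer uses its own step size.
        return [w - layer_step_size(w, g) * g for w, g in zip(params, grads)]
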
Deep neural networks (DNNs) are currently predominantly trained using first-order methods. Some of these methods (e.g., Adam, AdaGrad, and RMSprop, and their variants) incorporate a small amount of curvature information by using a diagonal matrix to ...
External link:
http://arxiv.org/abs/2202.04124
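For context, the diagonal matrix mentioned in the snippet is the one Adam-style methods build from a running average of squared gradients; a minimal sketch of that standard Adam update (textbook Adam, not the method proposed in the paper above):

    import numpy as np

    def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        # m, v: running first and second moments of the gradient; sqrt(v) acts as
        # a diagonal curvature estimate that rescales each coordinate of the step.
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        m_hat = m / (1 - b1 ** t)   # bias correction
        v_hat = v / (1 - b2 ** t)
        return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
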
Author:
Ren, Yi, Goldfarb, Donald
Despite the predominant use of first-order methods for training deep learning models, second-order methods, and in particular, natural gradient methods, remain of interest because of their potential for accelerating training through the use of curvature ...
External link:
http://arxiv.org/abs/2106.02925
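For reference, the natural-gradient update referred to in the snippet preconditions the stochastic gradient with the inverse Fisher information matrix (standard notation, not specific to this paper):

    \theta_{t+1} = \theta_t - \eta\, F(\theta_t)^{-1} \nabla_\theta L(\theta_t),
    \qquad F(\theta) = \mathbb{E}\!\left[ \nabla_\theta \log p(y \mid x; \theta)\, \nabla_\theta \log p(y \mid x; \theta)^{\top} \right]
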
Second-order methods have the capability of accelerating optimization by using much richer curvature information than first-order methods. However, most are impractical for deep learning, where the number of training parameters is huge. In Goldfarb et al. ...
External link:
http://arxiv.org/abs/2102.06737
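For background on the quasi-Newton machinery these papers build on, the standard BFGS update of the inverse Hessian approximation H_k, using the step s_k = w_{k+1} - w_k and gradient change y_k = g_{k+1} - g_k, is (textbook formula, not the paper's contribution):

    H_{k+1} = \left(I - \rho_k s_k y_k^{\top}\right) H_k \left(I - \rho_k y_k s_k^{\top}\right) + \rho_k s_k s_k^{\top},
    \qquad \rho_k = \frac{1}{y_k^{\top} s_k}
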
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs). In DNN training, the number of variables and components of the gradient ...
External link:
http://arxiv.org/abs/2006.08877
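One reason Kronecker-factored blocks are practical is that a factored curvature block A (Kronecker) B never has to be formed or inverted explicitly. A small numerical check of the underlying identity (the matrices here are random placeholders, not quantities from the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 3, 4
    G = rng.standard_normal((m, n))      # gradient of one layer, as a matrix

    # Symmetric positive definite Kronecker factors (placeholders for the example).
    A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)
    B = rng.standard_normal((m, m)); B = B @ B.T + m * np.eye(m)

    # Applying the inverse of the full (m*n x m*n) block kron(A, B) to vec(G) ...
    direct = np.linalg.solve(np.kron(A, B), G.flatten(order="F"))

    # ... equals the cheap matrix-level computation B^{-1} G A^{-1}.
    cheap = (np.linalg.solve(B, G) @ np.linalg.inv(A)).flatten(order="F")
    assert np.allclose(direct, cheap)
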
Author:
Bahamou, Achraf, Goldfarb, Donald
We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such directions ...
External link:
http://arxiv.org/abs/1912.13357
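Purely as an illustration of coupling the batch size to gradient quality (the variance test below is a generic adaptive-sampling heuristic assumed for this sketch, not the control rules from the paper), one can grow the batch whenever the per-sample gradients disagree too much:

    import numpy as np

    def maybe_grow_batch(per_example_grads, batch_size, theta=1.0, max_batch=4096):
        # per_example_grads: array of shape (batch_size, dim), one gradient per sample.
        g_bar = per_example_grads.mean(axis=0)
        # Variance of the averaged gradient estimate, compared to its squared norm.
        var = per_example_grads.var(axis=0).sum() / batch_size
        if var > theta ** 2 * np.dot(g_bar, g_bar):
            batch_size = min(2 * batch_size, max_batch)  # too noisy: enlarge the batch
        return batch_size, g_bar
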
Author:
Ren, Yi, Goldfarb, Donald
We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data sets. Our methods ...
External link:
http://arxiv.org/abs/1906.02353
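For reference, the classical Levenberg-Marquardt step for a least-squares residual r(w) with Jacobian J interpolates between a Gauss-Newton step and a short gradient-like step (standard formulation; making this tractable at DNN scale is what the paper addresses):

    import numpy as np

    def levenberg_marquardt_step(J, r, lam):
        # Solve (J^T J + lam * I) p = -J^T r.
        # lam -> 0 recovers Gauss-Newton; large lam gives a damped, gradient-like step.
        n = J.shape[1]
        return np.linalg.solve(J.T @ J + lam * np.eye(n), -J.T @ r)
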
Author:
Teng, Yunfei, Gao, Wenbo, Chalus, Francois, Choromanska, Anna, Goldfarb, Donald, Weller, Adrian
We consider distributed optimization under communication constraints for training deep learning models. We propose a new algorithm, whose parameter updates rely on two forces: a regular gradient step, and a corrective direction dictated by the current ...
External link:
http://arxiv.org/abs/1905.10395
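As a heavily simplified illustration of combining a local gradient step with a corrective direction (the pull-toward-a-reference term and the coefficient `pull` below are assumptions for this sketch, not the algorithm defined in the paper):

    import numpy as np

    def worker_update(w_local, grad_local, w_ref, lr=0.1, pull=0.1):
        # Two forces: the usual gradient step, plus a corrective pull toward a
        # reference point w_ref (e.g., a currently best-performing worker).
        return w_local - lr * grad_local - pull * (w_local - w_ref)
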
Many problems in machine learning and game theory can be formulated as saddle-point problems, for which various first-order methods have been developed and proven efficient in practice. Under the general convex-concave assumption, most first-order methods ...
External link:
http://arxiv.org/abs/1903.10646
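The generic problem referred to in the snippet is the convex-concave saddle-point problem, and the simplest first-order scheme for it is gradient descent-ascent (standard formulation, not this paper's method):

    \min_{x} \max_{y}\; f(x, y), \qquad
    x_{t+1} = x_t - \eta\, \nabla_x f(x_t, y_t), \qquad
    y_{t+1} = y_t + \eta\, \nabla_y f(x_t, y_t)
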
We expand the scope of the alternating direction method of multipliers (ADMM). Specifically, we show that ADMM, when employed to solve problems with multiaffine constraints that satisfy certain verifiable assumptions, converges to the set of constrained stationary points ...
External link:
http://arxiv.org/abs/1802.09592
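For reference, the classical two-block ADMM iteration for \min_{x,z} f(x) + g(z) subject to Ax + Bz = c, which the paper extends to multiaffine constraints, is (standard form):

    x^{k+1} = \arg\min_x \; L_\rho(x, z^k, y^k), \qquad
    z^{k+1} = \arg\min_z \; L_\rho(x^{k+1}, z, y^k), \qquad
    y^{k+1} = y^k + \rho\,(A x^{k+1} + B z^{k+1} - c),

    \text{where}\quad L_\rho(x, z, y) = f(x) + g(z) + y^{\top}(Ax + Bz - c) + \tfrac{\rho}{2}\,\|Ax + Bz - c\|^2.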