Showing 1 - 10 of 85 for search: '"Jun, Michael"'
Author:
Shi, Hao-Jun Michael, Lee, Tsung-Hsien, Iwasaki, Shintaro, Gallego-Posada, Jose, Li, Zhijing, Rangadurai, Kaushik, Mudigere, Dheevatsa, Rabbat, Michael
Shampoo is an online and stochastic optimization algorithm belonging to the AdaGrad family of methods for training neural networks. It constructs a block-diagonal preconditioner where each block consists of a coarse Kronecker product approximation to…
External link:
http://arxiv.org/abs/2309.06497
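The block-diagonal, Kronecker-factored preconditioner the abstract describes can be illustrated in a few lines. Below is a minimal single-parameter sketch in Python, assuming a toy quadratic loss and a fixed step size; it is not the paper's distributed PyTorch implementation.

```python
import numpy as np

def matrix_inverse_root(M, p, eps=1e-6):
    """Compute M^(-1/p) for a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M + eps * np.eye(M.shape[0]))
    return V @ np.diag(w ** (-1.0 / p)) @ V.T

# Shampoo-style update for one matrix-shaped parameter W (m x n):
# accumulate Kronecker factors L and R from gradients G, then
# precondition the gradient as L^(-1/4) @ G @ R^(-1/4).
rng = np.random.default_rng(0)
m, n, lr = 4, 3, 0.1
W = rng.normal(size=(m, n))
L, R = np.zeros((m, m)), np.zeros((n, n))
for _ in range(100):
    G = W - 1.0                # gradient of the toy loss 0.5 * ||W - 1||_F^2
    L += G @ G.T               # left Kronecker factor statistics
    R += G.T @ G               # right Kronecker factor statistics
    W -= lr * matrix_inverse_root(L, 4) @ G @ matrix_inverse_root(R, 4)
print(np.abs(W - 1.0).max())   # W approaches the minimizer (all ones)
```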
A common approach for minimizing a smooth nonlinear function is to employ finite-difference approximations to the gradient. While this can be easily performed when no error is present within the function evaluations, when the function is noisy, the…
External link:
http://arxiv.org/abs/2110.06380
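The difficulty hinted at in the truncated sentence is that the differencing interval must balance truncation error against noise amplification. A minimal sketch, assuming the noise level eps_f is known in advance (the paper concerns estimating such quantities, which is not reproduced here):

```python
import numpy as np

def fd_gradient(f, x, h):
    """Forward-difference gradient approximation of f at x with interval h."""
    fx, g = f(x), np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

# With noisy evaluations f(x) + e, |e| <= eps_f, the forward-difference
# error balances truncation O(h) against noise O(eps_f / h); a standard
# rule of thumb is h ~ 2 * sqrt(eps_f / L2) with L2 bounding |f''|.
eps_f = 1e-8
noisy = lambda x: np.sum(x ** 2) + eps_f * np.random.uniform(-1, 1)
h = 2.0 * np.sqrt(eps_f)                  # rule of thumb with L2 ~ 1
print(fd_gradient(noisy, np.ones(3), h))  # close to the true gradient [2, 2, 2]
```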
The goal of this paper is to investigate an approach for derivative-free optimization that has not received sufficient attention in the literature and is yet one of the simplest to implement and parallelize. It consists of computing gradients of a…
External link:
http://arxiv.org/abs/2102.09762
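A simple, trivially parallel estimator in this family computes the gradient of a Gaussian-smoothed version of the objective from function values alone. A minimal sketch; the smoothing radius and sample count are illustrative choices, not values from the paper.

```python
import numpy as np

def smoothed_gradient(f, x, sigma=0.1, num_samples=64, rng=None):
    """Monte Carlo estimate of the gradient of the Gaussian smoothing
    f_sigma(x) = E_u[f(x + sigma * u)], u ~ N(0, I). Each term needs one
    extra function evaluation, so the loop parallelizes trivially."""
    rng = rng or np.random.default_rng()
    fx, g = f(x), np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.size)
        g += (f(x + sigma * u) - fx) / sigma * u
    return g / num_samples

f = lambda x: np.sum(x ** 2)
print(smoothed_gradient(f, np.ones(3)))  # approximates the gradient [2, 2, 2]
```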
This paper describes an extension of the BFGS and L-BFGS methods for the minimization of a nonlinear function subject to errors. This work is motivated by applications that contain computational noise, employ low-precision arithmetic, or are subject…
External link:
http://arxiv.org/abs/2010.04352
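One concrete failure mode under errors: when steps are short, gradient noise can dominate the curvature pair (s, y) and corrupt the quasi-Newton update. The guard below is an illustrative noise-aware skip test in that spirit, not the paper's actual mechanism (which lengthens the differencing interval instead).

```python
import numpy as np

def curvature_pair_is_reliable(s, y, eps_g):
    """Accept the quasi-Newton pair (s, y) only if the measured curvature
    s^T y clears a threshold that noise alone could not produce; eps_g
    bounds the error in each gradient evaluation. Illustrative guard,
    not the lengthening strategy the paper develops."""
    return s @ y > 2.0 * eps_g * np.linalg.norm(s)

rng = np.random.default_rng(1)
eps_g = 1e-3                         # assumed gradient error bound
s = 1e-4 * rng.normal(size=5)        # very short step: noise can dominate y
y = 2.0 * s + eps_g * rng.uniform(-1, 1, size=5)  # pair from f(x) = ||x||^2
print(curvature_pair_is_reliable(s, y, eps_g))    # expect False: skip the update
```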
Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the categorical data,…
External link:
http://arxiv.org/abs/1909.02107
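The snippet cuts off before naming the method, so as an illustration only: one compositional technique from this line of work replaces a single enormous embedding table with two small ones indexed by the quotient and remainder of the category index. A minimal sketch; the table sizes and the elementwise-product combiner are assumptions of this sketch.

```python
import numpy as np

# Quotient-remainder compositional embedding: instead of one table with
# num_categories rows, keep two small tables indexed by (i // m) and
# (i % m) and combine them, cutting memory from O(num_categories * d)
# to O((num_categories / m + m) * d).
rng = np.random.default_rng(0)
num_categories, d = 1_000_000, 16
m = 1000                                   # partition size
quotient_table = rng.normal(size=(num_categories // m + 1, d))
remainder_table = rng.normal(size=(m, d))

def embed(i):
    """Embedding for category i as an elementwise product of the two parts."""
    return quotient_table[i // m] * remainder_table[i % m]

print(embed(123456).shape)  # (16,)
```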
Author:
Naumov, Maxim, Mudigere, Dheevatsa, Shi, Hao-Jun Michael, Huang, Jianyu, Sundaraman, Narayanan, Park, Jongsoo, Wang, Xiaodong, Gupta, Udit, Wu, Carole-Jean, Azzolini, Alisson G., Dzhulgakov, Dmytro, Mallevich, Andrey, Cherniavskii, Ilia, Lu, Yinghai, Krishnamoorthi, Raghuraman, Yu, Ansha, Kondratenko, Volodymyr, Pereira, Stephanie, Chen, Xianjie, Chen, Wenlin, Rao, Vijay, Jia, Bill, Xiong, Liang, Smelyanskiy, Misha
With the advent of deep learning, neural network-based recommendation models have emerged as an important tool for tackling personalization and recommendation tasks. These networks differ significantly from other deep learning networks due to their…
External link:
http://arxiv.org/abs/1906.00091
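A toy forward pass showing the architectural shape such models share: a bottom MLP for dense features, embedding lookups for categorical features, pairwise dot-product feature interactions, and a top layer producing a click probability. All weights are random placeholders; the configuration is illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
d, num_dense, num_sparse = 8, 4, 3
tables = [rng.normal(size=(100, d)) for _ in range(num_sparse)]  # embedding tables
W_bottom = rng.normal(size=(num_dense, d))                       # bottom MLP weights
w_top = rng.normal(size=d + (num_sparse + 1) * num_sparse // 2)  # top layer weights

def dlrm_forward(dense_x, sparse_ids):
    bottom = np.tanh(dense_x @ W_bottom)            # embed the dense features
    feats = [bottom] + [t[i] for t, i in zip(tables, sparse_ids)]
    inter = [feats[a] @ feats[b]                    # dot-product feature interactions
             for a in range(len(feats)) for b in range(a + 1, len(feats))]
    z = np.concatenate([bottom, np.array(inter)])
    return 1.0 / (1.0 + np.exp(-z @ w_top))         # predicted click probability

print(dlrm_forward(rng.normal(size=num_dense), [3, 17, 42]))
```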
Author:
Bollapragada, Raghu, Mudigere, Dheevatsa, Nocedal, Jorge, Shi, Hao-Jun Michael, Tang, Ping Tak Peter
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function…
External link:
http://arxiv.org/abs/1802.05374
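One standard way to keep gradient approximations from being noise-dominated is to enlarge the sample used to form them until a variance test passes. The test below is an illustrative criterion in the spirit of progressive-batching methods, not necessarily the paper's exact one.

```python
import numpy as np

def batch_is_large_enough(per_sample_grads, theta=0.9):
    """Variance test: trust the batch gradient g only if the sampling
    variance of the mean is small relative to ||g||, so the resulting
    search direction is a descent direction with high probability."""
    g = per_sample_grads.mean(axis=0)
    var_of_mean = per_sample_grads.var(axis=0, ddof=1).sum() / len(per_sample_grads)
    return var_of_mean <= (theta * np.linalg.norm(g)) ** 2

rng = np.random.default_rng(0)
grads = rng.normal(loc=1.0, scale=0.5, size=(32, 10))  # 32 per-sample gradients
if batch_is_large_enough(grads):
    print("take the L-BFGS step with the current batch")
else:
    print("grow the batch before the next step")
```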
This monograph presents a class of algorithms called coordinate descent algorithms for mathematicians, statisticians, and engineers outside the field of optimization. This particular class of algorithms has recently gained popularity due to their…
External link:
http://arxiv.org/abs/1610.00040
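The defining move of this class is to minimize over one coordinate at a time while all others stay fixed. A minimal sketch on linear least squares, where each one-dimensional subproblem has a closed form:

```python
import numpy as np

def coordinate_descent_lstsq(A, b, num_epochs=100):
    """Cyclic coordinate descent for min_x 0.5 * ||A x - b||^2."""
    x = np.zeros(A.shape[1])
    r = A @ x - b                             # maintained residual A x - b
    for _ in range(num_epochs):
        for i in range(x.size):
            a_i = A[:, i]
            step = -(a_i @ r) / (a_i @ a_i)   # exact 1-D minimizer in x_i
            x[i] += step
            r += step * a_i                   # O(m) residual update
    return x

rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
x_cd = coordinate_descent_lstsq(A, b)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.abs(x_cd - x_ls).max())              # agrees with the direct solver
```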
This letter is focused on quantized Compressed Sensing, assuming that Lasso is used for signal estimation. Leveraging recent work, we provide a framework to optimize the quantization function and show that the recovered signal converges to the actual…
External link:
http://arxiv.org/abs/1606.03055
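An end-to-end sketch of the pipeline the abstract describes: take compressive measurements of a sparse signal, quantize them, and recover with the Lasso. Plain ISTA solves the Lasso here, and a uniform scalar quantizer stands in for the optimized quantization function studied in the paper.

```python
import numpy as np

def ista_lasso(A, y, lam, num_iters=500):
    """ISTA for the Lasso: min_x 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        z = x - A.T @ (A @ x - y) / L         # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

rng = np.random.default_rng(0)
n, m, delta = 200, 80, 0.05                    # dimension, measurements, bin width
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 1.0  # 5-sparse signal
A = rng.normal(size=(m, n)) / np.sqrt(m)
y_q = delta * np.round(A @ x_true / delta)     # uniformly quantized measurements
print(np.linalg.norm(ista_lasso(A, y_q, lam=0.01) - x_true))  # small recovery error
```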
We propose two practical non-convex approaches for learning near-isometric, linear embeddings of finite sets of data points. Given a set of training points $\mathcal{X}$, we consider the secant set $S(\mathcal{X})$ that consists of all pairwise differences…
External link:
http://arxiv.org/abs/1601.00062
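The secant set and the near-isometry requirement translate directly into code: build all normalized pairwise differences of the training points, then ask how far a linear map moves their norms from 1. The sketch below only measures the distortion of a random projection; the paper's contribution, optimizing the embedding itself, is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 100))                 # 50 training points in R^100
iu = np.triu_indices(len(X), k=1)              # one index pair per point pair
secants = (X[:, None, :] - X[None, :, :])[iu]  # secant set: pairwise differences
secants /= np.linalg.norm(secants, axis=1, keepdims=True)

P = rng.normal(size=(20, 100)) / np.sqrt(20)   # random linear embedding to R^20
distortion = np.abs(np.linalg.norm(secants @ P.T, axis=1) - 1.0)
print(distortion.max())                        # worst-case secant distortion
```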