Showing 1 - 10 of 65 for search: '"Dahl, George E."'
Author:
Dahl, George E., Schneider, Frank, Nado, Zachary, Agarwal, Naman, Sastry, Chandramouli Shama, Hennig, Philipp, Medapati, Sourabh, Eschenhagen, Runa, Kasimbeg, Priya, Suo, Daniel, Bae, Juhan, Gilmer, Justin, Peirson, Abel L., Khan, Bilal, Anil, Rohan, Rabbat, Mike, Krishnan, Shankar, Snider, Daniel, Amid, Ehsan, Chen, Kongtao, Maddison, Chris J., Vasudev, Rakshith, Badura, Michal, Garg, Ankush, Mattson, Peter
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules) …
External link:
http://arxiv.org/abs/2306.07179
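The snippet's examples of training-algorithm improvements (update rules, tuning protocols, learning rate schedules) are all concrete, pluggable pieces of a training pipeline. As a minimal illustration of the last one, here is a warmup-plus-cosine schedule sketch; all constants are hypothetical, and this is not code from the paper:

```python
import math

def lr_schedule(step, base_lr=0.1, warmup_steps=500, total_steps=10_000):
    """Linear warmup followed by cosine decay to zero.

    Constants are hypothetical; schedules like this are one of the
    tunable components the abstract refers to.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```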
Author:
Cohen, Jeremy M., Ghorbani, Behrooz, Krishnan, Shankar, Agarwal, Naman, Medapati, Sourabh, Badura, Michal, Suo, Daniel, Cardoze, David, Nado, Zachary, Dahl, George E., Gilmer, Justin
Very little is known about the training dynamics of adaptive gradient methods like Adam in deep learning. In this paper, we shed light on the behavior of these algorithms in the full-batch and sufficiently large batch settings. Specifically, we empirically …
External link:
http://arxiv.org/abs/2207.14484
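For reference, the update rule in question is below; in the full-batch setting the snippet describes, `grad` is computed over the entire training set rather than a sampled mini-batch. This is the standard textbook Adam step, not code from the paper:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba). In the full-batch regime,
    `grad` is the exact gradient of the training loss at `w`."""
    m = b1 * m + (1 - b1) * grad          # first-moment EMA
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - b1 ** t)             # bias correction, t >= 1
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```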
Author:
Wang, Zi, Dahl, George E., Swersky, Kevin, Lee, Chansoo, Mariet, Zelda, Nado, Zachary, Gilmer, Justin, Snoek, Jasper, Ghahramani, Zoubin
Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions …
External link:
http://arxiv.org/abs/2207.03084
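The snippet's point about domain knowledge is easiest to see in a bare-bones BO loop: the Gaussian-process prior, i.e. the kernel and its hand-picked length scale below, is exactly where that knowledge enters (and what a pre-trained prior would replace). A minimal sketch under those assumptions, with a made-up objective; this is not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, length=0.2):
    # Squared-exponential kernel; `length` encodes the domain knowledge.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_tr, y_tr, x_q, noise=1e-6):
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = rbf(x_tr, x_q)
    mu = Ks.T @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

f = lambda x: -(x - 0.3) ** 2            # hypothetical expensive objective
grid = np.linspace(0.0, 1.0, 200)        # candidate pool
x_obs, y_obs = np.array([0.9]), np.array([f(0.9)])
for _ in range(10):
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y_obs.max()))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))
print(x_obs[np.argmax(y_obs)])           # should approach 0.3
```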
Author:
Gomes, Ryan G., Vwalika, Bellington, Lee, Chace, Willis, Angelica, Sieniek, Marcin, Price, Joan T., Chen, Christina, Kasaro, Margaret P., Taylor, James A., Stringer, Elizabeth M., McKinney, Scott Mayer, Sindano, Ntazana, Dahl, George E., Goodnight III, William, Gilmer, Justin, Chi, Benjamin H., Lau, Charles, Spitz, Terry, Saensuksopa, T, Liu, Kris, Wong, Jonny, Pilgrim, Rory, Uddin, Akib, Corrado, Greg, Peng, Lily, Chou, Katherine, Tse, Daniel, Stringer, Jeffrey S. A., Shetty, Shravya
Despite considerable progress in maternal healthcare, maternal and perinatal deaths remain high in low-to-middle income countries. Fetal ultrasound is an important component of antenatal care, but shortage of adequately trained healthcare workers has …
External link:
http://arxiv.org/abs/2203.10139
Author:
Ariafar, Setareh, Gilmer, Justin, Nado, Zachary, Snoek, Jasper, Jenatton, Rodolphe, Dahl, George E.
Black box optimization requires specifying a search space to explore for solutions, e.g. a d-dimensional compact space, and this choice is critical for getting the best results at a reasonable budget. Unfortunately, determining a high quality search space …
External link:
http://arxiv.org/abs/2112.08250
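A toy version of the snippet's claim (hypothetical objective, not from the paper): with a fixed evaluation budget, random search inside a well-chosen box finds a far better point than the same search inside an overly wide one, so the search space itself is a hyperparameter worth getting right.

```python
import numpy as np

rng = np.random.default_rng(0)
objective = lambda x: -(x - 0.02) ** 2   # optimum at x = 0.02 (made up)

def random_search(low, high, budget=50):
    xs = rng.uniform(low, high, size=budget)
    return objective(xs).max()

print(random_search(0.0, 0.1))     # tight box: near-optimal best value
print(random_search(-10.0, 10.0))  # loose box, same budget: much worse
```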
Author:
Wang, Zi, Dahl, George E., Swersky, Kevin, Lee, Chansoo, Nado, Zachary, Gilmer, Justin, Snoek, Jasper, Ghahramani, Zoubin
Published in:
Journal of Machine Learning Research, 25(212):1-83, 2024. URL http://jmlr.org/papers/v25/23-0269.html
Bayesian optimization (BO) has become a popular strategy for global optimization of expensive real-world functions. Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge about those functions …
External link:
http://arxiv.org/abs/2109.08215
Author:
Bowman, Samuel R., Dahl, George E.
Evaluation for many natural language understanding (NLU) tasks is broken: Unreliable and biased systems score so highly on standard benchmarks that there is little room for researchers who develop better systems to demonstrate their improvements. …
External link:
http://arxiv.org/abs/2104.02145
Recently, the LARS and LAMB optimizers have been proposed for training neural networks faster using large batch sizes. LARS and LAMB add layer-wise normalization to the update rules of Heavy-ball momentum and Adam, respectively, and have become popular …
External link:
http://arxiv.org/abs/2102.06356
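The layer-wise normalization this snippet describes can be sketched in a few lines: the base optimizer's update for each layer (Heavy-ball momentum for LARS, Adam for LAMB) is rescaled by a per-layer "trust ratio" before being applied. This is a schematic of the shared idea only, omitting each method's exact weight-decay and clipping details:

```python
import numpy as np

def layerwise_normalized_step(w, update, lr=0.01, eps=1e-8):
    """Rescale a base optimizer's `update` by the per-layer trust ratio
    ||w|| / ||update||, so every layer moves a comparable relative amount."""
    trust = np.linalg.norm(w) / (np.linalg.norm(update) + eps)
    return w - lr * trust * update

# Applied independently to each layer's parameter tensor:
params = [np.ones((4, 4)), np.ones(4)]               # toy "network"
updates = [0.1 * np.ones((4, 4)), 0.5 * np.ones(4)]  # base-optimizer steps
params = [layerwise_normalized_step(w, u) for w, u in zip(params, updates)]
```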
Author:
Choi, Dami, Shallue, Christopher J., Nado, Zachary, Lee, Jaehoon, Maddison, Chris J., Dahl, George E.
Selecting an optimizer is a central step in the contemporary deep learning pipeline. In this paper, we demonstrate the sensitivity of optimizer comparisons to the hyperparameter tuning protocol. Our findings suggest that the hyperparameter search space …
External link:
http://arxiv.org/abs/1910.05446
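The sensitivity becomes concrete once the tuning protocol is written down: an optimizer's "best" result is a function of the search space and budget handed to the tuner, so rankings can change when those change. A toy sketch; the training function is a made-up stand-in, not an experiment from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def tune(train_fn, space, budget=20):
    """Random-search tuning: sample hyperparameters from `space` and
    return the best validation score seen within `budget` trials."""
    best = -np.inf
    for _ in range(budget):
        hp = {k: rng.uniform(*bounds) for k, bounds in space.items()}
        best = max(best, train_fn(**hp))
    return best

# Hypothetical stand-in for "train a model, return validation score":
val_score = lambda log_lr, log_eps: -(log_lr + 2.5) ** 2 - 0.1 * (log_eps + 8) ** 2

narrow = {"log_lr": (-3.0, -2.0), "log_eps": (-9.0, -7.0)}
wide = {"log_lr": (-6.0, -1.0), "log_eps": (-12.0, -4.0)}
# Same optimizer, same budget, different search spaces, different verdicts:
print(tune(val_score, narrow), tune(val_score, wide))
```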
In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators.
External link:
http://arxiv.org/abs/1907.05550
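The linked paper (arXiv 1907.05550, "Faster Neural Network Training with Data Echoing") addresses this by reusing the output of the slow upstream stages. A minimal sketch of the echoing idea, illustrative rather than the authors' implementation:

```python
def echo(batches, factor=2):
    """Data echoing: yield each batch from the (slow, CPU-bound) input
    pipeline `factor` times so the accelerator stays busy while disk I/O
    and preprocessing catch up."""
    for batch in batches:
        for _ in range(factor):
            yield batch

# Stand-in for a preprocessed-batch stream feeding a training loop:
for step, batch in enumerate(echo(iter([b"b0", b"b1", b"b2"]), factor=2)):
    print(step, batch)  # each batch is consumed twice before the next
```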