Showing 1 - 10 of 65 for search: '"Dahl, George E."'
Author:
Dahl, George E., Schneider, Frank, Nado, Zachary, Agarwal, Naman, Sastry, Chandramouli Shama, Hennig, Philipp, Medapati, Sourabh, Eschenhagen, Runa, Kasimbeg, Priya, Suo, Daniel, Bae, Juhan, Gilmer, Justin, Peirson, Abel L., Khan, Bilal, Anil, Rohan, Rabbat, Mike, Krishnan, Shankar, Snider, Daniel, Amid, Ehsan, Chen, Kongtao, Maddison, Chris J., Vasudev, Rakshith, Badura, Michal, Garg, Ankush, Mattson, Peter
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules) …
External link:
http://arxiv.org/abs/2306.07179
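The snippet's examples of training-algorithm improvements (update rules, tuning protocols, learning rate schedules) are all concrete, pluggable pieces of a training pipeline. As a minimal illustration of the last one, here is a warmup-plus-cosine schedule sketch; all constants are hypothetical, and this is not code from the paper:

```python
import math

def lr_schedule(step, base_lr=0.1, warmup_steps=500, total_steps=10_000):
    """Linear warmup followed by cosine decay to zero.

    Constants are hypothetical; schedules like this are one of the
    tunable components the abstract refers to.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```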
Author:
Cohen, Jeremy M., Ghorbani, Behrooz, Krishnan, Shankar, Agarwal, Naman, Medapati, Sourabh, Badura, Michal, Suo, Daniel, Cardoze, David, Nado, Zachary, Dahl, George E., Gilmer, Justin
Very little is known about the training dynamics of adaptive gradient methods like Adam in deep learning. In this paper, we shed light on the behavior of these algorithms in the full-batch and sufficiently large batch settings. Specifically, we empirically …
External link:
http://arxiv.org/abs/2207.14484
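For reference, the update rule in question is below; in the full-batch setting the snippet describes, `grad` is computed over the entire training set rather than a sampled mini-batch. This is the standard textbook Adam step, not code from the paper:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba). In the full-batch regime,
    `grad` is the exact gradient of the training loss at `w`."""
    m = b1 * m + (1 - b1) * grad          # first-moment EMA
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - b1 ** t)             # bias correction, t >= 1
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```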
Author:
Wang, Zi, Dahl, George E., Swersky, Kevin, Lee, Chansoo, Mariet, Zelda, Nado, Zachary, Gilmer, Justin, Snoek, Jasper, Ghahramani, Zoubin
Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions …
External link:
http://arxiv.org/abs/2207.03084
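The snippet's point about domain knowledge is easiest to see in a bare-bones BO loop: the Gaussian-process prior, i.e. the kernel and its hand-picked length scale below, is exactly where that knowledge enters (and what a pre-trained prior would replace). A minimal sketch under those assumptions, with a made-up objective; this is not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, length=0.2):
    # Squared-exponential kernel; `length` encodes the domain knowledge.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(x_tr, y_tr, x_q, noise=1e-6):
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = rbf(x_tr, x_q)
    mu = Ks.T @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

f = lambda x: -(x - 0.3) ** 2            # hypothetical expensive objective
grid = np.linspace(0.0, 1.0, 200)        # candidate pool
x_obs, y_obs = np.array([0.9]), np.array([f(0.9)])
for _ in range(10):
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y_obs.max()))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))
print(x_obs[np.argmax(y_obs)])           # should approach 0.3
```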
Author:
Gomes, Ryan G., Vwalika, Bellington, Lee, Chace, Willis, Angelica, Sieniek, Marcin, Price, Joan T., Chen, Christina, Kasaro, Margaret P., Taylor, James A., Stringer, Elizabeth M., McKinney, Scott Mayer, Sindano, Ntazana, Dahl, George E., Goodnight III, William, Gilmer, Justin, Chi, Benjamin H., Lau, Charles, Spitz, Terry, Saensuksopa, T, Liu, Kris, Wong, Jonny, Pilgrim, Rory, Uddin, Akib, Corrado, Greg, Peng, Lily, Chou, Katherine, Tse, Daniel, Stringer, Jeffrey S. A., Shetty, Shravya
Despite considerable progress in maternal healthcare, maternal and perinatal deaths remain high in low-to-middle income countries. Fetal ultrasound is an important component of antenatal care, but shortage of adequately trained healthcare workers has …
External link:
http://arxiv.org/abs/2203.10139
Author:
Ariafar, Setareh, Gilmer, Justin, Nado, Zachary, Snoek, Jasper, Jenatton, Rodolphe, Dahl, George E.
Black box optimization requires specifying a search space to explore for solutions, e.g. a d-dimensional compact space, and this choice is critical for getting the best results at a reasonable budget. Unfortunately, determining a high quality search space …
External link:
http://arxiv.org/abs/2112.08250
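A toy version of the snippet's claim (hypothetical objective, not from the paper): with a fixed evaluation budget, random search inside a well-chosen box finds a far better point than the same search inside an overly wide one, so the search space itself is a hyperparameter worth getting right.

```python
import numpy as np

rng = np.random.default_rng(0)
objective = lambda x: -(x - 0.02) ** 2   # optimum at x = 0.02 (made up)

def random_search(low, high, budget=50):
    xs = rng.uniform(low, high, size=budget)
    return objective(xs).max()

print(random_search(0.0, 0.1))     # tight box: near-optimal best value
print(random_search(-10.0, 10.0))  # loose box, same budget: much worse
```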
Author:
Wang, Zi, Dahl, George E., Swersky, Kevin, Lee, Chansoo, Nado, Zachary, Gilmer, Justin, Snoek, Jasper, Ghahramani, Zoubin
Published in:
Journal of Machine Learning Research, 25(212):1-83, 2024. URL http://jmlr.org/papers/v25/23-0269.html
Bayesian optimization (BO) has become a popular strategy for global optimization of expensive real-world functions. Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge about those functions …
External link:
http://arxiv.org/abs/2109.08215
Author:
Bowman, Samuel R., Dahl, George E.
Evaluation for many natural language understanding (NLU) tasks is broken: Unreliable and biased systems score so highly on standard benchmarks that there is little room for researchers who develop better systems to demonstrate their improvements. …
External link:
http://arxiv.org/abs/2104.02145
Recently, the LARS and LAMB optimizers have been proposed for training neural networks faster using large batch sizes. LARS and LAMB add layer-wise normalization to the update rules of Heavy-ball momentum and Adam, respectively, and have become popular …
External link:
http://arxiv.org/abs/2102.06356
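The layer-wise normalization this snippet describes can be sketched in a few lines: the base optimizer's update for each layer (Heavy-ball momentum for LARS, Adam for LAMB) is rescaled by a per-layer "trust ratio" before being applied. This is a schematic of the shared idea only, omitting each method's exact weight-decay and clipping details:

```python
import numpy as np

def layerwise_normalized_step(w, update, lr=0.01, eps=1e-8):
    """Rescale a base optimizer's `update` by the per-layer trust ratio
    ||w|| / ||update||, so every layer moves a comparable relative amount."""
    trust = np.linalg.norm(w) / (np.linalg.norm(update) + eps)
    return w - lr * trust * update

# Applied independently to each layer's parameter tensor:
params = [np.ones((4, 4)), np.ones(4)]               # toy "network"
updates = [0.1 * np.ones((4, 4)), 0.5 * np.ones(4)]  # base-optimizer steps
params = [layerwise_normalized_step(w, u) for w, u in zip(params, updates)]
```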
Author:
Choi, Dami, Shallue, Christopher J., Nado, Zachary, Lee, Jaehoon, Maddison, Chris J., Dahl, George E.
Selecting an optimizer is a central step in the contemporary deep learning pipeline. In this paper, we demonstrate the sensitivity of optimizer comparisons to the hyperparameter tuning protocol. Our findings suggest that the hyperparameter search space …
External link:
http://arxiv.org/abs/1910.05446
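The sensitivity becomes concrete once the tuning protocol is written down: an optimizer's "best" result is a function of the search space and budget handed to the tuner, so rankings can change when those change. A toy sketch; the training function is a made-up stand-in, not an experiment from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def tune(train_fn, space, budget=20):
    """Random-search tuning: sample hyperparameters from `space` and
    return the best validation score seen within `budget` trials."""
    best = -np.inf
    for _ in range(budget):
        hp = {k: rng.uniform(*bounds) for k, bounds in space.items()}
        best = max(best, train_fn(**hp))
    return best

# Hypothetical stand-in for "train a model, return validation score":
val_score = lambda log_lr, log_eps: -(log_lr + 2.5) ** 2 - 0.1 * (log_eps + 8) ** 2

narrow = {"log_lr": (-3.0, -2.0), "log_eps": (-9.0, -7.0)}
wide = {"log_lr": (-6.0, -1.0), "log_eps": (-12.0, -4.0)}
# Same optimizer, same budget, different search spaces, different verdicts:
print(tune(val_score, narrow), tune(val_score, wide))
```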
In the twilight of Moore's law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators.
External link:
http://arxiv.org/abs/1907.05550
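The linked paper (arXiv 1907.05550, "Faster Neural Network Training with Data Echoing") addresses this by reusing the output of the slow upstream stages. A minimal sketch of the echoing idea, illustrative rather than the authors' implementation:

```python
def echo(batches, factor=2):
    """Data echoing: yield each batch from the (slow, CPU-bound) input
    pipeline `factor` times so the accelerator stays busy while disk I/O
    and preprocessing catch up."""
    for batch in batches:
        for _ in range(factor):
            yield batch

# Stand-in for a preprocessed-batch stream feeding a training loop:
for step, batch in enumerate(echo(iter([b"b0", b"b1", b"b2"]), factor=2)):
    print(step, batch)  # each batch is consumed twice before the next
```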