Showing 1 - 10 of 87 for the search: '"Alemi, Alexander"'
Author:
Everett, Katie, Xiao, Lechao, Wortsman, Mitchell, Alemi, Alexander A., Novak, Roman, Liu, Peter J., Gur, Izzeddin, Sohl-Dickstein, Jascha, Kaelbling, Leslie Pack, Lee, Jaehoon, Pennington, Jeffrey
Robust and effective scaling of models from small to large width typically requires the precise adjustment of many algorithmic and architectural details, such as parameterization and optimizer choices. In this work, we propose a new perspective on parameterization …
External link:
http://arxiv.org/abs/2407.05872
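To make the parameterization question concrete, here is a minimal Python sketch of one common width-scaling recipe (a muP-style choice used purely for illustration; the base width, base learning rate, and 1/width rule are assumptions, not this paper's prescription):

```python
import math

def scaled_hyperparams(width, base_width=256, base_lr=1e-3):
    """One common width-scaling recipe (an illustrative sketch, not the
    paper's result): initialize hidden weights with std ~ 1/sqrt(fan_in)
    and shrink the hidden-layer Adam learning rate as 1/width relative
    to a tuned base width (a muP-style rule)."""
    init_std = 1.0 / math.sqrt(width)           # weight variance ~ 1/fan_in
    hidden_lr = base_lr * (base_width / width)  # lr decays as width grows
    return init_std, hidden_lr

for w in (256, 1024, 4096):
    print(w, scaled_hyperparams(w))
```

The entry above concerns exactly such coupled choices: parameterization and optimizer rules interact, so scaling reliably means coordinating them rather than tuning each in isolation.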
State-of-the-art neural networks require extreme computational power to train. It is therefore natural to wonder whether they are optimally trained. Here we apply a recent advancement in stochastic thermodynamics which allows bounding the speed at which …
External link:
http://arxiv.org/abs/2307.14653
Author:
Alemi, Alexander A., Poole, Ben
Bayesian inference offers benefits over maximum likelihood, but it also comes with computational costs. Computing the posterior is typically intractable, as is marginalizing that posterior to form the posterior predictive distribution. In this paper, …
External link:
http://arxiv.org/abs/2307.07568
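For context on the marginalization this snippet calls intractable, here is a minimal Monte Carlo sketch of a posterior predictive, assuming a toy 1-D Gaussian model; all names and numbers are illustrative:

```python
import numpy as np

# Hypothetical posterior samples for a 1-D Gaussian mean (assumption:
# the posterior over the mean is itself Gaussian with known moments).
rng = np.random.default_rng(0)
theta_samples = rng.normal(0.3, 0.5, size=1000)

def likelihood(y, theta, noise_std=1.0):
    """p(y | theta): Gaussian observation model."""
    return np.exp(-0.5 * ((y - theta) / noise_std) ** 2) / (noise_std * np.sqrt(2 * np.pi))

# Posterior predictive p(y | D) ~ (1/S) sum_s p(y | theta_s): the
# marginalization over the posterior, approximated by Monte Carlo.
y_grid = np.linspace(-3.0, 3.0, 7)
predictive = np.array([likelihood(y, theta_samples).mean() for y in y_grid])
print(predictive)
```

In this toy case the average over posterior samples stands in for the integral of p(y|θ) against p(θ|D), which is exactly the step that becomes expensive for real models.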
Author:
Ruan, Yangjun, Singh, Saurabh, Morningstar, Warren, Alemi, Alexander A., Ioffe, Sergey, Fischer, Ian, Dillon, Joshua V.
Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art …
External link:
http://arxiv.org/abs/2211.09981
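As background for this entry, a minimal sketch of the basic supervised ensembling it starts from: averaging the predictive distributions of several models. The models here are hypothetical stand-ins returning fixed class probabilities:

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the predictive distributions of several models
    (the simplest form of ensembling)."""
    probs = np.stack([m(x) for m in models], axis=0)
    return probs.mean(axis=0)

# Toy usage: three 'models' that ignore x and return fixed distributions.
models = [lambda x, p=p: np.array(p) for p in
          ([0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.5, 0.4, 0.1])]
print(ensemble_predict(models, x=None))  # averaged class probabilities
```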
In this work we investigate and demonstrate the benefits of a Bayesian approach to imitation learning from multiple sensor inputs, as applied to the task of opening office doors with a mobile manipulator. Augmenting policies with additional sensor inputs …
External link:
http://arxiv.org/abs/2202.07600
We study the adversarial robustness of information bottleneck models for classification. Previous works showed that the robustness of models trained with information bottlenecks can improve upon adversarial training. Our evaluation under a diverse range …
External link:
http://arxiv.org/abs/2107.05712
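For a concrete sense of what such evaluations involve, here is a minimal sketch of FGSM, one standard white-box attack; the toy logistic model and epsilon are assumptions for illustration, not this paper's setup:

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.03):
    """Fast Gradient Sign Method: perturb the input by eps in the
    direction that increases the loss, then clip to the valid range."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# Toy differentiable model: logistic regression, where the input gradient
# of the cross-entropy loss has the closed form grad_x = (sigmoid(w.x) - y) * w.
w = np.array([1.5, -2.0, 0.5])
x, y = np.array([0.2, 0.8, 0.5]), 1.0
p = 1.0 / (1.0 + np.exp(-(w @ x)))
grad_x = (p - y) * w
print(x, "->", fgsm_perturb(x, grad_x))
```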
Author:
Stanton, Samuel, Izmailov, Pavel, Kirichenko, Polina, Alemi, Alexander A., Wilson, Andrew Gordon
Knowledge distillation is a popular technique for training a small student network to emulate a larger teacher model, such as an ensemble of networks. We show that while knowledge distillation can improve student generalization, it does not typically …
External link:
http://arxiv.org/abs/2106.05945
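As a reference point, the standard Hinton-style distillation term a student is typically trained with; this is a generic sketch (temperature and logits illustrative), not necessarily the exact objective the paper studies:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Temperature-softened cross-entropy between teacher and student
    predictions, scaled by T^2 as in standard knowledge distillation."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -T**2 * np.sum(p_teacher * np.log(p_student + 1e-12))

print(distillation_loss(np.array([2.0, 0.5, -1.0]), np.array([3.0, 0.0, -2.0])))
```

This softened cross-entropy is the usual mechanism by which the student "emulates" the teacher mentioned in the snippet.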
In discriminative settings such as regression and classification there are two random variables at play: the inputs X and the targets Y. Here, we demonstrate that the Variational Information Bottleneck can be viewed as a compromise between fully empirical …
External link:
http://arxiv.org/abs/2011.08711
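For reference, the standard Variational Information Bottleneck objective that this entry reinterprets, with encoder e(z|x), variational decoder q(y|z), and variational marginal r(z):

```latex
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x,y)}\,\mathbb{E}_{e(z\mid x)}\!\left[-\log q(y\mid z)\right]
  + \beta\,\mathbb{E}_{p(x)}\!\left[\mathrm{KL}\!\left(e(z\mid x)\,\big\|\,r(z)\right)\right]
```

The coefficient β trades prediction of Y against compression of X into the representation Z.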
Published in:
International Conference on Artificial Intelligence and Statistics, 8270-8298, (2022)
The Bayesian posterior minimizes the "inferential risk", which itself bounds the "predictive risk". This bound is tight when the likelihood and prior are well-specified. However, since misspecification induces a gap, the Bayesian posterior predictive distribution …
External link:
http://arxiv.org/abs/2010.09629
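The bound mentioned in the snippet follows from Jensen's inequality: with q the posterior, convexity of -log gives

```latex
-\log \mathbb{E}_{\theta \sim q}\!\left[p(y \mid x, \theta)\right]
\;\le\;
\mathbb{E}_{\theta \sim q}\!\left[-\log p(y \mid x, \theta)\right]
```

so, in expectation over the data, the predictive risk (left-hand side) is bounded by the inferential risk (right-hand side), and the Jensen gap between the two is what misspecification widens.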
Author:
Morningstar, Warren R., Ham, Cusuh, Gallagher, Andrew G., Lakshminarayanan, Balaji, Alemi, Alexander A., Dillon, Joshua V.
Perhaps surprisingly, recent studies have shown that probabilistic model likelihoods have poor specificity for out-of-distribution (OOD) detection and often assign higher likelihoods to OOD data than in-distribution data. To ameliorate this issue we propose …
External link:
http://arxiv.org/abs/2006.09273
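A loose sketch of the kind of remedy this line of work pursues: instead of thresholding raw likelihoods (which, as the snippet notes, can rank OOD data higher), fit a density over per-example model statistics computed on in-distribution data and flag examples whose statistics are atypical. The arrays, statistics, and bandwidth below are illustrative assumptions, not the paper's exact method:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Hypothetical per-example model statistics (e.g., log-likelihoods),
# shape (n_examples, n_statistics), computed on in-distribution data.
rng = np.random.default_rng(0)
train_stats = rng.normal(0.0, 1.0, size=(500, 2))
test_stats = np.array([[0.1, -0.2],    # typical statistics -> high density
                       [6.0, 6.0]])    # atypical statistics -> likely OOD

# Fit a density over in-distribution statistics; score test examples by
# their log-density under it (lower score => more OOD-like).
kde = KernelDensity(bandwidth=0.5).fit(train_stats)
print(kde.score_samples(test_stats))
```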