Showing 1 - 10 of 53 for search: '"Tolstikhin, Ilya"'
Learning curves plot the expected error of a learning algorithm as a function of the number of labeled samples it receives from a target distribution. They are widely used as a measure of an algorithm's performance, but classic PAC learning theory cannot explain their behavior. …
External link:
http://arxiv.org/abs/2208.14615
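As a concrete rendering of the definition in this snippet (the notation below is mine, not the paper's): for a learning algorithm A and a fixed target distribution D, the learning curve maps the sample size n to the expected error of the hypothesis A outputs.

```latex
% Learning curve of algorithm A on distribution D (assumed notation)
\varepsilon_A(n) \;=\; \mathbb{E}_{S \sim D^n}\!\left[ \operatorname{err}_D\big(A(S)\big) \right]
```

One way to read the snippet's closing claim: classic PAC bounds are worst-case over distributions, while a learning curve is tied to a single fixed D, so distribution-free theory need not predict its shape.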
We introduce a generalization of the lottery ticket hypothesis in which the notion of "sparsity" is relaxed by choosing an arbitrary basis in the space of parameters. We present evidence that the original results reported for the canonical basis continue to hold. …
External link:
http://arxiv.org/abs/2107.06825
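A minimal formalization of the relaxation described above, inferred from the snippet rather than quoted from the paper: classical lottery-ticket sparsity zeroes coordinates of the weight vector w itself, while the generalized version fixes an arbitrary basis U and sparsifies the coefficients of w in that basis.

```latex
% Sparsity in an arbitrary basis (assumed notation); U = I recovers canonical pruning
w = U c \quad \text{with} \quad \|c\|_0 \le k
```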
Author:
Tolstikhin, Ilya, Houlsby, Neil, Kolesnikov, Alexander, Beyer, Lucas, Zhai, Xiaohua, Unterthiner, Thomas, Yung, Jessica, Steiner, Andreas, Keysers, Daniel, Uszkoreit, Jakob, Lucic, Mario, Dosovitskiy, Alexey
Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them is necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). …
External link:
http://arxiv.org/abs/2105.01601
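arXiv:2105.01601 is the MLP-Mixer paper. A minimal NumPy sketch of its central idea, alternating a token-mixing MLP (applied across patches) with a channel-mixing MLP (applied across channels); the random stand-in weights, hidden sizes, and simplified GELU are my choices, not the reference implementation:

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    # Normalize over the last axis.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def mlp(x, d_hidden):
    # Two-layer MLP along the last axis; random weights stand in for learned ones.
    d = x.shape[-1]
    w1 = np.random.randn(d, d_hidden) * 0.02
    w2 = np.random.randn(d_hidden, d) * 0.02
    return gelu(x @ w1) @ w2

def mixer_block(x, d_token=64, d_channel=128):
    # x: (num_patches, channels)
    # Token mixing: transpose so the MLP mixes information across patches.
    x = x + mlp(layer_norm(x).T, d_token).T
    # Channel mixing: the MLP mixes information across channels, per patch.
    x = x + mlp(layer_norm(x), d_channel)
    return x

patches = np.random.randn(196, 512)  # e.g. 14x14 image patches, 512 channels
print(mixer_block(patches).shape)    # (196, 512)
```

The only operation that moves information between patches is the transpose around the first MLP, which is what lets the architecture do without convolutions or attention.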
Author:
Maennel, Hartmut, Alabdulmohsin, Ibrahim, Tolstikhin, Ilya, Baldock, Robert J. N., Bousquet, Olivier, Gelly, Sylvain, Keysers, Daniel
We study deep neural networks (DNNs) trained on natural image data with entirely random labels. Despite its popularity in the literature, where it is often used to study memorization, generalization, and other phenomena, little is known about what DNNs learn in this setting. …
External link:
http://arxiv.org/abs/2006.10455
We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights, without evaluating it on input data. We motivate this task and introduce a formal setting for it. Even when using simple statistics of the weights …
External link:
http://arxiv.org/abs/2002.11448
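A hedged sketch of the task this snippet sets up, not the paper's actual predictor: describe each trained network by a few statistics of its flattened weights and fit an off-the-shelf regressor to predict test accuracy. Everything below (features, model choice, data) is a synthetic placeholder.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def weight_features(weight_arrays):
    # Simple per-network features: moments and quantiles of the flattened weights.
    w = np.concatenate([a.ravel() for a in weight_arrays])
    return [w.mean(), w.std(), np.abs(w).mean(),
            np.percentile(w, 25), np.percentile(w, 75)]

rng = np.random.default_rng(0)
# Hypothetical collection: 200 "networks", each a list of random weight tensors,
# paired with a made-up accuracy; real inputs would come from actual training runs.
nets = [[rng.normal(0, s, size=(32, 32)) for _ in range(3)]
        for s in rng.uniform(0.01, 0.5, size=200)]
accs = rng.uniform(0.5, 0.95, size=200)

X = np.array([weight_features(n) for n in nets])
model = GradientBoostingRegressor().fit(X, accs)
print(model.predict(X[:3]))
```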
Author:
Göpfert, Christina, Ben-David, Shai, Bousquet, Olivier, Gelly, Sylvain, Tolstikhin, Ilya, Urner, Ruth
Published in:
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1500-1518, 2019
In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to …
External link:
http://arxiv.org/abs/1905.11866
Published in:
33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning. Most works study this problem under very weak assumptions, in which case it is provably hard. We consider the case of stronger structural assumptions that are commonly satisfied in modern machine learning. …
External link:
http://arxiv.org/abs/1905.11112
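For reference, the quantity being estimated; this is the standard definition of an f-divergence, not anything specific to this paper. For a convex f with f(1) = 0:

```latex
% f-divergence between P and Q (standard definition)
D_f(P \,\|\, Q) \;=\; \int f\!\left(\frac{dP}{dQ}\right) dQ,
\qquad f(t) = t \log t \;\Rightarrow\; D_f = \mathrm{KL}(P \,\|\, Q).
```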
Author:
Rojas-Carulla, Mateo, Tolstikhin, Ilya, Luque, Guillermo, Youngblut, Nicholas, Ley, Ruth, Schölkopf, Bernhard
We introduce GeNet, a method for shotgun metagenomic classification from raw DNA sequences that exploits the known hierarchical structure between labels for training. We provide a comparison with state-of-the-art methods Kraken and Centrifuge on data …
External link:
http://arxiv.org/abs/1901.11015
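One natural way to "exploit the known hierarchical structure between labels", offered as my reading of the snippet rather than GeNet's verified architecture: share an encoder across taxonomic ranks, attach one classification head per rank, and train on the sum of per-rank cross-entropies.

```python
import numpy as np

def cross_entropy(logits, label):
    # Softmax cross-entropy for a single example.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def hierarchical_loss(features, heads, labels):
    # heads: one weight matrix per taxonomic rank (coarse -> fine);
    # labels: the true class index at each rank. Hypothetical setup, not GeNet's code.
    return sum(cross_entropy(features @ W, y) for W, y in zip(heads, labels))

feat = np.random.randn(128)
heads = [np.random.randn(128, k) for k in (40, 300, 5000)]  # e.g. phylum -> species
print(hierarchical_loss(feat, heads, labels=(3, 17, 1024)))
```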
Author:
Locatello, Francesco, Vincent, Damien, Tolstikhin, Ilya, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard
A common assumption in causal modeling posits that the data is generated by a set of independent mechanisms, and algorithms should aim to recover this structure. Standard unsupervised learning, however, is often concerned with training a single model to capture the overall distribution. …
External link:
http://arxiv.org/abs/1804.11130
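A minimal formal reading of the independent-mechanisms assumption (notation mine): the data distribution decomposes into K components, each produced by its own mechanism, and recovering the structure means recovering the components rather than fitting one monolithic model.

```latex
% Mixture of K independent mechanisms (assumed notation)
p(x) \;=\; \sum_{k=1}^{K} \pi_k \, p_k(x), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```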
We study the role of latent space dimensionality in Wasserstein auto-encoders (WAEs). Through experimentation on synthetic and real datasets, we argue that random encoders should be preferred over deterministic encoders. We highlight the potential of …
External link:
http://arxiv.org/abs/1802.03761
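To make the deterministic-versus-random distinction concrete, a toy sketch under my own linear-Gaussian parameterization, not the paper's models: a deterministic encoder maps each input to a single latent code, while a random encoder parameterizes a distribution over codes and samples from it.

```python
import numpy as np

rng = np.random.default_rng(0)
W_mu, W_sigma = rng.normal(size=(8, 2)), rng.normal(size=(8, 2))  # toy "encoder" weights

def deterministic_encoder(x):
    # One code per input.
    return x @ W_mu

def random_encoder(x):
    # A Gaussian over codes per input; sampling injects noise into the latent space.
    mu, log_sigma = x @ W_mu, x @ W_sigma
    return mu + np.exp(log_sigma) * rng.normal(size=mu.shape)

x = rng.normal(size=(4, 8))        # 4 inputs, 2-dimensional latent space
print(deterministic_encoder(x))
print(random_encoder(x))           # differs run to run for the same x
```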