Showing 1 - 10 of 261 for search: '"Mozer, Michael C."'
We explore the training dynamics of neural networks in a structured non-IID setting where documents are presented cyclically in a fixed, repeated sequence. Typically, networks suffer from catastrophic interference when training on a sequence of docum
External link:
http://arxiv.org/abs/2403.09613
Deep-learning models can extract a rich assortment of features from data. Which features a model uses depends not only on predictivity -- how reliably a feature indicates training-set labels -- but also on availability -- how easily the
External link:
http://arxiv.org/abs/2310.16228
Authors:
Maini, Pratyush, Mozer, Michael C., Sedghi, Hanie, Lipton, Zachary C., Kolter, J. Zico, Zhang, Chiyuan
Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model. Memorization refers to the abi
External link:
http://arxiv.org/abs/2307.09542
The aim of object-centric vision is to construct an explicit representation of the objects in a scene. This representation is obtained via a set of interchangeable modules called slots or object files that compete for local patches of a
External link:
http://arxiv.org/abs/2305.19550
Recent works demonstrate that early layers in a neural network contain useful information for prediction. Inspired by this, we show that extending temperature scaling across all layers improves both calibration and accuracy. We call this procedure "l
External link:
http://arxiv.org/abs/2211.10193
Recent research in clustering face embeddings has found that unsupervised, shallow, heuristic-based methods -- including k-means and hierarchical agglomerative clustering -- underperform supervised, deep, inductive methods. While the reported impro
External link:
http://arxiv.org/abs/2211.05183
Authors:
Elsayed, Gamaleldin F., Mahendran, Aravindh, van Steenkiste, Sjoerd, Greff, Klaus, Mozer, Michael C., Kipf, Thomas
The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions. Discovering this compositional structure in dynamic visual scenes has proven challenging for end-to-end computer vision approaches unless exp
External link:
http://arxiv.org/abs/2206.07764
Authors:
Sukumar, Shruthi, Ward, Adrian F., Elliott-Williams, Camden, Hakimi, Shabnam, Mozer, Michael C.
Individuals are often faced with temptations that can lead them astray from long-term goals. We're interested in developing interventions that steer individuals toward making good initial decisions and then maintaining those decisions over time. In t
External link:
http://arxiv.org/abs/2203.05782
Published in:
ICML 2022, Proceedings of the 39th International Conference on Machine Learning
Transfer-learning methods aim to improve performance in a data-scarce target domain using a model pretrained on a data-rich source domain. A cost-efficient strategy, linear probing, involves freezing the source model and training a new classification
External link:
http://arxiv.org/abs/2201.03529
Real-world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform di
External link:
http://arxiv.org/abs/2109.05675