Showing 1 - 10 of 23 for search: '"Haochen, Jeff Z."'
Despite recent theoretical progress on the non-convex optimization of two-layer neural networks, it is still an open question whether gradient descent on neural networks without unnatural modifications can achieve better sample complexity than kernel methods…
External link: http://arxiv.org/abs/2306.16361
Authors: Zhang, Yuhui; Yasunaga, Michihiro; Zhou, Zhengping; HaoChen, Jeff Z.; Zou, James; Liang, Percy; Yeung, Serena
Language models have been shown to exhibit positive scaling, where performance improves as models are scaled up in terms of size, compute, or data. In this work, we introduce NeQA, a dataset consisting of questions with negation in which language models…
External link: http://arxiv.org/abs/2305.17311
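The snippet cuts off mid-sentence, but the evaluation idea is simple enough to sketch. Below is a hypothetical toy probe, not the actual NeQA pipeline: the questions and the stub model are invented, and they only illustrate how pairing questions with label-flipped negations exposes a model that ignores negation.

```python
# Toy sketch (hypothetical; not the actual NeQA construction): pair each
# yes/no question with a negated variant whose gold label flips, then score
# a model on the two halves separately to surface negation failures.

original = [("Is water wet?", True), ("Can penguins fly?", False)]
negated = [("Is water not wet?", False), ("Can penguins not fly?", True)]

def accuracy(model, items):
    """Fraction of questions the model answers correctly."""
    return sum(model(q) == label for q, label in items) / len(items)

def naive_model(question: str) -> bool:
    # Stand-in for a real language-model call: it ignores "not" entirely
    # and answers the underlying positive question.
    return "wet" in question

print("original:", accuracy(naive_model, original))  # 1.0
print("negated: ", accuracy(naive_model, negated))   # 0.0
```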
Authors: Zhang, Yuhui; HaoChen, Jeff Z.; Huang, Shih-Cheng; Wang, Kuan-Chieh; Zou, James; Yeung, Serena
Recent multi-modal contrastive learning models have demonstrated the ability to learn an embedding space suitable for building strong vision classifiers, by leveraging the rich information in large-scale image-caption datasets. Our work highlights a…
External link: http://arxiv.org/abs/2302.04269
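A minimal sketch of the classifier-building recipe the abstract alludes to: classify images by nearest class prompt in a shared image-text embedding space. A real multi-modal model would supply the two encoders; here random unit vectors stand in for encoder outputs, and the prompts are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 64
class_prompts = ["a photo of a cat", "a photo of a dog", "a photo of a bird"]

# Random stand-ins for text-encoder and image-encoder outputs.
text_emb = F.normalize(torch.randn(len(class_prompts), dim), dim=-1)  # (C, d)
image_emb = F.normalize(torch.randn(5, dim), dim=-1)                  # (B, d)

# Assign each image the class whose prompt embedding is most similar
# (cosine similarity, since both sides are unit-normalized).
logits = image_emb @ text_emb.T  # (B, C)
for i, p in enumerate(logits.argmax(dim=-1).tolist()):
    print(f"image {i} -> {class_prompts[p]}")
```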
Authors: HaoChen, Jeff Z.; Ma, Tengyu
Understanding self-supervised learning is important but challenging. Previous theoretical works study the role of pretraining losses, and view neural networks as general black boxes. However, the recent work of Saunshi et al. argues that the model architecture…
External link: http://arxiv.org/abs/2211.14699
Contrastive learning is a highly effective method for learning representations from unlabeled data. Recent works show that contrastive representations can transfer across domains, leading to simple state-of-the-art algorithms for unsupervised domain adaptation…
External link: http://arxiv.org/abs/2204.02683
Authors: Shen, Kendrick; Jones, Robbie; Kumar, Ananya; Xie, Sang Michael; HaoChen, Jeff Z.; Ma, Tengyu; Liang, Percy
We consider unsupervised domain adaptation (UDA), where labeled data from a source domain (e.g., photographs) and unlabeled data from a target domain (e.g., sketches) are used to learn a classifier for the target domain. Conventional UDA methods (e.g., …)
External link: http://arxiv.org/abs/2204.00570
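A minimal sketch of the UDA setup the abstract describes: labeled source data plus unlabeled target data, with the goal of a classifier that works on the target. The tensors are random stand-ins, and the linear model and shapes are illustrative; only the source-only baseline is shown, with UDA methods additionally exploiting the unlabeled target data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d, num_classes = 32, 4
xs = torch.randn(256, d)                    # source inputs (e.g., photos)
ys = torch.randint(0, num_classes, (256,))  # source labels
xt = torch.randn(256, d)                    # target inputs (e.g., sketches), unlabeled

clf = nn.Linear(d, num_classes)
opt = torch.optim.SGD(clf.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Source-only baseline: train on the labeled source domain alone. UDA methods
# additionally use xt, e.g., by aligning the domains or pretraining
# representations on both.
for _ in range(100):
    opt.zero_grad()
    loss_fn(clf(xs), ys).backward()
    opt.step()

target_preds = clf(xt).argmax(dim=-1)  # what we would score against target labels
```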
We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). We first interpret various existing neural network optimizers as approximate stochastic proximal point methods…
External link: http://arxiv.org/abs/2203.00089
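A sketch of one approximate stochastic proximal point step, the object the abstract says common optimizers approximate: minimize the current-batch loss plus a penalty for moving far from the current weights. The quadratic penalty, the inner gradient solver, and the least-squares loss are all illustrative choices, not APO itself.

```python
import torch

torch.manual_seed(0)
x, y = torch.randn(32, 10), torch.randn(32)
w0 = torch.zeros(10)

def batch_loss(w):
    # Least-squares loss on the current mini-batch (illustrative).
    return ((x @ w - y) ** 2).mean()

def proximal_step(w_old, lam=10.0, inner_steps=50, lr=0.05):
    # Approximately solve argmin_w batch_loss(w) + (lam/2) * ||w - w_old||^2
    # with a few inner gradient steps. A very large lam recovers a small
    # plain gradient step, one sense in which SGD approximates this update.
    w = w_old.clone().requires_grad_(True)
    for _ in range(inner_steps):
        obj = batch_loss(w) + 0.5 * lam * (w - w_old).pow(2).sum()
        (g,) = torch.autograd.grad(obj, w)
        w = (w - lr * g).detach().requires_grad_(True)
    return w.detach()

w1 = proximal_step(w0)
print(batch_loss(w0).item(), "->", batch_loss(w1).item())
```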
Self-supervised learning (SSL) is a scalable way to learn general visual representations since it learns without labels. However, large-scale unlabeled datasets in the wild often have long-tailed label distributions, where we know little about the behavior…
External link: http://arxiv.org/abs/2110.05025
Recent works in self-supervised learning have advanced the state-of-the-art by relying on the contrastive learning paradigm, which learns representations by pushing positive pairs, or similar examples from the same class, closer together while keeping negative pairs apart…
External link: http://arxiv.org/abs/2106.04156
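A minimal sketch of a contrastive objective of the kind described above: two augmented "views" of each example form a positive pair, and every other example in the batch acts as a negative. This is an InfoNCE-style loss, not the specific loss studied in the linked paper; the embeddings are random stand-ins for an encoder, and the temperature is an illustrative choice.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, dim, tau = 8, 16, 0.1
z1 = F.normalize(torch.randn(batch, dim), dim=-1)  # view-1 embeddings
z2 = F.normalize(torch.randn(batch, dim), dim=-1)  # view-2 embeddings

logits = z1 @ z2.T / tau                # (B, B) pairwise similarities
labels = torch.arange(batch)            # positive pairs sit on the diagonal
loss = F.cross_entropy(logits, labels)  # pulls positives together, pushes negatives apart
print(loss.item())
```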
Recent works found that fine-tuning and joint training (two popular approaches for transfer learning) do not always improve accuracy on downstream tasks. First, we aim to understand more about when and why fine-tuning and joint training can be suboptimal…
External link: http://arxiv.org/abs/2011.01418
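A schematic contrast of the two recipes named above, on random stand-in data: fine-tuning updates the model with the downstream loss alone, while joint training optimizes source and downstream losses together through a shared backbone. The shapes, heads, and hyperparameters are illustrative, not the paper's setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d, h = 16, 32
backbone = nn.Sequential(nn.Linear(d, h), nn.ReLU())  # shared feature extractor
src_head, tgt_head = nn.Linear(h, 5), nn.Linear(h, 3)
loss_fn = nn.CrossEntropyLoss()

xs, ys = torch.randn(64, d), torch.randint(0, 5, (64,))  # source task
xt, yt = torch.randn(64, d), torch.randint(0, 3, (64,))  # downstream task

params = (list(backbone.parameters()) + list(src_head.parameters())
          + list(tgt_head.parameters()))
opt = torch.optim.SGD(params, lr=0.1)

for _ in range(20):
    opt.zero_grad()
    # Joint training: both losses drive the shared backbone.
    loss = loss_fn(src_head(backbone(xs)), ys) + loss_fn(tgt_head(backbone(xt)), yt)
    # Fine-tuning would instead use only: loss_fn(tgt_head(backbone(xt)), yt)
    loss.backward()
    opt.step()
```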