Showing 1 - 10 of 60 for search: '"Ash, Jordan"'
Author:
Huang, Audrey, Block, Adam, Foster, Dylan J., Rohatgi, Dhruv, Zhang, Cyril, Simchowitz, Max, Ash, Jordan T., Krishnamurthy, Akshay
Recent work in language modeling has raised the possibility of self-improvement, where a language model evaluates and refines its own generations to achieve higher performance without external feedback. It is impossible for this self-improvement to…
External link:
http://arxiv.org/abs/2412.01951
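The self-improvement loop sketched in this abstract can be illustrated with a toy best-of-n procedure: sample several candidate generations and keep the one the model itself scores highest, with no external feedback. This is only an illustrative sketch; `generate` and `self_score` below are hypothetical stand-ins, not the paper's method.

```python
# Toy best-of-n self-improvement: the model proposes candidates and keeps
# the one it scores most highly itself, with no external feedback.
# `generate` and `self_score` are hypothetical placeholders.
import random

rng = random.Random(0)

def generate(prompt: str) -> str:
    """Hypothetical sampler: one candidate completion per call."""
    return f"{prompt} [candidate {rng.randint(0, 999)}]"

def self_score(prompt: str, completion: str) -> float:
    """Hypothetical self-evaluation, e.g. the model's own log-likelihood
    of the completion given the prompt (here: a random placeholder)."""
    return rng.random()

def self_improve(prompt: str, n: int = 8) -> str:
    """Sample n candidates and keep the one the model scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: self_score(prompt, c))

print(self_improve("Explain why the sky is blue."))
```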
Author:
Juliani, Arthur, Ash, Jordan T.
Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new data…
External link:
http://arxiv.org/abs/2405.19153
Author:
Bhatt, Gantavya, Chen, Yifang, Das, Arnav M., Zhang, Jifan, Truong, Sang T., Mussmann, Stephen, Zhu, Yinglun, Bilmes, Jeffrey, Du, Simon S., Jamieson, Kevin, Ash, Jordan T., Nowak, Robert D.
Supervised finetuning (SFT) on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities observed in modern large language models (LLMs). However, the annotation efforts required to produce high quality…
External link:
http://arxiv.org/abs/2401.06692
Transformer-based Large Language Models (LLMs) have become a fixture in modern machine learning. Correspondingly, significant resources are allocated towards research that aims to further advance this technology, typically resulting in models of increasing…
External link:
http://arxiv.org/abs/2312.13558
Why do large language models sometimes output factual inaccuracies and exhibit erroneous reasoning? The brittleness of these models, particularly when executing long chains of reasoning, currently seems to be an inevitable price to pay for their advanced…
External link:
http://arxiv.org/abs/2306.00946
Active learning is perhaps most naturally posed as an online learning problem. However, prior active learning approaches with deep neural networks assume offline access to the entire dataset ahead of time. This paper proposes VeSSAL, a new algorithm…
External link:
http://arxiv.org/abs/2303.02535
Models that can actively seek out the best quality training data hold the promise of more accurate, adaptable, and efficient machine learning. Active learning techniques tend to prefer examples that are the most difficult to classify. While this…
External link:
http://arxiv.org/abs/2211.00928
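The "prefer the hardest examples" heuristic mentioned in this abstract is commonly instantiated as uncertainty sampling: query the unlabeled points whose predicted class distribution has the highest entropy. A minimal sketch of that standard heuristic (illustrative, not the paper's algorithm):

```python
# Uncertainty sampling: pick the unlabeled examples the model is least
# sure about, measured by the entropy of its predicted class distribution.
import numpy as np

def entropy_select(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k unlabeled examples with highest predictive entropy.

    probs: (n_unlabeled, n_classes) softmax outputs of the current model.
    """
    eps = 1e-12                               # avoid log(0)
    ent = -(probs * np.log(probs + eps)).sum(axis=1)
    return np.argsort(-ent)[:k]               # most uncertain first

# Example: query the 2 most ambiguous of 4 unlabeled points.
p = np.array([[0.98, 0.01, 0.01],
              [0.34, 0.33, 0.33],
              [0.90, 0.05, 0.05],
              [0.50, 0.49, 0.01]])
print(entropy_select(p, k=2))  # -> [1 3]: the near-uniform rows
```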
This work introduces the Eigen Memory Tree (EMT), a novel online memory model for sequential learning scenarios. EMTs store data at the leaves of a binary tree and route new samples through the structure using the principal components of previous experiences…
External link:
http://arxiv.org/abs/2210.14077
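The routing idea the abstract describes can be sketched as follows, assuming internal nodes split on the top principal component of the data that has passed through them, with samples stored at the leaves. This is a toy reconstruction inferred from the abstract, not the paper's implementation:

```python
# Toy PCA-routed binary memory tree: leaves store points; when a leaf
# overflows it splits along its top principal component, and subsequent
# samples are routed left/right by their projection onto that direction.
import numpy as np

class Node:
    def __init__(self, capacity: int = 8):
        self.points, self.capacity = [], capacity
        self.direction = None            # top PC once the node splits
        self.threshold = 0.0
        self.left = self.right = None

    def insert(self, x: np.ndarray) -> None:
        if self.direction is None:       # still a leaf: store here
            self.points.append(x)
            if len(self.points) > self.capacity:
                self._split()
        else:                            # internal: route by projection
            child = self.left if x @ self.direction <= self.threshold else self.right
            child.insert(x)

    def _split(self) -> None:
        X = np.stack(self.points)
        Xc = X - X.mean(axis=0)
        # Top right singular vector of centered data = first principal component.
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        self.direction = vt[0]
        proj = X @ self.direction
        self.threshold = np.median(proj)  # balanced split at the median
        self.left, self.right = Node(self.capacity), Node(self.capacity)
        for p, s in zip(self.points, proj):
            (self.left if s <= self.threshold else self.right).insert(p)
        self.points = []

root = Node(capacity=4)
for x in np.random.default_rng(0).normal(size=(50, 3)):
    root.insert(x)
```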
Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer…
External link:
http://arxiv.org/abs/2210.10749
Author:
Saunshi, Nikunj, Ash, Jordan, Goel, Surbhi, Misra, Dipendra, Zhang, Cyril, Arora, Sanjeev, Kakade, Sham, Krishnamurthy, Akshay
Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs. Recent attempts to theoretically explain the…
External link:
http://arxiv.org/abs/2202.14037
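The objective this abstract describes is commonly instantiated as an InfoNCE-style loss: embeddings of two views of the same input are pulled together while the other inputs in the batch act as negatives. A minimal NumPy sketch of that common instantiation (not necessarily the exact setup analyzed in the paper):

```python
# InfoNCE-style contrastive loss: for each input, the matched pair of
# views (diagonal of the similarity matrix) should dominate a softmax
# over all in-batch pairings.
import numpy as np

def info_nce(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.1) -> float:
    """z1[i] and z2[i] are L2-normalized embeddings of two views of input i."""
    sim = (z1 @ z2.T) / temperature           # (n, n) cosine similarities
    sim -= sim.max(axis=1, keepdims=True)     # stabilize the softmax
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))       # matched pairs on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z1 = z / np.linalg.norm(z, axis=1, keepdims=True)   # view 1
z2 = z + 0.05 * rng.normal(size=z.shape)            # slightly perturbed view 2
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
print(info_nce(z1, z2))   # low loss: each view is closest to its partner
```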