Showing 1 - 10 of 393 for search: '"Kolter J"'
Published in:
NeurIPS 2024
Despite their strong performance on many generative tasks, diffusion models require a large number of sampling steps in order to generate realistic samples. This has motivated the community to develop effective methods to distill pre-trained diffusion models …
External link:
http://arxiv.org/abs/2410.16794
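For context on the cost being distilled away, a minimal sketch of DDPM-style ancestral sampling (illustrative only, not the paper's method; `denoiser` is a hypothetical noise-prediction network): each of the, e.g., 1000 steps needs a full forward pass, which is exactly what distillation tries to shrink to a handful of calls.

    import torch

    def ddpm_sample(denoiser, shape, num_steps=1000):
        # denoiser(x_t, t) is assumed to predict the noise added at step t
        betas = torch.linspace(1e-4, 0.02, num_steps)
        alphas = 1.0 - betas
        alpha_bars = torch.cumprod(alphas, dim=0)
        x = torch.randn(shape)                              # start from pure noise x_T
        for t in reversed(range(num_steps)):                # one network call per step
            t_batch = torch.full((shape[0],), t)
            eps = denoiser(x, t_batch)                      # predicted noise
            coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
            mean = (x - coef * eps) / torch.sqrt(alphas[t])
            noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
            x = mean + torch.sqrt(betas[t]) * noise         # ancestral sampling update
        return x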
Counterfactual explanations have been a popular method of post-hoc explainability for a variety of settings in Machine Learning. Such methods focus on explaining classifiers by generating new data points that are similar to a given reference, while …
External link:
http://arxiv.org/abs/2410.14522
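As background for what such methods optimize, a minimal sketch of classic gradient-based counterfactual search in the style of Wachter et al. (an illustrative baseline, not the method studied above; `classifier` is a hypothetical model returning logits): perturb a reference point until the prediction flips to a target class, while penalizing distance to the reference.

    import torch
    import torch.nn.functional as F

    def counterfactual(classifier, x_ref, target_class, steps=500, lr=0.05, dist_weight=0.1):
        # classifier(x) is assumed to return a (1, num_classes) tensor of logits
        x_cf = x_ref.clone().detach().requires_grad_(True)
        optimizer = torch.optim.Adam([x_cf], lr=lr)
        target = torch.tensor([target_class])
        for _ in range(steps):
            optimizer.zero_grad()
            logits = classifier(x_cf.unsqueeze(0))
            pred_loss = F.cross_entropy(logits, target)            # push toward the target class
            dist_loss = dist_weight * (x_cf - x_ref).abs().sum()   # stay close to the reference
            (pred_loss + dist_loss).backward()
            optimizer.step()
        return x_cf.detach()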
The composition of pretraining data is a key determinant of foundation models' performance, but there is no standard guideline for allocating a limited computational budget across different data sources. Most current approaches either rely on extensive …
External link:
http://arxiv.org/abs/2410.11820
Recent work has shown that state space models such as Mamba are significantly worse than Transformers on recall-based tasks because their state size is constant with respect to input sequence length. But in practice, state space models …
External link:
http://arxiv.org/abs/2410.11135
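A back-of-the-envelope sketch of the size argument (illustrative model dimensions, not taken from the paper): a Transformer's key-value cache grows linearly with sequence length, while an SSM keeps a fixed-size recurrent state, which bounds how much it can store for later recall.

    def kv_cache_floats(seq_len, n_layers, n_heads, head_dim):
        # a Transformer stores keys and values for every past token in every layer
        return 2 * seq_len * n_layers * n_heads * head_dim

    def ssm_state_floats(n_layers, d_model, state_dim):
        # an SSM keeps one fixed-size recurrent state per layer, regardless of seq_len
        return n_layers * d_model * state_dim

    for seq_len in (1_000, 10_000, 100_000):
        kv = kv_cache_floats(seq_len, n_layers=32, n_heads=32, head_dim=128)
        ssm = ssm_state_floats(n_layers=32, d_model=4096, state_dim=16)
        print(f"seq_len={seq_len:>7}: KV cache {kv:,} floats, SSM state {ssm:,} floats")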
A standard practice when using large language models is for users to supplement their instruction with an input context containing new information for the model to process. However, models struggle to reliably follow the input context, especially when …
External link:
http://arxiv.org/abs/2410.10796
Vision-language models (VLMs) such as CLIP are trained via contrastive learning between text and image pairs, resulting in aligned image and text embeddings that are useful for many downstream tasks. A notable drawback of CLIP, however, is that the …
External link:
http://arxiv.org/abs/2409.09721
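For readers unfamiliar with the training setup the entry refers to, a minimal sketch of the symmetric CLIP-style contrastive objective over a batch of matched pairs (toy random embeddings for illustration; not the paper's code):

    import torch
    import torch.nn.functional as F

    def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
        # image_emb, text_emb: (batch, dim) embeddings of matched image/text pairs
        image_emb = F.normalize(image_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        logits = image_emb @ text_emb.t() / temperature     # pairwise cosine similarities
        targets = torch.arange(logits.size(0))              # i-th image matches i-th caption
        loss_i2t = F.cross_entropy(logits, targets)         # image -> text direction
        loss_t2i = F.cross_entropy(logits.t(), targets)     # text -> image direction
        return 0.5 * (loss_i2t + loss_t2i)

    # usage with random toy embeddings
    loss = clip_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))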
Transformer architectures have become a dominant paradigm for domains like language modeling but suffer in many inference settings due to their quadratic-time self-attention. Recently proposed subquadratic architectures, such as Mamba, have shown promise …
External link:
http://arxiv.org/abs/2408.10189
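A minimal sketch of where the quadratic cost comes from (illustrative single-head attention, not the paper's code): the score matrix has one entry per pair of positions, so time and memory grow with the square of the sequence length.

    import torch

    def naive_self_attention(q, k, v):
        # q, k, v: (seq_len, dim); materializes the full (seq_len, seq_len) score matrix
        scores = q @ k.t() / q.size(-1) ** 0.5
        weights = torch.softmax(scores, dim=-1)
        return weights @ v

    seq_len, dim = 4096, 64
    out = naive_self_attention(torch.randn(seq_len, dim),
                               torch.randn(seq_len, dim),
                               torch.randn(seq_len, dim))
    print(f"score matrix holds {seq_len * seq_len:,} entries for {seq_len} tokens")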
Recovering natural language prompts for image generation models, solely based on the generated images, is a difficult discrete optimization problem. In this work, we present the first head-to-head comparison of recent discrete optimization techniques …
External link:
http://arxiv.org/abs/2408.06502
The widespread use of large language models has resulted in a multitude of tokenizers and embedding spaces, making knowledge transfer in prompt discovery tasks difficult. In this work, we propose FUSE (Flexible Unification of Semantic Embeddings), an …
External link:
http://arxiv.org/abs/2408.04816
Consistency models (CMs) offer faster sampling than traditional diffusion models, but their training is resource-intensive. For example, as of 2024, training a state-of-the-art CM on CIFAR-10 takes one week on 8 GPUs. In this work, we propose an effective …
External link:
http://arxiv.org/abs/2406.14548
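As a reminder of what makes CMs attractive despite the training cost, a minimal sketch of one-step sampling (hypothetical `consistency_fn` and illustrative `sigma_max`; not the paper's code): a single network call maps terminal noise directly to a sample, versus the many iterations of a standard diffusion sampler.

    import torch

    def one_step_sample(consistency_fn, shape, sigma_max=80.0):
        # consistency_fn(x_t, sigma) is assumed to map any noisy input directly to a clean sample
        x_T = torch.randn(shape) * sigma_max              # draw from the terminal noise level
        sigma = torch.full((shape[0],), sigma_max)
        return consistency_fn(x_T, sigma)                 # a single forward pass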