Showing 1 - 10 of 152 for search: '"Liu, Songhua"'
In this paper, we introduce OminiControl, a highly versatile and parameter-efficient framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models. At its core, OminiControl leverages a parameter reuse mechanism, enab
External link:
http://arxiv.org/abs/2411.15098
Dataset distillation or condensation refers to compressing a large-scale dataset into a much smaller one, enabling models trained on this synthetic dataset to generalize effectively on real data. Tackling this challenge, as defined, relies on a bi-le
External link:
http://arxiv.org/abs/2410.07579
Modern diffusion models, particularly those utilizing a Transformer-based UNet for denoising, rely heavily on self-attention operations to manage complex spatial relationships, thus achieving impressive generation performance. However, this existing
External link:
http://arxiv.org/abs/2409.02097
Dataset distillation or condensation aims to condense a large-scale training dataset into a much smaller synthetic one such that the training performance of distilled and original sets on neural networks are similar. Although the number of training s
External link:
http://arxiv.org/abs/2408.08201
Diffusion models have recently achieved remarkable results for video generation. Despite the encouraging performances, the generated videos are typically constrained to a small number of frames, resulting in clips lasting merely a few seconds. The pr
External link:
http://arxiv.org/abs/2406.16260
Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain, making it a highly ambitious and challenging task. State-of-the-art approaches have mostly relied on data augmenta
External link:
http://arxiv.org/abs/2406.00275
The proliferation of large-scale AI models trained on extensive datasets has revolutionized machine learning. With these models taking on increasingly central roles in various applications, the need to understand their behavior and enhance interpreta
External link:
http://arxiv.org/abs/2404.14006
Brain decoding, a pivotal field in neuroscience, aims to reconstruct stimuli from acquired brain signals, primarily utilizing functional magnetic resonance imaging (fMRI). Currently, brain decoding is confined to a per-subject-per-model paradigm, lim
External link:
http://arxiv.org/abs/2404.07850
Adversarial attacks constitute a notable threat to machine learning systems, given their potential to induce erroneous predictions and classifications. However, within real-world contexts, the essential specifics of the deployed model are frequently
External link:
http://arxiv.org/abs/2312.12768
Vision Transformer has demonstrated impressive success across various vision tasks. However, its heavy computation cost, which grows quadratically with respect to the token sequence length, largely limits its power in handling large feature maps. To
External link:
http://arxiv.org/abs/2308.12216