Showing 1 - 10 of 435,396 results for search: '"BRYAN, A. A."'
As of December 2024, the ARC-AGI benchmark is five years old and remains unbeaten. We believe it is currently the most important unsolved AI benchmark in the world because it seeks to measure generalization on novel tasks -- the essence of intelligen
External link:
http://arxiv.org/abs/2412.04604
Multi-Source Domain Generalization (DG) is the task of training on multiple source domains and achieving high classification performance on unseen target domains. Recent methods combine robust features from web-scale pretrained backbones with new fea
External link:
http://arxiv.org/abs/2412.02856
Author:
Su, Dan, Kong, Kezhi, Lin, Ying, Jennings, Joseph, Norick, Brandon, Kliegl, Markus, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan
Recent English Common Crawl datasets like FineWeb-Edu and DCLM achieved significant benchmark gains via aggressive model-based filtering, but at the cost of removing 90% of data. This limits their suitability for long token horizon training, such as
External link:
http://arxiv.org/abs/2412.02595
Author:
Abello, Hans Matthew, Badiola, Maxine Beatriz, Custer, Mark John, Fausto, Lorane Bernadeth, Leonida, Patrick Josh, Yongco, Denzel Bryan, Deja, Jordan Aiko
Published in:
Proceedings of CHIRP 2024: Transforming HCI Research in the Philippines Workshop
Push notifications are brief messages that users frequently encounter in their daily lives. However, the volume of notifications can lead to information overload, making it challenging for users to engage effectively. This study investigates how noti
External link:
http://arxiv.org/abs/2412.00531
Despite inheriting security measures from underlying language models, Vision-Language Models (VLMs) may still be vulnerable to safety alignment issues. Through empirical analysis, we uncover two critical findings: scenario-matched images can signific
External link:
http://arxiv.org/abs/2411.18000
Question answering represents a core capability of large language models (LLMs). However, when individuals encounter unfamiliar knowledge in texts, they often formulate questions that the text itself cannot answer due to insufficient understanding of
External link:
http://arxiv.org/abs/2411.17993
Author:
Chen, Ziyang, Seetharaman, Prem, Russell, Bryan, Nieto, Oriol, Bourgin, David, Owens, Andrew, Salamon, Justin
Generating sound effects for videos often requires creating artistic sound effects that diverge significantly from real-life sources and flexible control in the sound design. To address this problem, we introduce MultiFoley, a model designed for vide
External link:
http://arxiv.org/abs/2411.17698
Diffusion models have achieved impressive results in generative tasks like text-to-image (T2I) and text-to-video (T2V) synthesis. However, achieving accurate text alignment in T2V generation remains challenging due to the complex temporal dependency
External link:
http://arxiv.org/abs/2411.17041
Author:
Tasnim, Nazia, Plummer, Bryan A.
Incremental learning aims to adapt to new sets of categories over time with minimal computational overhead. Prior work often addresses this task by training efficient task-specific adaptors that modify frozen layer weights or features to capture rele
External link:
http://arxiv.org/abs/2411.16870
Modern sequence models (e.g., Transformers, linear RNNs, etc.) emerged as dominant backbones of recent deep learning frameworks, mainly due to their efficiency, representational power, and/or ability to capture long-range dependencies. Adopting these
External link:
http://arxiv.org/abs/2411.15671