Showing 1 - 10 of 121,575 results for search: '"Chae IS"'
Author:
Agarwal, Amit, Panda, Srikant, Charles, Angeline, Kumar, Bhargava, Patel, Hitesh, Pattnayak, Priyanranjan, Rafi, Taki Hasan, Kumar, Tejaswini, Chae, Dong-Kyu
Recent advancements in Vision-Language Models (VLMs) have enabled significant progress in complex video understanding tasks. However, their robustness to real-world manipulations remains underexplored, limiting their reliability in critical applicati
External link:
http://arxiv.org/abs/2412.19794
Modern vision models excel at general purpose downstream tasks. It is unclear, however, how they may be used for personalized vision tasks, which are both fine-grained and data-scarce. Recent works have successfully applied synthetic data to general-
External link:
http://arxiv.org/abs/2412.16156
Author:
Kim, Namhyun, Han, Juntaek, Choi, Jinseok, Alkhateeb, Ahmed, Chae, Chan-Byoung, Park, Jeonghun
In this paper, we propose a precoding framework for frequency division duplex (FDD) integrated sensing and communication (ISAC) systems with multiple-input multiple-output (MIMO). Specifically, we aim to maximize ergodic sum spectral efficiency (SE)
External link:
http://arxiv.org/abs/2412.12590
Recent studies show that pretrained vision models can boost performance in audio downstream tasks. To enhance the performance further, an additional pretraining stage with large scale audio data is typically required to infuse audio specific knowledg
External link:
http://arxiv.org/abs/2412.05951
SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning
Domain generalization (DG) aims to adapt a model using one or multiple source domains to ensure robust performance in unseen target domains. Recently, Parameter-Efficient Fine-Tuning (PEFT) of foundation models has shown promising results in the cont
External link:
http://arxiv.org/abs/2412.04077
Author:
Wu, Tuo, Zhi, Kangda, Yao, Junteng, Lai, Xiazhi, Zheng, Jianchao, Niu, Hong, Elkashlan, Maged, Wong, Kai-Kit, Chae, Chan-Byoung, Ding, Zhiguo, Karagiannidis, George K., Debbah, Merouane, Yuen, Chau
Fluid antenna system (FAS), as a new version of reconfigurable antenna technologies promoting shape and position flexibility, has emerged as an exciting and possibly transformative technology for wireless communications systems. FAS represents any sof
External link:
http://arxiv.org/abs/2412.03839
Despite significant advances in vision-language understanding, implementing image segmentation within multimodal architectures remains a fundamental challenge in modern artificial intelligence systems. Existing vision-language models, which primarily
External link:
http://arxiv.org/abs/2412.02565
With the rapid advancement of diffusion-based generative models, portrait image animation has achieved remarkable results. However, it still faces challenges in temporally consistent video generation and fast sampling due to its iterative sampling na
External link:
http://arxiv.org/abs/2412.01064
Recent advances in multimodal models have demonstrated impressive capabilities in object recognition and scene understanding. However, these models often struggle with precise spatial localization - a critical capability for real-world applications.
External link:
http://arxiv.org/abs/2411.18270
Lyrics generation presents unique challenges, particularly in achieving precise syllable control while adhering to song form structures such as verses and choruses. Conventional line-by-line approaches often lead to unnatural phrasing, underscoring t
External link:
http://arxiv.org/abs/2411.13100