Showing 1 - 10 of 33,819 for search: '"Chen Min"'
Author:
Dong, Xin, Fu, Yonggan, Diao, Shizhe, Byeon, Wonmin, Chen, Zijia, Mahabaleshwarkar, Ameya Sunil, Liu, Shih-Yang, Van Keirsbilck, Matthijs, Chen, Min-Hung, Suhara, Yoshi, Lin, Yingyan, Kautz, Jan, Molchanov, Pavlo
We propose Hymba, a family of small language models featuring a hybrid-head parallel architecture that integrates transformer attention mechanisms with state space models (SSMs) for enhanced efficiency. Attention heads provide high-resolution recall,
External link:
http://arxiv.org/abs/2411.13676
Author:
Liu, Shih-Yang, Yang, Huck, Wang, Chien-Yi, Fung, Nai Chit, Yin, Hongxu, Sakr, Charbel, Muralidharan, Saurav, Cheng, Kwang-Ting, Kautz, Jan, Wang, Yu-Chiang Frank, Molchanov, Pavlo, Chen, Min-Hung
In this work, we re-formulate the model compression problem into the customized compensation problem: Given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized requirements from users
External link:
http://arxiv.org/abs/2410.21271
Author:
Du, Linkang, Zhou, Xuanru, Chen, Min, Zhang, Chusong, Su, Zhou, Cheng, Peng, Chen, Jiming, Zhang, Zhikun
As the implementation of machine learning (ML) systems becomes more widespread, especially with the introduction of larger ML models, we perceive a spring demand for massive data. However, it inevitably causes infringement and misuse problems with th
External link:
http://arxiv.org/abs/2410.16618
Reinforcement learning (RL) has emerged as a pivotal technique for fine-tuning large language models (LLMs) on specific tasks. However, prevailing RL fine-tuning methods predominantly rely on PPO and its variants. Though these algorithms are effectiv
External link:
http://arxiv.org/abs/2410.06101
Author:
Jin, Yuanzhe, Chen, Min
Adversarial attacks are major threats to the deployment of machine learning (ML) models in many applications. Testing ML models against such attacks is becoming an essential step for evaluating and improving ML models. In this paper, we report the de
External link:
http://arxiv.org/abs/2410.05334
In developing machine learning (ML) models for text classification, one common challenge is that the collected data is often not ideally distributed, especially when new classes are introduced in response to changes of data and tasks. In this paper,
External link:
http://arxiv.org/abs/2409.15848
Efficient communication can enhance the overall performance of collaborative multi-agent reinforcement learning. A common approach is to share observations through full communication, leading to significant communication overhead. Existing work attem
External link:
http://arxiv.org/abs/2409.07127
There is a great need to accurately predict short-term precipitation, which has socioeconomic effects such as agriculture and disaster prevention. Recently, the forecasting models have employed multi-source data as the multi-modality input, thus impr
External link:
http://arxiv.org/abs/2409.06732
Author:
Faure, Gueter Josmy, Yeh, Jia-Fong, Chen, Min-Hung, Su, Hung-Ting, Lai, Shang-Hong, Hsu, Winston H.
Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic conce
External link:
http://arxiv.org/abs/2408.17443