Showing 1 - 10 of 1,503 results for the search: '"Kwok, A. T."'
Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising approach: a router is learned to select the most suitable LLM for each query. However, … (a toy routing sketch follows the link below)
External link:
http://arxiv.org/abs/2409.19886
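As a hedged illustration of the routing idea in the snippet above, the sketch below scores each query against per-model keyword profiles and dispatches it to the best-matching LLM. The model names, profiles, and scoring rule are assumptions made for this example; they are not the learned router of arXiv:2409.19886, which would typically train a classifier over query representations instead of fixed keywords.

```python
# Toy query router: dispatch each query to one of several (hypothetical) LLMs
# by keyword overlap. Illustrative only; not the paper's learned router.
from collections import Counter

MODEL_PROFILES = {
    "code-llm": {"code", "python", "bug", "function", "compile"},
    "math-llm": {"prove", "integral", "equation", "probability"},
    "chat-llm": {"explain", "summarize", "write", "email"},
}

def route(query: str) -> str:
    """Return the name of the model whose keyword profile best matches the query."""
    tokens = Counter(query.lower().split())
    scores = {name: sum(tokens[w] for w in profile)
              for name, profile in MODEL_PROFILES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "chat-llm"  # fall back to a general model

print(route("fix this python function so it can compile"))  # -> code-llm
print(route("prove that the integral converges"))           # -> math-llm
```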
Author:
Chen, Kai, Gou, Yunhao, Huang, Runhui, Liu, Zhili, Tan, Daxin, Xu, Jing, Wang, Chunwei, Zhu, Yi, Zeng, Yihan, Yang, Kuo, Wang, Dingdong, Xiang, Kun, Li, Haoyuan, Bai, Haoli, Han, Jianhua, Li, Xiaohui, Jin, Weike, Xie, Nian, Zhang, Yu, Kwok, James T., Zhao, Hengshuang, Liang, Xiaodan, Yeung, Dit-Yan, Chen, Xiao, Li, Zhenguo, Zhang, Wei, Liu, Qun, Yao, Jun, Hong, Lanqing, Hou, Lu, Xu, Hang
GPT-4o, an omni-modal model that enables vocal conversations with diverse emotions and tones, marks a milestone for omni-modal foundation models. However, empowering Large Language Models to perceive and generate images, text, and speech end-to-end …
External link:
http://arxiv.org/abs/2409.18042
Author:
Chen, Weiyu, Kwok, James T.
Multi-task learning, which optimizes performance across multiple tasks, is inherently a multi-objective optimization problem. Various algorithms have been developed to provide discrete trade-off solutions on the Pareto front. Recently, continuous Pareto front … (a minimal scalarization sketch follows the link below)
External link:
http://arxiv.org/abs/2407.20734
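As background for the multi-objective view mentioned above, and only as a generic illustration rather than the method of arXiv:2407.20734, the sketch below traces discrete trade-off points between two toy task losses via weighted-sum scalarization. The quadratic losses, the weight grid, and the grid-search "optimizer" are all assumptions for this example.

```python
# Weighted-sum scalarization over two toy task losses.
# Sweeping lambda traces discrete points on the Pareto front.
import numpy as np

def loss1(w):  # task 1 prefers w = 0
    return (w - 0.0) ** 2

def loss2(w):  # task 2 prefers w = 1
    return (w - 1.0) ** 2

def minimize_scalarized(lam, grid=np.linspace(-0.5, 1.5, 2001)):
    """Grid-search minimizer of lam * loss1 + (1 - lam) * loss2."""
    values = lam * loss1(grid) + (1 - lam) * loss2(grid)
    return grid[np.argmin(values)]

for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    w = minimize_scalarized(lam)
    print(f"lambda={lam:.2f}  w*={w:.3f}  loss1={loss1(w):.3f}  loss2={loss2(w):.3f}")
```

Continuous Pareto front methods instead aim to produce a solution for any trade-off preference, rather than a fixed discrete set of weights as swept here.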
Pre-training followed by fine-tuning is widely adopted among practitioners. Performance can be improved by "model soups" (Wortsman et al., 2022), which are built by exploring various hyperparameter configurations. The Learned-Soup, a variant of model soups, significantly … (a toy weight-averaging sketch follows the link below)
External link:
http://arxiv.org/abs/2407.03641
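As a hedged sketch of the model-soup idea referenced above, the snippet below averages the parameters of several fine-tuned copies of the same toy model (a uniform soup). The model class and the "checkpoints" are assumptions for illustration; this is not the Learned-Soup procedure of arXiv:2407.03641, which optimizes the mixing coefficients rather than fixing them uniformly.

```python
# Uniform "model soup": average the state_dicts of models fine-tuned
# with different hyperparameters. Toy linear model, illustrative only.
import torch
import torch.nn as nn

def make_model():
    return nn.Linear(4, 2)

def soup(state_dicts):
    """Average parameters across fine-tuned checkpoints (uniform soup)."""
    avg = {k: torch.zeros_like(v) for k, v in state_dicts[0].items()}
    for sd in state_dicts:
        for k, v in sd.items():
            avg[k] += v / len(state_dicts)
    return avg

# Pretend these are checkpoints from runs with different learning rates.
checkpoints = [make_model().state_dict() for _ in range(3)]

souped = make_model()
souped.load_state_dict(soup(checkpoints))
print(souped.weight.shape)  # torch.Size([2, 4])
```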
Author:
Yang, Hansi, Kwok, James T.
Distributed learning, which does not require gathering training data in a central location, has become increasingly important in the big-data era. In particular, random-walk-based decentralized algorithms are flexible in that they do not need a central … (a toy random-walk update sketch follows the link below)
External link:
http://arxiv.org/abs/2406.13183
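To illustrate the random-walk style of decentralized training mentioned above, the generic sketch below passes a single parameter vector along a random walk over nodes; each visited node applies one gradient step on its own local data. The ring graph, local least-squares data, step size, and walk length are assumptions for this example, not the algorithm of arXiv:2406.13183.

```python
# Random-walk decentralized SGD on a toy least-squares problem.
# A single model vector hops between neighboring nodes; each visited node
# takes one gradient step on its local data. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 5, 3
neighbors = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}  # ring graph

# Each node holds a small local dataset (X_i, y_i) generated from a shared w_true.
w_true = rng.normal(size=dim)
local_data = []
for _ in range(n_nodes):
    X = rng.normal(size=(20, dim))
    y = X @ w_true + 0.01 * rng.normal(size=20)
    local_data.append((X, y))

w = np.zeros(dim)
node = 0
for step in range(2000):
    X, y = local_data[node]
    grad = X.T @ (X @ w - y) / len(y)   # local least-squares gradient
    w -= 0.05 * grad                    # local update at the visited node
    node = rng.choice(neighbors[node])  # walk to a random neighbor

print("estimation error:", np.linalg.norm(w - w_true))
```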
Mixup and its variants form a popular class of data augmentation techniques. Using a random sample pair, mixup generates a new sample by linear interpolation of the inputs and labels. However, generating only a single interpolation may limit its augmentation … (a minimal mixup sketch follows the link below)
External link:
http://arxiv.org/abs/2406.01417
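The snippet above describes standard mixup; the minimal sketch below shows that baseline operation, mixing a random pair of examples and their one-hot labels with a Beta-sampled coefficient. The batch shapes and the Beta parameter are assumptions for illustration, and this is not the multi-interpolation variant proposed in arXiv:2406.01417.

```python
# Standard (single-pair) mixup: interpolate inputs and one-hot labels between
# each example and a randomly permuted partner.
import numpy as np

def mixup(x, y_onehot, alpha=0.2, rng=None):
    """Return mixed inputs and labels; alpha parameterizes the Beta distribution."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix

x = np.random.randn(4, 8)              # toy batch: 4 samples, 8 features
y = np.eye(3)[np.array([0, 2, 1, 0])]  # one-hot labels over 3 classes
x_mix, y_mix = mixup(x, y)
print(x_mix.shape, y_mix.shape)        # (4, 8) (4, 3)
```

Note that this baseline produces exactly one mixed sample per pair, which is the limitation the abstract above alludes to.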
Reinforcement Learning from Human Feedback (RLHF) has been commonly used to align the behaviors of Large Language Models (LLMs) with human preferences. Recently, a popular alternative is Direct Preference Optimization (DPO), which replaces an LLM-based reward model … (a minimal DPO-loss sketch follows the link below)
External link:
http://arxiv.org/abs/2405.21040
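For reference, the standard DPO objective alluded to above scores a preferred response against a dispreferred one using log-probability ratios with respect to a frozen reference policy. The sketch below computes that loss from precomputed sequence log-probabilities; the numbers and the beta value are illustrative, and this is the original DPO loss rather than necessarily the formulation of arXiv:2405.21040.

```python
# DPO loss from precomputed sequence log-probabilities.
# loss = -log sigmoid( beta * [ (log pi(y_w|x) - log pi_ref(y_w|x))
#                             - (log pi(y_l|x) - log pi_ref(y_l|x)) ] )
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Pairwise DPO loss for one (chosen, rejected) response pair."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# Illustrative log-probabilities under the trained policy and the frozen reference.
print(dpo_loss(logp_w=-12.3, logp_l=-15.0, ref_logp_w=-13.0, ref_logp_l=-14.2))
```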
Author:
Liu, Zhili, Gou, Yunhao, Chen, Kai, Hong, Lanqing, Gao, Jiahui, Mi, Fei, Zhang, Yu, Li, Zhenguo, Jiang, Xin, Liu, Qun, Kwok, James T.
As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge. Traditional alignment strategies rely heavily on human intervention, such as Supervised Fine-Tuning …
External link:
http://arxiv.org/abs/2405.00557
Author:
Gou, Yunhao, Chen, Kai, Liu, Zhili, Hong, Lanqing, Xu, Hang, Li, Zhenguo, Yeung, Dit-Yan, Kwok, James T., Zhang, Yu
Multimodal large language models (MLLMs) have shown impressive reasoning abilities. However, they are also more vulnerable to jailbreak attacks than their LLM predecessors. Although they can still detect unsafe responses, we observe that safety …
External link:
http://arxiv.org/abs/2403.09572
Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training. However, when downstream tasks have data distributions different from the pre-training data, the semantically … (a toy masked-reconstruction sketch follows the link below)
External link:
http://arxiv.org/abs/2402.05382
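For context on the MAE pre-training mentioned above, the generic sketch below masks a random subset of image patches and computes a reconstruction loss on the masked patches only. The patching, the stand-in "autoencoder", and the mask ratio are all assumptions for illustration; this is not the task-customization method of arXiv:2402.05382.

```python
# Toy masked-reconstruction step in the spirit of MAE:
# mask random patches, reconstruct them, score loss on masked patches only.
import numpy as np

rng = np.random.default_rng(0)
n_patches, patch_dim, mask_ratio = 16, 12, 0.75

patches = rng.normal(size=(n_patches, patch_dim))  # flattened image patches
n_masked = int(mask_ratio * n_patches)
masked_idx = rng.choice(n_patches, size=n_masked, replace=False)

visible = np.delete(patches, masked_idx, axis=0)

# Stand-in "autoencoder": predict every masked patch as the mean visible patch.
prediction = np.tile(visible.mean(axis=0), (n_masked, 1))

# MAE-style loss: mean squared error computed on masked patches only.
loss = np.mean((prediction - patches[masked_idx]) ** 2)
print(f"masked {n_masked}/{n_patches} patches, reconstruction MSE = {loss:.3f}")
```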