Showing 1 - 10
of 651
for search: '"Huang, Stephen"'
Author:
Qu, Xingwei, Bai, Yuelin, Ma, Yinghao, Zhou, Ziya, Lo, Ka Man, Liu, Jiaheng, Yuan, Ruibin, Min, Lejun, Liu, Xueling, Zhang, Tianyu, Du, Xinrun, Guo, Shuyue, Liang, Yiming, Li, Yizhi, Wu, Shangda, Zhou, Junting, Zheng, Tianyu, Ma, Ziyang, Han, Fengze, Xue, Wei, Xia, Gus, Benetos, Emmanouil, Yue, Xiang, Lin, Chenghua, Tan, Xu, Huang, Stephen W., Fu, Jie, Zhang, Ge
In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation…
External link:
http://arxiv.org/abs/2404.06393
Author:
Yang, Chen, Li, Junzhuo, Niu, Xinyao, Du, Xinrun, Gao, Songyang, Zhang, Haoran, Chen, Zhaoliang, Qu, Xingwei, Yuan, Ruibin, Li, Yizhi, Liu, Jiaheng, Huang, Stephen W., Yue, Shawn, Fu, Jie, Zhang, Ge
Uncovering early-stage metrics that reflect final model performance is one core principle for large-scale pretraining. The existing scaling law demonstrates the power-law correlation between pretraining loss and training flops, which serves as an important…
External link:
http://arxiv.org/abs/2404.01204
Author:
Qu, Xingwei, Liang, Yiming, Wang, Yucheng, Zheng, Tianyu, Yue, Tommy, Ma, Lei, Huang, Stephen W., Zhang, Jiajun, Shi, Yinan, Lin, Chenghua, Fu, Jie, Zhang, Ge
It has long been assumed that the sheer number of parameters in large language models (LLMs) drives in-context learning (ICL) capabilities, enabling remarkable performance improvements by leveraging task-specific demonstrations. Challenging this hypothesis…
External link:
http://arxiv.org/abs/2403.04233
Author:
Zhuang, Alex, Zhang, Ge, Zheng, Tianyu, Du, Xinrun, Wang, Junjie, Ren, Weiming, Huang, Stephen W., Fu, Jie, Yue, Xiang, Chen, Wenhu
Structured data sources, such as tables, graphs, and databases, are ubiquitous knowledge sources. Despite the demonstrated capabilities of large language models (LLMs) on plain text, their proficiency in interpreting and utilizing structured data remains…
External link:
http://arxiv.org/abs/2402.16671
Drawing upon the intuition that aligning different modalities to the same semantic embedding space would allow models to understand states and actions more easily, we propose a new perspective to the offline reinforcement learning (RL) challenge. More…
External link:
http://arxiv.org/abs/2402.12845
Author:
Shao, Yujie, Yao, Xinrong, Qu, Xingwei, Lin, Chenghua, Wang, Shi, Huang, Stephen W., Zhang, Ge, Fu, Jie
Metaphor is a prominent linguistic device in human language and literature, as it adds color, imagery, and emphasis to enhance effective communication. This paper introduces a large-scale, high-quality annotated Chinese Metaphor Corpus, which comprises…
External link:
http://arxiv.org/abs/2402.13145
Author:
Li, Yizhi, Zhang, Ge, Qu, Xingwei, Li, Jiali, Li, Zhaoqun, Wang, Zekun, Li, Hao, Yuan, Ruibin, Ma, Yinghao, Zhang, Kai, Zhou, Wangchunshu, Liang, Yiming, Zhang, Lei, Ma, Lei, Zhang, Jiajun, Li, Zuowen, Huang, Stephen W., Lin, Chenghua, Fu, Jie
The advancement of large language models (LLMs) has enhanced the ability to generalize across a wide range of unseen natural language processing (NLP) tasks through instruction-following. Yet, their effectiveness often diminishes in low-resource languages…
External link:
http://arxiv.org/abs/2402.13109
Author:
Jin, Yonggang, Zhang, Ge, Zhao, Hao, Zheng, Tianyu, Guo, Jarvi, Xiang, Liuyu, Yue, Shawn, Huang, Stephen W., He, Zhaofeng, Fu, Jie
Developing a generalist agent is a longstanding objective in artificial intelligence. Previous efforts utilizing extensive offline datasets from various tasks demonstrate remarkable performance in multitasking scenarios within Reinforcement Learning.
External link:
http://arxiv.org/abs/2402.04154
Author:
Wang, Zekun Moore, Peng, Zhongyuan, Que, Haoran, Liu, Jiaheng, Zhou, Wangchunshu, Wu, Yuhan, Guo, Hongcheng, Gan, Ruitong, Ni, Zehao, Yang, Jian, Zhang, Man, Zhang, Zhaoxiang, Ouyang, Wanli, Xu, Ke, Huang, Stephen W., Fu, Jie, Peng, Junran
The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and the…
External link:
http://arxiv.org/abs/2310.00746
Despite all recent progress, it is still challenging to edit and manipulate natural images with modern generative models. When using a Generative Adversarial Network (GAN), one major hurdle is the inversion process mapping a real image to its corresponding…
External link:
http://arxiv.org/abs/2309.04907