Showing 1 - 10 of 108 for search: '"Tang, Duyu"'
Author:
Zhang, Jiwen, Wu, Jihao, Teng, Yihua, Liao, Minghui, Xu, Nuo, Xiao, Xiao, Wei, Zhongyu, Tang, Duyu
Large language model (LLM) leads to a surge of autonomous GUI agents for smartphone, which completes a task triggered by natural language through predicting a sequence of actions of API. Even though the task highly relies on past actions and visual o
External link:
http://arxiv.org/abs/2403.02713
Author:
Feng, Zhangyin, Hu, Runyi, Liu, Liangxin, Zhang, Fan, Tang, Duyu, Dai, Yong, Feng, Xiaocheng, Li, Jiwei, Qin, Bing, Shi, Shuming
Autoregressive and diffusion models drive the recent breakthroughs on text-to-image generation. Despite their huge success of generating high-realistic images, a common shortcoming of these models is their high inference latency - autoregressive mode
External link:
http://arxiv.org/abs/2312.14988
Author:
Feng, Zhangyin, Dai, Yong, Zhang, Fan, Tang, Duyu, Feng, Xiaocheng, Wu, Shuangzhi, Qin, Bing, Cao, Yunbo, Shi, Shuming
Traditional multitask learning methods basically can only exploit common knowledge in task- or language-wise, which lose either cross-language or cross-task knowledge. This paper proposes a general multilingual multitask model, named SkillNet-X, whic
External link:
http://arxiv.org/abs/2306.16176
Author:
Feng, Zhangyin, Ren, Yuchen, Yu, Xinmiao, Feng, Xiaocheng, Tang, Duyu, Shi, Shuming, Qin, Bing
Diffusion models developed on top of powerful text-to-image generation models like Stable Diffusion achieve remarkable success in visual story generation. However, the best-performing approach considers historically generated results as flattened mem
External link:
http://arxiv.org/abs/2305.16811
Although large-scale video-language pre-training models, which usually build a global alignment between the video and the text, have achieved remarkable progress on various downstream tasks, the idea of adopting fine-grained information during the pr
External link:
http://arxiv.org/abs/2302.09736
Author:
Shi, Shuming, Zhao, Enbo, Tang, Duyu, Wang, Yan, Li, Piji, Bi, Wei, Jiang, Haiyun, Huang, Guoping, Cui, Leyang, Huang, Xinting, Zhou, Cong, Dai, Yong, Ma, Dongyang
In this technical report, we introduce Effidit (Efficient and Intelligent Editing), a digital writing assistant that facilitates users to write higher-quality text more efficiently by using artificial intelligence (AI) technologies. Previous writing
External link:
http://arxiv.org/abs/2208.01815
One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code
Author:
Dai, Yong, Tang, Duyu, Liu, Liangxin, Tan, Minghuan, Zhou, Cong, Wang, Jingquan, Feng, Zhangyin, Zhang, Fan, Hu, Xueyu, Shi, Shuming
People perceive the world with multiple senses (e.g., through hearing sounds, reading words and seeing objects). However, most existing AI systems only process an individual modality. This paper presents an approach that excels at handling multiple m
External link:
http://arxiv.org/abs/2205.06126
We present SkillNet-NLG, a sparsely activated approach that handles many natural language generation tasks with one model. Different from traditional dense models that always activate all the parameters, SkillNet-NLG selectively activates relevant pa
External link:
http://arxiv.org/abs/2204.12184
Chinese BERT models achieve remarkable progress in dealing with grammatical errors of word substitution. However, they fail to handle word insertion and deletion because BERT assumes the existence of a word at each position. To address this, we prese
External link:
http://arxiv.org/abs/2204.12052
We present a Chinese BERT model dubbed MarkBERT that uses word information in this work. Existing word-based BERT models regard words as basic units, however, due to the vocabulary limit of BERT, they only cover high-frequency words and fall back to
External link:
http://arxiv.org/abs/2203.06378