Showing 1 - 10 of 26 for search: '"Duan, Sufeng"'
Author:
Chai, Linzheng, Liu, Shukai, Yang, Jian, Yin, Yuwei, Jin, Ke, Liu, Jiaheng, Sun, Tao, Zhang, Ge, Ren, Changyu, Guo, Hongcheng, Wang, Zekun, Wang, Boyang, Wu, Xianjie, Wang, Bing, Li, Tongliang, Yang, Liqun, Duan, Sufeng, Li, Zhoujun
Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard to evaluat
External link:
http://arxiv.org/abs/2406.07436
As one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens. However, CMLM suffers from the data distribution discr
External link:
http://arxiv.org/abs/2402.09725
Published in:
in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3896-3907, 2023
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task for machines to answer questions according to provided options. Answers in multi-choice MRC cannot be directly extracted in the given passages, and essentially require m
External link:
http://arxiv.org/abs/2310.18070
Author:
Duan, Sufeng, Zhao, Hai
In this paper, we propose an explanation of representation for self-attention network (SAN) based neural sequence encoders, which regards the information captured by the model and the encoding of the model as graph structure and the generation of the
External link:
http://arxiv.org/abs/2101.06397
Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy texts and getting rid of the noises is es
External link:
http://arxiv.org/abs/2012.13915
Neural machine translation (NMT) usually works in a seq2seq learning way by viewing either source or target sentence as a linear sequence of words, which can be regarded as a special case of graph, taking words in the sequence as nodes and relationsh
External link:
http://arxiv.org/abs/2009.07489
Transformer hugely benefits from its key design of the multi-head self-attention network (SAN), which extracts information from various perspectives through transforming the given input into different subspaces. However, its simple linear transformat
External link:
http://arxiv.org/abs/2004.14649
Data augmentation is an effective performance enhancement for neural machine translation (NMT), generating additional bilingual data. In this paper, we propose a novel data augmentation enhancement strategy for neural machine translation. Different
External link:
http://arxiv.org/abs/2004.14200
Author:
Duan, Sufeng, Zhao, Hai
Taking the greedy decoding algorithm as given, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), resulting in an even faster and more accurate CWS model. Our model consists of an attention only
External link:
http://arxiv.org/abs/1910.14537
For machine reading comprehension, the capacity of effectively modeling the linguistic knowledge from the detail-riddled and lengthy passages and getting rid of the noises is essential to improve its performance. Traditional attentive models attend
External link:
http://arxiv.org/abs/1908.05147