Showing 1 - 10 of 25 for search: '"Ni, Jinjie"'
Author:
Ni, Jinjie, Xue, Fuzhao, Yue, Xiang, Deng, Yuntian, Shah, Mahir, Jain, Kabir, Neubig, Graham, You, Yang
Evaluating large language models (LLMs) is challenging. Traditional ground-truth-based benchmarks fail to capture the comprehensiveness and nuance of real-world queries, while LLM-as-judge benchmarks suffer from grading biases and limited query quantity…
External link:
http://arxiv.org/abs/2406.06565
To help the open-source community have a better understanding of Mixture-of-Experts (MoE) based large language models (LLMs), we train and release OpenMoE, a series of fully open-sourced and reproducible decoder-only MoE LLMs, ranging from 650M to 34B…
External link:
http://arxiv.org/abs/2402.01739
Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study…
External link:
http://arxiv.org/abs/2310.18345
Recent studies have revealed some issues of Multi-Head Attention (MHA), e.g., redundancy and over-parameterization. Specifically, the heads of MHA were originally designed to attend to information from different representation subspaces, whereas prior…
External link:
http://arxiv.org/abs/2305.14380
Logical reasoning is central to human cognition and intelligence. It includes deductive, inductive, and abductive reasoning. Past research on logical reasoning within AI uses formal language as knowledge representation and symbolic reasoners. However…
External link:
http://arxiv.org/abs/2303.12023
Author:
Ni, Jinjie, Ma, Yukun, Wang, Wen, Chen, Qian, Ng, Dianwen, Lei, Han, Nguyen, Trung Hieu, Zhang, Chong, Ma, Bin, Cambria, Erik
Learning from massive speech corpora has led to the recent success of many self-supervised speech models. With knowledge distillation, these models may also benefit from the knowledge encoded by language models that are pre-trained on rich sources…
External link:
http://arxiv.org/abs/2303.03600
Author:
Ng, Dianwen, Zhang, Ruixi, Yip, Jia Qi, Yang, Zhao, Ni, Jinjie, Zhang, Chong, Ma, Yukun, Ni, Chongjia, Chng, Eng Siong, Ma, Bin
Existing self-supervised pre-trained speech models have offered an effective way to leverage massive unannotated corpora to build good automatic speech recognition (ASR). However, many current models are trained on a clean corpus from a single source…
External link:
http://arxiv.org/abs/2302.14597
Graph Neural Networks (GNNs) are widely used for graph representation learning. Despite their prevalence, GNNs suffer from two drawbacks in the graph classification task: the neglect of graph-level relationships, and the generalization issue. Each graph…
External link:
http://arxiv.org/abs/2209.00936
Published in:
Information Sciences, Vol. 679, September 2024
Published in:
Published at AAAI 2022
The goal of building intelligent dialogue systems has largely been separately pursued under two paradigms: task-oriented dialogue (TOD) systems, which perform goal-oriented functions, and open-domain dialogue (ODD) systems, which focus on non-goal-oriented…
External link:
http://arxiv.org/abs/2109.04137