Showing 1 - 10 of 22 for search: '"Wu, Huijia"'
Deep neural networks suffer from catastrophic forgetting when continually learning new concepts. In this paper, we analyze this problem from a data imbalance point of view. We argue that the imbalance between old task and new task data contributes to
External link:
http://arxiv.org/abs/2405.15157
In the realm of mimicking human deliberation, large language models (LLMs) show promising performance, thereby amplifying the importance of this research area. Deliberation is influenced by both logic and personality. However, previous studies predom
External link:
http://arxiv.org/abs/2404.07084
The Mixture of Experts (MoE) for language models has been proven effective in augmenting the capacity of models by dynamically routing each input token to a specific subset of experts for processing. Despite the success, most existing methods face a
External link:
http://arxiv.org/abs/2402.12656
Author:
Wei, Wenhong (weiwh@dgut.edu.cn), Wu, Huijia, He, Ying, Li, Qingxia
Published in:
PLoS ONE. 4/26/2024, Vol. 19 Issue 4, p1-25. 25p.
Deep stacked RNNs are usually hard to train. Adding shortcut connections across different layers is a common way to ease the training of stacked networks. However, extra shortcuts make the recurrent step more complicated. To simplify the stacked archit
External link:
http://arxiv.org/abs/1701.00576
Author:
Li, Tengteng, Ma, Fupeng, Hao, Yafeng, Wu, Huijia, Zhu, Pu, Li, Ziwei, Li, Fengchao, Yu, Jiangang, Liu, Meihong, Lei, Cheng, Liang, Ting
Published in:
Crystals (2073-4352); Sep2024, Vol. 14 Issue 9, p802, 13p
In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging. We investigate three kinds of skip connections connecting to LSTM cells: (a) skip connections to the gates,
External link:
http://arxiv.org/abs/1610.03167
Combinatory Categorial Grammar (CCG) supertagging is a task to assign lexical categories to each word in a sentence. Almost all previous methods use fixed context window sizes as input features. However, it is obvious that different tags usually rely o
External link:
http://arxiv.org/abs/1610.02749