Showing 1 - 10 of 21,840 results for the search: '"FFN"'
Large language models (LLMs) have substantially advanced the field of machine translation, though their effectiveness within the financial domain remains largely underexplored. To probe this issue, we constructed a fine-grained Chinese-English parallel …
External link:
http://arxiv.org/abs/2406.18856
Large language models (LLMs) have demonstrated impressive capabilities in various tasks using the in-context learning (ICL) paradigm. However, their effectiveness is often compromised by inherent bias, leading to prompt brittleness, i.e., sensitivity … (a brief illustration of this sensitivity follows the link below)
External link:
http://arxiv.org/abs/2405.20612
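To make "prompt brittleness" concrete, here is a minimal sketch, not taken from the cited paper: the demonstrations, labels, and the query_llm stub are invented for illustration. It builds the same few-shot sentiment prompt under every ordering of three demonstrations; order-sensitive models can flip their prediction across these orderings.

# Minimal sketch of ICL prompt-order sensitivity; the demonstrations and the
# query_llm stub are hypothetical, not from the cited work.
from itertools import permutations

demos = [
    ("The movie was wonderful.", "positive"),
    ("Terrible service, never again.", "negative"),
    ("An absolute masterpiece.", "positive"),
]
query = "The plot dragged on forever."

def build_prompt(demonstrations, query_text):
    # Concatenate labeled demonstrations followed by the unlabeled query.
    shots = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in demonstrations)
    return f"{shots}\nReview: {query_text}\nSentiment:"

# The same three demonstrations yield six distinct prompts, one per ordering;
# a brittle model may answer differently depending on which ordering it sees.
for order in permutations(demos):
    prompt = build_prompt(order, query)
    # prediction = query_llm(prompt)  # hypothetical model call
    print(prompt.splitlines()[0])     # the leading demonstration changes per ordering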
Autoregressive large language models (e.g., LLaMa, GPTs) are omnipresent, achieving remarkable success in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant …
External link:
http://arxiv.org/abs/2404.03865
Pre-trained language models have been proven to possess strong base capabilities, excelling not only at in-distribution language modeling but also showing powerful abilities in out-of-distribution language modeling, transfer learning, and few-shot learning …
External link:
http://arxiv.org/abs/2403.02436
In recent years, Transformer networks have shown remarkable performance on speech recognition tasks. However, their deployment poses challenges due to high computational and storage requirements. To address this issue, a lightweight model …
External link:
http://arxiv.org/abs/2404.19214
Time series analysis is vital for numerous applications, and transformers have become increasingly prominent in this domain. Leading methods customize transformer architectures from NLP and CV, utilizing a patching technique to convert continuous … (a sketch of the patching idea follows the link below)
External link:
http://arxiv.org/abs/2402.05830
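As a rough illustration of the patching idea mentioned above, the generic sketch below (not the cited paper's implementation; patch length, stride, and embedding size are arbitrary assumptions) slices a continuous univariate series into fixed-length patches and linearly projects each patch into a token embedding that a transformer encoder can consume.

# Generic patching sketch for time-series transformers; all hyperparameters
# below are illustrative assumptions, not values from the cited paper.
import numpy as np

def patchify(series: np.ndarray, patch_len: int = 16, stride: int = 8) -> np.ndarray:
    # Slice a 1-D series into overlapping patches of shape (num_patches, patch_len).
    num_patches = (len(series) - patch_len) // stride + 1
    return np.stack([series[i * stride : i * stride + patch_len] for i in range(num_patches)])

def embed_patches(patches: np.ndarray, d_model: int = 64, seed: int = 0) -> np.ndarray:
    # Project each patch to a d_model-dimensional token with a (random) linear map.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((patches.shape[1], d_model)) / np.sqrt(patches.shape[1])
    return patches @ W  # (num_patches, d_model): one token per patch

series = np.sin(np.linspace(0.0, 20.0, 256))   # toy continuous signal
tokens = embed_patches(patchify(series))       # token sequence for a transformer encoder
print(tokens.shape)                            # (31, 64)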
Author:
Fabozzi, Frank J. (ffabozz1@jhu.edu); Fallahgoul, Hasan (hasan.fallahgoul@monash.edu); Franstianto, Vincentius (vincentius.franstianto@monash.edu); Loeper, Grégoire (gregoire.loeper@monash.edu)
Published in:
Studies in Nonlinear Dynamics & Econometrics, Sep 2024, p. 1, 26 pp., 6 illustrations.
Vision Transformers (ViT) now dominate many vision tasks. The drawback of the quadratic complexity of their token-wise multi-head self-attention (MHSA) is extensively addressed via either token sparsification or dimension reduction (in spatial or channel …) (a sketch of this quadratic cost and of token sparsification follows the link below)
External link:
http://arxiv.org/abs/2306.10875
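The sketch below illustrates the two points in the snippet above: why token-wise self-attention cost grows quadratically with the number of tokens N, and how token sparsification cuts that cost. It is a generic illustration under assumed shapes and a norm-based keep rule, not the cited paper's method.

# Generic sketch of MHSA's O(N^2) cost and of token sparsification; the shapes
# and the norm-based scoring rule are illustrative assumptions.
import numpy as np

def attention_flops(num_tokens: int, dim: int) -> int:
    # Rough FLOPs of the two N x N matmuls in single-head self-attention:
    # scores = Q @ K^T (N*N*d) and out = softmax(scores) @ V (N*N*d).
    return 2 * num_tokens * num_tokens * dim

def sparsify_tokens(tokens: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    # Keep the top-k tokens by L2 norm (a stand-in for a learned importance score).
    k = max(1, int(tokens.shape[0] * keep_ratio))
    scores = np.linalg.norm(tokens, axis=1)
    return tokens[np.argsort(scores)[-k:]]

tokens = np.random.default_rng(0).standard_normal((196, 64))   # e.g. 14x14 image patches, d=64
kept = sparsify_tokens(tokens)                                 # 98 tokens survive at keep_ratio=0.5
print(attention_flops(196, 64), attention_flops(kept.shape[0], 64))  # ~4x fewer FLOPs after sparsification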
Published in:
Progress in Modern Biomedicine, 2022, Vol. 22, Issue 6, pp. 1142-1146, 6 pp.
Estimated time of arrival (ETA) is one of the most important services in intelligent transportation systems and has become a challenging spatial-temporal (ST) data mining task in recent years. Nowadays, deep learning-based methods, specifically recurrent …
External link:
http://arxiv.org/abs/2006.04077