Showing 1 - 10 of 6,248 results for search: '"Yan,Ning"'
Large language models (LLMs) pre-trained on massive corpora have demonstrated impressive few-shot learning capability on many NLP tasks. Recasting an NLP task into a text-to-text generation task is a common practice so that generative LLMs can be pro
External link:
http://arxiv.org/abs/2411.02864
Transformers have a quadratic scaling of computational complexity with input size, which limits the input context window size of large language models (LLMs) in both training and inference. Meanwhile, retrieval-augmented generation (RAG) based models
External link:
http://arxiv.org/abs/2410.12859
Author:
Wang, Shiao, Wang, Yifeng, Ma, Qingchuan, Wang, Xiao, Yan, Ning, Yang, Qingquan, Xu, Guosheng, Tang, Jin
Q-distribution prediction is a crucial research direction in controlled nuclear fusion, with deep learning emerging as a key approach to solving prediction challenges. In this paper, we leverage deep learning techniques to tackle the complexities of
External link:
http://arxiv.org/abs/2410.08879
Cyclostationary linear inverse models (CS-LIMs), generalized versions of the classical (stationary) LIM, are advanced data-driven techniques for extracting the first-order time-dependent dynamics and random forcing relevant information from complex n
External link:
http://arxiv.org/abs/2407.10931
Modern vision transformers leverage visually inspired local interaction between pixels through attention computed within window or grid regions, in contrast to the global attention employed in the original ViT. Regional attention restricts pixel inte
External link:
http://arxiv.org/abs/2406.08859
Transformers have risen to state-of-the-art vision architectures through innovations in attention mechanisms inspired by visual perception. At present, two classes of attention prevail in vision transformers, regional and sparse attention. Th
External link:
http://arxiv.org/abs/2403.04200
In real-world problems, environmental noise is often idealized as Gaussian white noise, despite potential temporal dependencies. The Linear Inverse Model (LIM) is a class of data-driven methods that extract dynamic and stochastic information from fin
External link:
http://arxiv.org/abs/2402.15184
Author:
Sushil, Madhumita, Zack, Travis, Mandair, Divneet, Zheng, Zhiwei, Wali, Ahmed, Yu, Yan-Ning, Quan, Yuwei, Butte, Atul J.
Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated pro
External link:
http://arxiv.org/abs/2401.13887
In this article, a State-dependent Multi-Agent Deep Deterministic Policy Gradient (SMADDPG) method is proposed in order to learn an optimal control policy fo
External link:
http://arxiv.org/abs/2312.04767
Graphs have emerged as a natural choice to represent and analyze the intricate patterns and rich information of the Web, enabling applications such as online page classification and social recommendation. The prevailing "pre-train, fine-tune" paradig
External link:
http://arxiv.org/abs/2310.15318