Showing 1 - 10 of 40,113
for search: '"Language Modeling"'
Autoregressive large language models (LLMs) pre-trained by next token prediction are inherently proficient in generative tasks. However, their performance on knowledge-driven tasks such as factual knowledge querying remains unsatisfactory. Knowledge…
External link:
http://arxiv.org/abs/2412.04948
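As background for the pre-training objective this snippet mentions: next-token prediction trains the model so that each position predicts the token that follows it. A minimal sketch in PyTorch; the function and tensor names are illustrative, not from the paper:

```python
import torch
import torch.nn.functional as F

# Toy next-token prediction loss: shift the targets by one position so the
# prediction at position t is scored against the token at position t+1.
def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    # logits: (batch, seq_len, vocab_size); tokens: (batch, seq_len)
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..T-2
    shift_targets = tokens[:, 1:]      # ground-truth tokens at positions 1..T-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_targets.reshape(-1),
    )

# Example with random data (vocabulary of 100, batch of 2, sequence of 8):
logits = torch.randn(2, 8, 100)
tokens = torch.randint(0, 100, (2, 8))
print(next_token_loss(logits, tokens))
```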
Authors:
Tavanaei, Amir; Koo, Kee Kiat; Ceker, Hayreddin; Jiang, Shaobai; Li, Qi; Han, Julien; Bouyarmane, Karim
In this paper, we study the problem of generating structured objects that conform to a complex schema, with intricate dependencies between the different components (facets) of the object. The facets of the object (attributes, fields, columns, properties…
External link:
http://arxiv.org/abs/2411.19301
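For a concrete sense of what "conforming to a complex schema with dependencies between facets" can mean, here is a hypothetical sketch using the jsonschema library; the schema, candidate object, and cross-field constraint are invented for illustration and are not the paper's method:

```python
from jsonschema import Draft202012Validator

# Illustrative schema with a dependency between facets: if "currency" is
# present, "price" must be present too -- the kind of cross-facet
# constraint the abstract alludes to.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number", "minimum": 0},
        "currency": {"type": "string", "enum": ["USD", "EUR"]},
    },
    "required": ["title"],
    "dependentRequired": {"currency": ["price"]},
}

candidate = {"title": "Widget", "currency": "USD"}  # violates the dependency

validator = Draft202012Validator(schema)
for error in validator.iter_errors(candidate):
    print(error.message)  # prints a human-readable description of the violation
```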
Large Language Models (LLMs) have shown remarkable adaptability across domains beyond text, specifically electrocardiograms (ECGs). More specifically, there is a growing body of work exploring the task of generating text from a multi-channeled ECG and…
External link:
http://arxiv.org/abs/2412.14373
Authors:
Sun, Yutao; Bao, Hangbo; Wang, Wenhui; Peng, Zhiliang; Dong, Li; Huang, Shaohan; Wang, Jianyong; Wei, Furu
Multimodal generative models require a unified approach to handle both discrete data (e.g., text and code) and continuous data (e.g., image, audio, video). In this work, we propose Latent Language Modeling (LatentLM), which seamlessly integrates continuous…
External link:
http://arxiv.org/abs/2412.08635
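The snippet's idea of handling discrete and continuous data in one model can be illustrated conceptually: discrete tokens pass through an embedding table, while continuous inputs are projected into the same model dimension, so a single sequence model consumes both. This is only a loose sketch of the unified-sequence idea under assumed dimensions, not LatentLM's actual design:

```python
import torch
import torch.nn as nn

# Conceptual sketch (not LatentLM itself): discrete tokens enter through an
# embedding table, continuous data enters as latent vectors projected into
# the same model dimension, so one Transformer can process both.
d_model, vocab_size, latent_dim = 64, 1000, 16

embed = nn.Embedding(vocab_size, d_model)  # discrete path (text/code)
project = nn.Linear(latent_dim, d_model)   # continuous path (image/audio latents)

text_ids = torch.randint(0, vocab_size, (1, 5))  # 5 discrete tokens
image_latents = torch.randn(1, 3, latent_dim)    # 3 continuous latent vectors

sequence = torch.cat([embed(text_ids), project(image_latents)], dim=1)
print(sequence.shape)  # torch.Size([1, 8, 64]) -- one unified sequence
```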
In the rapidly evolving financial sector, the accurate and timely interpretation of market news is essential for stakeholders needing to navigate unpredictable events. This paper introduces FANAL (Financial Activity News Alerting Language Modeling Framework)…
External link:
http://arxiv.org/abs/2412.03527
Current large language models are mainly based on decoder-only Transformers, which have strong in-context learning (ICL) capabilities. It is generally believed that an important foundation of their ICL capability is the induction heads mechanism…
External link:
http://arxiv.org/abs/2411.19574
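For readers unfamiliar with the induction heads mechanism the snippet refers to: an induction head implements a "match and copy" pattern, roughly "after seeing ... A B ... A, predict B". A toy illustration of that pattern in plain Python (not an actual attention head):

```python
# Toy illustration of the induction-head pattern: after "... A B ... A",
# attend back to the token that followed the previous "A" and copy it.
def induction_predict(tokens: list[str]) -> str | None:
    last = tokens[-1]
    # scan earlier positions for a previous occurrence of the last token
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]  # copy the token that followed it
    return None

print(induction_predict(["the", "cat", "sat", "the"]))  # -> "cat"
```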
Author:
Datta, Akul
This paper reviews the development of the Receptance Weighted Key Value (RWKV) architecture, emphasizing its advancements in efficient language modeling. RWKV combines the training efficiency of Transformers with the inference efficiency of RNNs through…
External link:
http://arxiv.org/abs/2411.02795
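To see why an RWKV-style recurrence gives RNN-like inference cost: the per-token state update below is O(1), in contrast to attention's ever-growing context. This is a deliberately simplified single-channel sketch, not the actual RWKV WKV kernel; the decay constant w and the exponential key weighting are assumptions for illustration:

```python
import numpy as np

# Simplified sketch of RWKV-style time mixing (not the real WKV kernel):
# a running numerator/denominator pair is updated with exponential decay,
# so each new token costs O(1) at inference time, like an RNN step.
def wkv_recurrent(k, v, w=0.9):
    # k, v: (seq_len,) keys and values for one channel; w: decay in (0, 1)
    num, den, out = 0.0, 0.0, []
    for t in range(len(k)):
        num = w * num + np.exp(k[t]) * v[t]  # decayed weighted sum of values
        den = w * den + np.exp(k[t])         # decayed normalizer
        out.append(num / den)
    return np.array(out)

k = np.array([0.1, 0.5, -0.2, 0.3])
v = np.array([1.0, 2.0, 3.0, 4.0])
print(wkv_recurrent(k, v))
```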
Transformer-based Large Language Models (LLMs) have exhibited remarkable success in various natural language processing tasks, primarily attributed to the self-attention mechanism, which requires a token to consider all preceding tokens as its context to…
External link:
http://arxiv.org/abs/2412.12465
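The "every token attends to all preceding tokens" behavior the snippet describes is implemented with a causal mask in self-attention. A minimal single-head sketch; the weight matrices here are random placeholders, not trained parameters:

```python
import torch
import torch.nn.functional as F

# Minimal causal self-attention for one head: the upper-triangular mask
# hides future positions, so each token attends only to itself and the
# tokens before it -- the growing per-token context the abstract mentions.
def causal_self_attention(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv                  # (seq, d) each
    scores = q @ k.T / k.size(-1) ** 0.5              # (seq, seq)
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(mask, float("-inf"))  # block future tokens
    return F.softmax(scores, dim=-1) @ v

d = 8
x = torch.randn(5, d)                                 # 5 tokens
wq, wk, wv = (torch.randn(d, d) for _ in range(3))
print(causal_self_attention(x, wq, wk, wv).shape)     # torch.Size([5, 8])
```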
Authors:
Park, Ji-jun; Choi, Soo-joon
Video captioning is a critical task in the field of multimodal machine learning, aiming to generate descriptive and coherent textual narratives for video content. While large vision-language models (LVLMs) have shown significant progress, they often…
External link:
http://arxiv.org/abs/2412.10720