Showing 1 - 10 of 55,387 for search: '"Source language"'
Authors:
Wang, Jiayi; Lu, Yao; Weber, Maurice; Ryabinin, Max; Chen, Yihong; Tang, Raphael; Stenetorp, Pontus
English, as a very high-resource language, enables the pretraining of high-quality large language models (LLMs). The same cannot be said for most other languages, as leading LLMs still underperform for non-English languages, likely due to a gap in th
External link:
http://arxiv.org/abs/2410.23956
The recent explosion of high-quality language models has necessitated new methods for identifying AI-generated text. Watermarking is a leading solution and could prove to be an essential tool in the age of generative AI. Existing approaches embed wat
External link:
http://arxiv.org/abs/2410.18861
In addressing the pivotal role of translating natural language queries into SQL commands, we propose a suite of compact, fine-tuned models and self-refine mechanisms to democratize data access and analysis for non-expert users, mitigating risks assoc
External link:
http://arxiv.org/abs/2409.15985
Language agents perform complex tasks by using tools to execute each step precisely. However, most existing agents are based on proprietary models or designed to target specific tasks, such as mathematics or multi-hop question answering. We introduce
External link:
http://arxiv.org/abs/2406.06469
Unstructured text in medical notes and dialogues contains rich information. Recent advancements in Large Language Models (LLMs) have demonstrated superior performance in question answering and summarization tasks on unstructured text data, outperform
External link:
http://arxiv.org/abs/2405.16295
State-of-the-art large language models are sometimes distributed as open-source software but are also increasingly provided as a closed-source service. These closed-source large-language models typically see the widest usage by the public, however, t
External link:
http://arxiv.org/abs/2405.13907
Large language models (LLMs) have shown great potential for the automatic generation of feedback in a wide range of computing contexts. However, concerns have been voiced around the privacy and ethical implications of sending student work to propriet
External link:
http://arxiv.org/abs/2405.05253
Authors:
Kim, Seungone; Suk, Juyoung; Longpre, Shayne; Lin, Bill Yuchen; Shin, Jamin; Welleck, Sean; Neubig, Graham; Lee, Moontae; Lee, Kyungjae; Seo, Minjoon
Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from various LMs. However, concerns including transparency, controllability, and affordability strongly motivate the development of open-source LMs specialized in eva
External link:
http://arxiv.org/abs/2405.01535
This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness (STR), on Track C: Cross-lingual. The task aims to detect semantic relatedness of two sentences in a given target language without access to direct sup
External link:
http://arxiv.org/abs/2404.02570
Authors:
Li, Haoyang; Zhang, Jing; Liu, Hanbing; Fan, Ju; Zhang, Xiaokang; Zhu, Jun; Wei, Renjie; Pan, Hongyan; Li, Cuiping; Chen, Hong
Language models have shown promising performance on the task of translating natural language questions into SQL queries (Text-to-SQL). However, most of the state-of-the-art (SOTA) approaches rely on powerful yet closed-source large language models (L
External link:
http://arxiv.org/abs/2402.16347