Showing 1 - 10 of 15,598 results for search: '"Schütze, A."'
Author:
Hofmann, Valentin, Weissweiler, Leonie, Mortensen, David, Schütze, Hinrich, Pierrehumbert, Janet
What mechanisms underlie linguistic generalization in large language models (LLMs)? This question has attracted considerable attention, with most studies analyzing the extent to which the language skills of LLMs resemble rules. As of yet, it is not known…
External link:
http://arxiv.org/abs/2411.07990
Author:
He, Linyang, Nie, Ercong, Schmid, Helmut, Schütze, Hinrich, Mesgarani, Nima, Brennan, Jonathan
This study investigates the linguistic understanding of Large Language Models (LLMs) regarding signifier (form) and signified (meaning) by distinguishing two LLM evaluation paradigms: psycholinguistic and neurolinguistic. Traditional psycholinguistic…
External link:
http://arxiv.org/abs/2411.07533
The need for large text corpora has increased with the advent of pretrained language models and, in particular, the discovery of scaling laws for these models. Most available corpora have sufficient data only for languages with large dominant communities…
External link:
http://arxiv.org/abs/2410.23825
Author:
Kargaran, Amir Hossein, Modarressi, Ali, Nikeghbal, Nafiseh, Diesner, Jana, Yvon, François, Schütze, Hinrich
English-centric large language models (LLMs) often show strong multilingual capabilities. However, the multilingual performance of these models remains unclear and is not thoroughly evaluated for many languages. Most benchmarks for multilinguality focus on…
External link:
http://arxiv.org/abs/2410.05873
To ensure large language models contain up-to-date knowledge, they need to be updated regularly. However, model editing is challenging as it might also affect knowledge that is unrelated to the new data. State-of-the-art methods identify parameters…
External link:
http://arxiv.org/abs/2410.02433
Author:
Schütze, Paul, Abel, Aenne, Burkart, Florian, de Silva, L. Malinda S., Dinter, Hannes, Dojan, Kevin, Herkert, Adrian, Jaster-Merz, Sonja, Kellermeier, Max Joseph, Kuropka, Willi, Mayet, Frank, Daza, Sara Ruiz, Spannagel, Simon, Vinatier, Thomas, Wennlöf, Håkan
The electronCT technique is an imaging method based on the multiple Coulomb scattering of relativistic electrons and has potential applications in medical and industrial imaging. It utilizes a pencil beam of electrons in the very high energy electron…
External link:
http://arxiv.org/abs/2409.20091
Recent multilingual pretrained language models (mPLMs) often avoid using language embeddings -- learnable vectors assigned to different languages. These embeddings are discarded for two main reasons: (1) mPLMs are expected to have a single, unified…
External link:
http://arxiv.org/abs/2409.18199
Author:
Ji, Shaoxiong, Li, Zihao, Paul, Indraneil, Paavola, Jaakko, Lin, Peiqin, Chen, Pinzhen, O'Brien, Dayyán, Luo, Hengyu, Schütze, Hinrich, Tiedemann, Jörg, Haddow, Barry
In this work, we introduce EMMA-500, a large-scale multilingual language model continue-trained on texts across 546 languages, designed for enhanced multilingual performance and focusing on improving language coverage for low-resource languages. To facilitate…
External link:
http://arxiv.org/abs/2409.17892
Author:
Liu, Yihong, Wang, Mingyang, Kargaran, Amir Hossein, Imani, Ayyoob, Xhelili, Orgest, Ye, Haotian, Ma, Chunlan, Yvon, François, Schütze, Hinrich
Recent studies have shown that post-aligning multilingual pretrained language models (mPLMs) using alignment objectives on both original and transliterated data can improve crosslingual alignment. This improvement further leads to better crosslingual…
External link:
http://arxiv.org/abs/2409.17326
Author:
Köksal, Abdullatif, Thaler, Marion, Imani, Ayyoob, Üstün, Ahmet, Korhonen, Anna, Schütze, Hinrich
Instruction tuning enhances large language models (LLMs) by aligning them with human preferences across diverse tasks. Traditional approaches to creating instruction tuning datasets face serious challenges for low-resource languages due to their dependence on…
External link:
http://arxiv.org/abs/2409.12958