Showing 1 - 10 of 26 for search: '"Helcl, Jindrich"'
Author:
Helcl, Jindřich, Kasner, Zdeněk, Dušek, Ondřej, Limisiewicz, Tomasz, Macháček, Dominik, Musil, Tomáš, Libovický, Jindřich
This paper presents teaching materials, particularly assignments and ideas for classroom activities, from a new course on large language models (LLMs) taught at Charles University. The assignments include experiments with LLM inference for weather re…
External link:
http://arxiv.org/abs/2407.19798
Author:
Libovický, Jindřich, Helcl, Jindřich
We present three innovations in tokenization and subword segmentation. First, we propose to use unsupervised morphological analysis with Morfessor as pre-tokenization. Second, we present an algebraic method for obtaining subword embeddings grounded i…
External link:
http://arxiv.org/abs/2406.13560
Author:
Popel, Martin, Poláková, Lucie, Novák, Michal, Helcl, Jindřich, Libovický, Jindřich, Straňák, Pavel, Krabač, Tomáš, Hlaváčová, Jaroslava, Anisimova, Mariia, Chlaňová, Tereza
We present Charles Translator, a machine translation system between Ukrainian and Czech, developed as part of a society-wide effort to mitigate the impact of the Russian-Ukrainian war on individuals and society. The system was developed in the spring…
External link:
http://arxiv.org/abs/2404.06964
Author:
Helcl, Jindřich, Libovický, Jindřich
We present the Charles University system for the MRL~2023 Shared Task on Multi-lingual Multi-task Information Retrieval. The goal of the shared task was to develop systems for named entity recognition and question answering in several under-represent…
External link:
http://arxiv.org/abs/2310.16528
We present Charles University submissions to the WMT22 General Translation Shared Task on Czech-Ukrainian and Ukrainian-Czech machine translation. We present two constrained submissions based on block back-translation and tagged back-translation and…
External link:
http://arxiv.org/abs/2212.00486
Author:
Helcl, Jindřich
We present a non-autoregressive system submission to the WMT 22 Efficient Translation Shared Task. Our system was used by Helcl et al. (2022) in an attempt to provide fair comparison between non-autoregressive and autoregressive models. This submissi…
External link:
http://arxiv.org/abs/2212.00477
Efficient machine translation models are commercially important as they can increase inference speeds, and reduce costs and carbon emissions. Recently, there has been much interest in non-autoregressive (NAR) models, which promise faster translation.
External link:
http://arxiv.org/abs/2205.01966
Author:
Haddow, Barry, Bawden, Rachel, Barone, Antonio Valerio Miceli, Helcl, Jindřich, Birch, Alexandra
We present a survey covering the state of the art in low-resource machine translation research. There are currently around 7000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation mo…
External link:
http://arxiv.org/abs/2109.00486
Non-autoregressive (nAR) models for machine translation (MT) manifest superior decoding speed when compared to autoregressive (AR) models, at the expense of impaired fluency of their outputs. We improve the fluency of a nAR model with connectionist t…
External link:
http://arxiv.org/abs/2004.03227
We present our submission to the WMT19 Robustness Task. Our baseline system is the Charles University (CUNI) Transformer system trained for the WMT18 shared task on News Translation. Quantitative results show that the CUNI Transformer system is alrea…
External link:
http://arxiv.org/abs/1906.09246