Showing 1 - 10 of 12 for search: '"Alberto Poncelas"'
Rapid Development of Competitive Translation Engines for Access to Multilingual COVID-19 Information
Published in:
Informatics, Vol 7, Iss 2, p 19 (2020)
Every day, more people are becoming infected and dying from exposure to COVID-19. Some countries in Europe like Spain, France, the UK and Italy have suffered particularly badly from the virus. Others such as Germany appear to have coped extremely wel
External link:
https://doaj.org/article/0b5e0b6b2e174698844011aa12919538
Published in:
Natural Language Engineering. 28:71-91
In machine-learning applications, data selection is of crucial importance if good runtime performance is to be achieved. In a scenario where the test set is accessible when the model is being built, training instances can be selected so they are the
Published in:
Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 245-256 (2017)
The Prague Bulletin of Mathematical Linguistics
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Maillette de Buy Wenniger, Gideon and Way, Andy ORCID: 0000-0001-5736-5930 (2017) Applying N-gram alignment entropy to improve feature decay algorithms. The Prague Bulletin of Mathematical Linguistics (108). pp. 245-256. ISSN 0032-6585
Data Selection is a popular step in Machine Translation pipelines. Feature Decay Algorithms (FDA) is a technique for data selection that has shown a good performance in several tasks. FDA aims to maximize the coverage of n-grams in the test set. Howe
Published in:
Soto, Xabier ORCID: 0000-0002-3622-6496, Shterionov, Dimitar ORCID: 0000-0001-6300-797X, Poncelas, Alberto ORCID: 0000-0002-5089-1687 and Way, Andy ORCID: 0000-0001-5736-5930 (2020) Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation. In: Annual Conference of the Association for Computational Linguistics, ACL, 5-10 July 2020, Seattle, WA, USA (Online).
ACL
Machine translation (MT) has benefited from using synthetic training data originating from translating monolingual corpora, a technique known as backtranslation. Combining backtranslated data from different sources has led to better results than when
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e69f24d712cc76577a0c7139169f428c
Authors:
Gideon Maillette de Buy Wenniger, Andy Way, Dimitar Shterionov, Maja Popović, Alberto Poncelas
Published in:
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Popović, Maja ORCID: 0000-0001-8234-8745, Shterionov, Dimitar ORCID: 0000-0001-6300-797X, Maillette de Buy Wenniger, Gideon and Way, Andy ORCID: 0000-0001-5736-5930 (2019) Combining SMT and NMT back-translated data for efficient NMT. In: Recent Advances in Natural Language Processing (RANLP 2019), 2-4 Sept 2019, Varna, Bulgaria.
Scopus-Elsevier
Tilburg University-PURE
RANLP
Neural Machine Translation (NMT) models achieve their best performance when large sets of parallel data are used for training. Consequently, techniques for augmenting the training set have become popular recently. One of these methods is back-transla
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4969e950485870897dfabc3ba1475bcb
http://doras.dcu.ie/24272/
Published in:
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Maillette de Buy Wenniger, Gideon ORCID: 0000-0001-8427-7055 and Way, Andy ORCID: 0000-0001-5736-5930 (2019) Adaptation of machine translation models with back-translated data using transductive data selection methods. In: Proceedings of CICLing 2019, the 20th International Conference on Computational Linguistics and Intelligent Text Processing, 7-13 Apr 2019, La Rochelle, France.
Computational Linguistics and Intelligent Text Processing ISBN: 9783031243363
Data selection has proven its merit for improving Neural Machine Translation (NMT), when applied to authentic data. But the benefit of using synthetic data in NMT training, produced by the popular back-translation technique, raises the question if da
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c240afb91faa68824a6b000a56e29cb7
http://doras.dcu.ie/23870/
Published in:
Cruz Silva, Catarina, Liu, Chao-Hong ORCID: 0000-0002-1235-6026, Poncelas, Alberto ORCID: 0000-0002-5089-1687 and Way, Andy ORCID: 0000-0001-5736-5930 (2018) Extracting in-domain training corpora for neural machine translation using data selection methods. In: Third Conference on Machine Translation (WMT), 31 Oct-1 Nov 2018, Brussels, Belgium.
WMT
Data selection is a process used in selecting a subset of parallel data for the training of machine translation (MT) systems, so that 1) resources for training might be reduced, 2) trained models could perform better than those trained with the whole
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::398a2b747a6d059ab7ded099a78abb47
http://doras.dcu.ie/23338/
Published in:
RUA. Repositorio Institucional de la Universidad de Alicante
Universidad de Alicante (UA)
Scopus-Elsevier
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Maillette de Buy Wenniger, Gideon and Way, Andy ORCID: 0000-0001-5736-5930 (2018) Feature decay algorithms for neural machine translation. In: 21st Annual Conference of The European Association for Machine Translation, 28-30 May 2018, Alicante, Spain.
Neural Machine Translation (NMT) systems require a lot of data to be competitive. For this reason, data selection techniques are used only for fine-tuning systems that have been trained with larger amounts of data. In this work we aim to use Feature
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::3bf11f83f62012c87a175e37cef99773
http://hdl.handle.net/10045/76084
Authors:
Alberto Poncelas, Dimitar Shterionov, Andy Way, Gideon Maillette De Buy Wenniger, Peyman Passban
Published in:
RUA. Repositorio Institucional de la Universidad de Alicante
Universidad de Alicante (UA)
Scopus-Elsevier
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Shterionov, Dimitar ORCID: 0000-0001-6300-797X, Way, Andy ORCID: 0000-0001-5736-5930, Maillette de Buy Wenniger, Gideon and Passban, Peyman (2018) Investigating backtranslation in neural machine translation. In: 21st Annual Conference of The European Association for Machine Translation, 28-30 May 2018, Alicante, Spain.
Tilburg University-PURE
A prerequisite for training corpus-based machine translation (MT) systems – either Statistical MT (SMT) or Neural MT (NMT) – is the availability of high-quality parallel data. This is arguably more important today than ever before, as NMT has bee
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::694f2196d4c1b5f9e93f7ebf1b26b611
https://hdl.handle.net/10045/76085
Published in:
Poncelas, Alberto ORCID: 0000-0002-5089-1687, Toral, Antonio ORCID: 0000-0003-2357-2960 and Way, Andy ORCID: 0000-0001-5736-5930 (2017) Extending feature decay algorithms using alignment entropy. In: FETLT 2016: Future and Emerging Trends in Language Technologies, Machine Learning and Big Data. 2nd International Workshop, 30 Nov-2 Dec 2016, Seville, Spain.
Lecture Notes in Computer Science ISBN: 9783319693644
FETLT
In machine-learning applications, data selection is of crucial importance if good runtime performance is to be achieved. Feature Decay Algorithms (FDA) have demonstrated excellent performance in a number of tasks. While the decay function is at the h
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::81f3cb6d6c8ce33cef3aa6a4e46cce48
http://doras.dcu.ie/23232/