Showing 1 - 10 of 15 for search: '"Dan Garrette"'
Published in:
Computational Linguistics, 1-29
Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this article, we…
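A minimal illustrative sketch (not from the article) of the gradient-conflict idea in the abstract above: per-language training updates to shared parameters can point in opposing directions, which shows up as a negative cosine similarity. The function and the toy gradients are assumptions for illustration only.

    # Measure agreement between per-language gradients over shared parameters.
    # Negative cosine similarity means an update that helps one language
    # pushes the shared parameters against another.
    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Toy per-language gradients (assumed data, same shared parameter vector).
    grads = {
        "en": np.array([0.9, -0.1, 0.3]),
        "fi": np.array([0.8, 0.0, 0.4]),
        "ja": np.array([-0.7, 0.2, -0.5]),
    }

    for a in grads:
        for b in grads:
            if a < b:
                print(a, b, round(cosine(grads[a], grads[b]), 2))
    # en/fi agree (similarity near 1); en/ja and fi/ja conflict (negative).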
Author:
Jennimaria Palomaki, Jonathan H. Clark, Tom Kwiatkowski, Eunsol Choi, Michael Collins, Vitaly Nikolaev, Dan Garrette
Published in:
Transactions of the Association for Computational Linguistics, 8:454-470
Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDi QA—a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA…
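A small hypothetical sketch in the spirit of the per-language evaluation a typologically diverse benchmark like TyDi QA calls for; the toy predictions and the plain exact-match metric are assumptions, not the dataset's official scoring.

    # Score QA predictions separately for each language rather than averaging
    # everything into one number that high-resource languages can dominate.
    from collections import defaultdict

    def exact_match(pred, gold):
        return int(pred.strip().lower() == gold.strip().lower())

    # Assumed toy triples: (language, predicted answer, gold answer).
    examples = [
        ("finnish", "Helsinki", "Helsinki"),
        ("japanese", "1947", "1947年"),
        ("swahili", "Nairobi", "Nairobi"),
    ]

    totals, hits = defaultdict(int), defaultdict(int)
    for lang, pred, gold in examples:
        totals[lang] += 1
        hits[lang] += exact_match(pred, gold)

    for lang in totals:
        print(lang, hits[lang] / totals[lang])  # report per language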
Author:
Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Dan Garrette, Graham Neubig, Melvin Johnson
Machine learning has brought striking advances in multilingual natural language processing capabilities over the past year. For example, the latest techniques have improved the state-of-the-art performance on the XTREME multilingual benchmark by more…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::971d36ebc711e18e4116896239f94bcc
http://arxiv.org/abs/2104.07412
Published in:
EMNLP (1)
State-of-the-art multilingual models depend on vocabularies that cover all of the languages the model will expect to see at inference time, but the standard methods for generating those vocabularies are not ideal for massively multilingual applications…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::623fc1d9776966a7676ce9268cb1ba06
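A rough sketch, under simplified assumptions, of the alternative to a single shared vocabulary that this line of work explores: group related languages, build a vocabulary per group, and take the union so low-resource languages keep dedicated capacity. Plain word frequencies stand in for a real subword learner (BPE/unigram LM); this is not the paper's actual method.

    # Build a token vocabulary per language cluster and union the clusters,
    # instead of one global vocabulary dominated by high-resource languages.
    from collections import Counter

    corpora = {  # assumed toy corpora
        "en": "the cat sat on the mat the cat",
        "de": "die katze sass auf der matte",
        "hi": "बिल्ली चटाई पर बैठी",
    }
    clusters = [["en", "de"], ["hi"]]  # assumed grouping of related languages
    per_cluster_budget = 5

    vocab = set()
    for cluster in clusters:
        counts = Counter()
        for lang in cluster:
            counts.update(corpora[lang].split())
        vocab.update(tok for tok, _ in counts.most_common(per_cluster_budget))

    print(sorted(vocab))  # Hindi keeps its own slots in the combined vocabulary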
Published in:
ACL (1)
In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3e12ce08a46e69b2ff172e4adf416453
http://arxiv.org/abs/1906.01502
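A hedged sketch of the zero-shot cross-lingual transfer recipe the abstract describes, using the publicly released multilingual BERT checkpoint; the fine-tuning loop is omitted, and the Finnish test sentence is a made-up example.

    # Fine-tune a multilingual encoder on labeled data in one language, then
    # evaluate directly in another language with no target-language labels.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # 1) Fine-tune `model` on English labeled examples (standard setup, not shown).
    # 2) Evaluate on another language without further training:
    inputs = tokenizer("Tämä elokuva oli erinomainen.", return_tensors="pt")  # Finnish
    logits = model(**inputs).logits  # predictions come from the shared parameters
    print(logits.shape)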
Author:
Dan Garrette, Kelsey Ball
Published in:
EMNLP
Code-switching, the use of more than one language within a single utterance, is ubiquitous in much of the world, but remains a challenge for NLP largely due to the lack of representative data for training models. In this paper, we present a novel model…
Published in:
CoNLL
We present a Bayesian formulation for weakly-supervised learning of a Combinatory Categorial Grammar (CCG) supertagger with an HMM. We assume supervision in the form of a tag dictionary, and our prior encourages the use of crosslinguistically common…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fdb49c3d3204ca89664ce4dfcfba01c8
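A compact sketch of decoding under the weak supervision described above: a tag dictionary restricts which supertags each word may take, and an HMM (with made-up probabilities here) chooses among them by Viterbi search. The paper's Bayesian prior over CCG categories is not reproduced.

    # Tag-dictionary-constrained Viterbi decoding with illustrative numbers.
    import math

    tag_dict = {          # assumed tag dictionary: word -> allowed supertags
        "the": ["NP/N"],
        "dog": ["N", "NP"],
        "barks": ["S\\NP"],
    }
    trans = {             # assumed transition probabilities P(tag | previous tag)
        ("<s>", "NP/N"): 0.9, ("<s>", "N"): 0.05, ("<s>", "NP"): 0.05,
        ("NP/N", "N"): 0.8, ("NP/N", "NP"): 0.2,
        ("N", "S\\NP"): 0.7, ("NP", "S\\NP"): 0.9,
    }
    emit = lambda tag, word: 1.0 / len(tag_dict[word])  # uniform within the dictionary

    def viterbi(words):
        best = {"<s>": (0.0, [])}                  # prev tag -> (log prob, path)
        for w in words:
            new = {}
            for tag in tag_dict[w]:                # only dictionary-licensed tags
                new[tag] = max(
                    (lp + math.log(trans.get((prev, tag), 1e-6)) + math.log(emit(tag, w)),
                     path + [tag])
                    for prev, (lp, path) in best.items()
                )
            best = new
        return max(best.values())[1]

    print(viterbi(["the", "dog", "barks"]))  # ['NP/N', 'N', 'S\\NP']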
Published in:
ACL (2)
Compositor attribution, the clustering of pages in a historical printed document by the individual who set the type, is a bibliographic task that relies on analysis of orthographic variation and inspection of visual details of the printed page. In this…
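An illustrative sketch, not the paper's model, of clustering pages by orthographic variation: count a few spelling-variant preferences per page and cluster the resulting feature vectors. The variant list and page texts are assumed toy data.

    # Represent each page by how often it uses the longer spelling variant,
    # then group pages with similar habits (a stand-in for compositor evidence).
    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    variant_pairs = [("do", "doe"), ("go", "goe"), ("here", "heere")]

    def page_features(tokens):
        feats = []
        for short, longer in variant_pairs:
            total = tokens.count(short) + tokens.count(longer)
            feats.append(tokens.count(longer) / total if total else 0.5)
        return feats

    pages = [                               # assumed toy page transcriptions
        "do go here do here".split(),       # prefers short spellings
        "do go here go".split(),
        "doe goe heere doe".split(),        # prefers long spellings
        "doe heere goe goe".split(),
    ]
    X = np.array([page_features(p) for p in pages])
    print(AgglomerativeClustering(n_clusters=2).fit_predict(X))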
Author:
Gina-Anne Levow, Patrick Littell, David Inman, Michael Tjalve, Joshua Crowgey, Jeff Good, Shobhana Lakshmi Chelliah, Sharon Hargus, Dan Garrette, Fei Xia, Emily M. Bender, Michael Maxwell, Kristen Howell
Published in:
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages.
This paper describes the use of Shared Task Evaluation Campaigns by designing tasks that are compelling to speech and natural language processing researchers while addressing technical challenges in language documentation and exploiting growing archives…
Author:
Hannah Alpert-Abrams, Dan Garrette
Published in:
HLT-NAACL