Showing 1 - 10 of 15 results for search: '"Sergey Edunov"'
Author:
Xian Li, Yinhan Liu, Jiatao Gu, Luke Zettlemoyer, Sergey Edunov, Michael Lewis, Naman Goyal, Marjan Ghazvininejad
Published in:
Transactions of the Association for Computational Linguistics. 8:726-742
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART -- a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective…
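As a rough illustration of the denoising objective this entry describes, the sketch below applies two BART-style noising operations, span masking and sentence permutation, to monolingual text; the model is then trained to reconstruct the original. Function names, the masking ratio, and the span-length distribution are assumptions for illustration, not the authors' implementation.

```python
import random

MASK = "<mask>"

def mask_spans(tokens, mask_ratio=0.35, avg_span=3.5):
    """Replace random contiguous spans with a single <mask> token.
    (Exponential span lengths stand in for BART's Poisson sampling.)"""
    tokens = list(tokens)
    budget = int(len(tokens) * mask_ratio)
    while budget > 0 and tokens:
        span = max(1, min(budget, round(random.expovariate(1 / avg_span))))
        start = random.randrange(len(tokens))
        del tokens[start:start + span]
        tokens.insert(start, MASK)
        budget -= span
    return tokens

def permute_sentences(sentences):
    """Shuffle sentence order; the decoder must restore the original order."""
    sentences = list(sentences)
    random.shuffle(sentences)
    return sentences

doc = ["the cat sat on the mat .", "it was warm outside ."]
noisy = permute_sentences([" ".join(mask_spans(s.split())) for s in doc])
print(noisy)  # the training target is the original, un-noised document
```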
Published in:
ACL/IJCNLP (1)
We show that margin-based bitext mining in a multilingual sentence space can be successfully scaled to operate on monolingual corpora of billions of sentences. We use 32 snapshots of a curated common crawl corpus (Wenzek et al., 2019) totaling 71 billion unique sentences…
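Below is a hedged sketch of the margin criterion commonly used for this kind of mining: a candidate pair's cosine similarity is normalized by the average similarity of each sentence to its k nearest neighbours in the joint space, and pairs scoring above a threshold are kept. Names, dimensions, and the random embeddings are illustrative only.

```python
import numpy as np

def margin_score(x, y, nn_x, nn_y):
    """x, y: unit-norm sentence embeddings; nn_x, nn_y: (k, dim) arrays of
    each sentence's k nearest neighbours in the multilingual space."""
    cos = float(x @ y)
    denom = (np.mean(nn_x @ x) + np.mean(nn_y @ y)) / 2.0
    return cos / denom  # "ratio" margin; score > threshold => keep the pair

rng = np.random.default_rng(0)
unit = lambda v: v / np.linalg.norm(v)
x, y = unit(rng.normal(size=16)), unit(rng.normal(size=16))
nn_x = np.stack([unit(rng.normal(size=16)) for _ in range(4)])
nn_y = np.stack([unit(rng.normal(size=16)) for _ in range(4)])
print(margin_score(x, y, nn_x, nn_y))
```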
Author:
Alex Xiao, Geoffrey Zweig, Christian Fuegen, Ross Girshick, Yatharth Saraf, Abdelrahman Mohamed, Kritika Singh, Sergey Edunov, Vitaliy Liptchinsky, Vimal Manohar
Published in:
INTERSPEECH
Many semi- and weakly-supervised approaches have been investigated for overcoming the labeling cost of building high-quality speech recognition systems. On the challenging task of transcribing social media videos in low-resource conditions, we conduct…
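For context, one common self-labeling recipe in this space is pseudo-labeling: a seed model transcribes unlabelled clips and only confident hypotheses are kept as extra training data. The toy model and threshold below are placeholders, not the systems evaluated in the paper.

```python
import random

class ToyASRModel:
    """Stand-in for a seed ASR model; real systems return a hypothesis
    plus a confidence score (e.g. from the decoder posterior)."""
    def transcribe(self, clip):
        return f"hypothesis for {clip}", random.random()

def self_label(model, unlabelled, threshold=0.9):
    pseudo = []
    for clip in unlabelled:
        text, conf = model.transcribe(clip)
        if conf >= threshold:           # keep only confident hypotheses
            pseudo.append((clip, text))
    return pseudo                       # merged with labelled data, then retrain

print(self_label(ToyASRModel(), ["clip1", "clip2", "clip3"], threshold=0.5))
```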
Author:
Sergey Edunov, Wen-tau Yih, Danqi Chen, Vladimir Karpukhin, Sewon Min, Barlas Oguz, Patrick S. H. Lewis, Ledell Wu
Published in:
EMNLP (1)
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4586e4a9f0e9488de8d1bb2d039be7a3
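The core of this dense-retrieval recipe is a dual encoder: questions and passages are embedded independently and candidates are ranked by inner product, so the passage index can be built offline. The hash-seeded toy encoder below merely stands in for the paper's learned encoders and carries no semantics.

```python
import zlib
import numpy as np

def toy_encode(text, dim=64):
    """Deterministic stand-in for a learned dense encoder."""
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

passages = ["Paris is the capital of France.",
            "The mitochondrion is the powerhouse of the cell."]
index = np.stack([toy_encode(p) for p in passages])  # built offline, once

question = toy_encode("What is the capital of France?")
scores = index @ question                            # inner-product ranking
print(passages[int(np.argmax(scores))])              # top-scoring candidate
```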
Author:
Dmytro Okhonko, Ross Girshick, Frank Zhang, Geoffrey Zweig, Kritika Singh, Yatharth Saraf, Abdelrahman Mohamed, Sergey Edunov, Fuchun Peng, Jun Liu, Yongqiang Wang
Published in:
ICASSP
Supervised ASR models have reached unprecedented levels of accuracy, thanks in part to ever-increasing amounts of labelled training data. However, in many applications and locales, only moderate amounts of data are available, which has led to a surge…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f6b7554dc74d7b1b7b91c5acee0a20fe
http://arxiv.org/abs/1910.12367
Published in:
ACL
Back-translation is a widely used data augmentation technique which leverages target monolingual data. However, its effectiveness has been challenged since automatic metrics such as BLEU only show significant improvements for test examples where the source itself is a translation…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5bb95df2b53bdcc07bf8c55b97d2a183
http://arxiv.org/abs/1908.05204
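For reference, the back-translation setup under analysis looks roughly like the sketch below: a reverse-direction model translates target-side monolingual text into synthetic source sentences, and the resulting pairs are added to the genuine bitext. EchoModel is a trivial stand-in for a real reverse model.

```python
def back_translate(reverse_model, target_monolingual):
    """Pair each genuine target sentence with a synthetic source sentence."""
    return [(reverse_model.translate(tgt), tgt) for tgt in target_monolingual]

class EchoModel:
    """Trivial stand-in for a trained target->source translation model."""
    def translate(self, sentence):
        return sentence[::-1]

# the synthetic bitext is concatenated with the real parallel training data
print(back_translate(EchoModel(), ["guten Morgen", "vielen Dank"]))
```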
Published in:
NAACL-HLT (1)
Pre-trained language model representations have been successful in a wide range of language understanding tasks. In this paper, we examine different strategies to integrate pre-trained representations into sequence to sequence models and apply them to neural machine translation and abstractive summarization…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::670577900f9c63411b60e9409f07b981
http://arxiv.org/abs/1903.09722
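One integration strategy of the kind this entry examines is feeding frozen pre-trained representations into the sequence-to-sequence encoder as input features. The minimal PyTorch sketch below shows that shape of integration; the GRU modules, dimensions, and projection are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Seq2SeqWithPretrained(nn.Module):
    def __init__(self, pretrained_dim=768, model_dim=512, vocab=1000):
        super().__init__()
        self.proj = nn.Linear(pretrained_dim, model_dim)  # map LM states to model size
        self.encoder = nn.GRU(model_dim, model_dim, batch_first=True)
        self.decoder = nn.GRU(model_dim, model_dim, batch_first=True)
        self.out = nn.Linear(model_dim, vocab)

    def forward(self, lm_states, tgt_embeds):
        # lm_states: (batch, src_len, pretrained_dim) from a frozen pre-trained model
        enc_out, h = self.encoder(self.proj(lm_states))
        dec_out, _ = self.decoder(tgt_embeds, h)
        return self.out(dec_out)

model = Seq2SeqWithPretrained()
logits = model(torch.randn(2, 7, 768), torch.randn(2, 5, 512))
print(logits.shape)  # torch.Size([2, 5, 1000])
```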
Published in:
WMT@EMNLP
This paper describes Facebook AI's submission to the WMT20 shared news translation task. We focus on the low-resource setting and participate in two language pairs, Tamil ↔ English and Inuktitut ↔ English, where there are limited out-of-domain bitext and monolingual data…
Author:
Sam Gross, Alexei Baevski, Michael Auli, David Grangier, Nathan Ng, Myle Ott, Sergey Edunov, Angela Fan
Published in:
NAACL-HLT (Demonstrations)
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines…
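As a usage example, the snippet below loads one of the pre-trained translation models that fairseq exposes through torch.hub, following the pattern in the fairseq README; the model name and weight download are external dependencies.

```python
import torch

# Load a pre-trained WMT'19 En-De transformer via torch.hub (downloads weights)
en2de = torch.hub.load('pytorch/fairseq',
                       'transformer.wmt19.en-de.single_model',
                       tokenizer='moses', bpe='fastbpe')
print(en2de.translate('Hello world!'))  # e.g. 'Hallo Welt!'
```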
Published in:
EMNLP/IJCNLP (1)
We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems. Our model solves a cloze-style word reconstruction task, where each word is ablated and must be predicted given the rest of the text…
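To make the objective concrete, here is a toy enumeration of the cloze task: each word is ablated in turn and must be predicted from the remaining context. The generator below only produces the training examples; in the paper, a bi-directional transformer is the predictor.

```python
def cloze_examples(tokens, blank="<blank>"):
    """Yield (context, target) pairs: one word ablated per example."""
    for i, word in enumerate(tokens):
        yield tokens[:i] + [blank] + tokens[i + 1:], word

for context, target in cloze_examples("the cat sat on the mat".split()):
    print(context, "->", target)
```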