Showing 1 - 10 of 12 for search: '"Richard Yuanzhe Pang"'
Transformer-based models generally allocate the same amount of computation for each token in a given sequence. We develop a simple but effective "token dropping" method to accelerate the pretraining of transformer models, such as BERT, without degrading…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8e4ad3925fd4be8c6832a56cd767a0b9
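The record above describes speeding up BERT-style pretraining by spending less compute on unimportant tokens. Below is a minimal sketch of that general idea under stated assumptions: per-token importance scores (e.g. a running masked-LM loss) are assumed to be given, and "dropping" simply means keeping the top-scoring tokens for the middle encoder layers; it illustrates the idea, not the paper's exact procedure.

import numpy as np

def drop_tokens(hidden, importance, keep_ratio=0.5):
    """Keep only the highest-importance tokens for the middle encoder layers.

    hidden:     (seq_len, d_model) token representations
    importance: (seq_len,) per-token scores; how these are computed is an
                assumption here, not the paper's exact criterion
    Returns the reduced hidden states plus the kept indices, so dropped
    tokens can be re-inserted before the final layers.
    """
    n_keep = max(1, int(hidden.shape[0] * keep_ratio))
    kept = np.sort(np.argsort(-importance)[:n_keep])  # preserve original token order
    return hidden[kept], kept

rng = np.random.default_rng(0)
states, kept_idx = drop_tokens(rng.normal(size=(8, 4)), rng.random(8))
print(states.shape, kept_idx)  # (4, 4) and the indices of the retained tokens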
Author:
Phu Mon Htut, William C. Huang, Samuel R. Bowman, Haokun Liu, Jason Phang, Clara Vania, Richard Yuanzhe Pang, Kyunghyun Cho, Dhara A. Mungra
Published in:
ACL/IJCNLP (1)
Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks. Recent results from large pretrained models, though, show that many of these datasets are largely saturated…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a6312188ae4bcb9814b2733c9e88f024
http://arxiv.org/abs/2106.00840
Published in:
ACL/IJCNLP (Findings)
We aim to renew interest in a particular multi-document summarization (MDS) task which we call AgreeSum: agreement-oriented multi-document summarization. Given a cluster of articles, the goal is to provide abstractive summaries that represent information…
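As a concrete illustration of the task definition above, the toy structure below captures one AgreeSum-style instance: a cluster of articles plus a single abstractive summary that should hold for every article in the cluster. The field names and the entailment-style faithfulness check are placeholders chosen for this sketch, not the dataset's actual schema.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AgreeSumCluster:
    articles: List[str]   # the input cluster
    summary: str          # abstractive summary of information shared across the cluster

def cluster_faithful(cluster: AgreeSumCluster,
                     entails: Callable[[str, str], bool]) -> bool:
    # `entails(article, summary)` stands in for an article-summary entailment model.
    return all(entails(a, cluster.summary) for a in cluster.articles)

example = AgreeSumCluster(articles=["article 1 ...", "article 2 ..."], summary="shared facts ...")
print(cluster_faithful(example, entails=lambda art, summ: True))  # trivially True with a dummy checker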
Author:
Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel Bowman
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::deec7575f0315fb6b9b5b877d5291edc
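The record above introduces a long-document multiple-choice QA dataset. The sketch below shows one plausible shape for such an example and a crude check of whether the passage fits a typical encoder context window; the field names and the whitespace token count are simplifying assumptions, not the released schema.

from dataclasses import dataclass
from typing import List

@dataclass
class LongDocQAExample:
    passage: str        # article, roughly 5,000 tokens on average
    question: str
    options: List[str]  # answer choices
    label: int          # index of the correct option

def fits_context(example: LongDocQAExample, max_tokens: int = 512) -> bool:
    # Whitespace count is a rough proxy; real models use subword tokenizers.
    return len(example.passage.split()) <= max_tokens

ex = LongDocQAExample(passage="word " * 5000, question="What happened?",
                      options=["A", "B", "C", "D"], label=0)
print(fits_context(ex))  # False: the passage is far longer than a 512-token window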
Published in:
EMNLP (1)
Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition. We study the related issue of receiving infinite-length sequences…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f90a716e88887ab2e3f0d4a142ebec9b
http://arxiv.org/abs/2002.02492
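The record above concerns sequence models that, under common decoding rules, may never emit an end-of-sequence token. The toy loop below illustrates the failure mode under simplifying assumptions: a fixed next-token distribution stands in for a recurrent language model, plain top-k sampling can exclude EOS forever, and a variant that always keeps EOS among the candidates is one way to restore termination. It is an illustration, not the paper's construction.

import numpy as np

def topk_decode(step_probs, k=2, eos=0, max_len=50, keep_eos=False):
    """Toy top-k sampling from a fixed next-token distribution (a stand-in
    for a recurrent LM). If EOS never enters the top-k set, decoding only
    stops at the hard length cap; `keep_eos` forces EOS into the candidates."""
    rng = np.random.default_rng(0)
    out = []
    for _ in range(max_len):
        cand = np.argsort(-step_probs)[:k]
        if keep_eos and eos not in cand:
            cand = np.append(cand[:-1], eos)   # swap the weakest candidate for EOS
        p = step_probs[cand] / step_probs[cand].sum()
        tok = int(rng.choice(cand, p=p))
        if tok == eos:
            break
        out.append(tok)
    return out

probs = np.array([0.01, 0.50, 0.49])                 # EOS (id 0) is always outside the top 2
print(len(topk_decode(probs, k=2)))                  # 50: generation only stops at the length cap
print(len(topk_decode(probs, k=2, keep_eos=True)))   # can stop early once EOS is reachable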
Published in:
ACL
We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model. In particular, we view our non-autoregressive translation system as an inference network (Tu and Gimpel, 2018) trained…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1c74e41c77d0889a13589f5826de2cbb
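The record above trains a non-autoregressive translation model to minimize an energy given by a pretrained autoregressive model. The sketch below shows that objective in a stripped-down form, under heavy assumptions: two toy linear layers stand in for the inference network and the frozen autoregressive teacher, and the energy is taken to be the teacher's expected negative log-likelihood of the relaxed output. This is not the paper's actual architecture or training loop.

import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, d, tgt_len = 11, 8, 5
inference_net = nn.Linear(d, tgt_len * vocab)   # non-autoregressive "inference network"
teacher = nn.Linear(d, tgt_len * vocab)         # stand-in for a pretrained autoregressive model
for p in teacher.parameters():
    p.requires_grad_(False)                     # the teacher stays fixed

opt = torch.optim.Adam(inference_net.parameters(), lr=1e-3)
src = torch.randn(3, d)                         # a batch of 3 toy "source encodings"

q = torch.softmax(inference_net(src).view(3, tgt_len, vocab), dim=-1)    # relaxed output distribution
log_p = torch.log_softmax(teacher(src).view(3, tgt_len, vocab), dim=-1)  # teacher log-probs
energy = -(q * log_p).sum(dim=(-1, -2)).mean()  # expected teacher NLL of the relaxed output
opt.zero_grad(); energy.backward(); opt.step()  # one gradient step on the inference network only
print(float(energy))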
Published in:
SPNLP@EMNLP
Deep energy-based models are powerful, but pose challenges for learning and inference (Belanger and McCallum, 2016). Tu and Gimpel (2018) developed an efficient framework for energy-based models by training "inference networks" to approximate structured…
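The record above is about replacing expensive test-time inference in an energy-based model with a learned inference network. The toy comparison below uses a simple quadratic energy (an assumption made so the example runs quickly): variant (a) minimizes the energy by gradient descent on the output itself at test time, while variant (b) trains a small network offline to map inputs to low-energy outputs in a single forward pass.

import torch

torch.manual_seed(0)
d = 4
A = torch.randn(d, d)

def energy(x, y):
    # Toy quadratic energy ||A y - x||^2, chosen only to keep the example cheap.
    return ((y @ A.T - x) ** 2).sum(dim=-1)

x = torch.randn(2, d)

# (a) test-time inference by gradient descent directly on the output y
y = torch.zeros(2, d, requires_grad=True)
opt_y = torch.optim.SGD([y], lr=0.01)
for _ in range(500):
    opt_y.zero_grad(); energy(x, y).sum().backward(); opt_y.step()

# (b) an "inference network": trained once, then inference is a single forward pass
infnet = torch.nn.Linear(d, d)
opt_net = torch.optim.Adam(infnet.parameters(), lr=1e-2)
for _ in range(300):
    opt_net.zero_grad(); energy(x, infnet(x)).sum().backward(); opt_net.step()

print(energy(x, y).detach(), energy(x, infnet(x)).detach())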
Author:
Richard Yuanzhe Pang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Yada Pruksachatkun, Clara Vania, Katharina Kann, Samuel R. Bowman, Jason Phang
Published in:
ACL
While pretrained models such as BERT have shown large gains across natural language understanding tasks, their performance can be improved by further training the model on a data-rich intermediate task, before fine-tuning it on a target task. However…
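The record above describes the recipe of further training a pretrained model on a data-rich intermediate task before fine-tuning on the target task. The sketch below mirrors that two-stage pipeline under toy assumptions: a small linear layer stands in for the pretrained encoder, and random classification data stands in for the intermediate and target tasks.

import torch
import torch.nn as nn

torch.manual_seed(0)
encoder = nn.Linear(16, 16)   # stand-in for a pretrained encoder such as BERT

def finetune(encoder, n_classes, n_examples, steps=100):
    # Standard supervised fine-tuning with a fresh task-specific head;
    # the head is discarded and the adapted encoder is carried forward.
    head = nn.Linear(16, n_classes)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-2)
    x = torch.randn(n_examples, 16)
    y = torch.randint(0, n_classes, (n_examples,))
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.cross_entropy(head(encoder(x)), y).backward()
        opt.step()
    return encoder

encoder = finetune(encoder, n_classes=3, n_examples=512)  # intermediate task (data-rich)
encoder = finetune(encoder, n_classes=2, n_examples=32)   # target task (small)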
Author:
Richard Yuanzhe Pang, Kevin Gimpel
Published in:
NGT@EMNLP-IJCNLP
We consider the problem of automatically generating textual paraphrases with modified attributes or properties, focusing on the setting without parallel data (Hu et al., 2017; Shen et al., 2017). This setting poses challenges for evaluation. We show…
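The record above concerns evaluating attribute-controlled paraphrases when no parallel references exist. The sketch below shows one commonly used automatic check under stated assumptions: a style classifier scores whether outputs carry the target attribute, and a crude word-overlap score stands in for content preservation. Both components are placeholders for illustration, not the paper's proposed metrics.

from typing import Callable, List

def post_transfer_accuracy(outputs: List[str], target_label: int,
                           classify: Callable[[str], int]) -> float:
    # Fraction of outputs the (placeholder) style classifier assigns to the target attribute.
    return sum(classify(o) == target_label for o in outputs) / len(outputs)

def content_overlap(source: str, output: str) -> float:
    # Crude unigram overlap as a stand-in for a real content-preservation metric.
    s, o = set(source.lower().split()), set(output.lower().split())
    return len(s & o) / max(1, len(s))

sources = ["the movie was terrible", "the film was awful"]
outputs = ["the movie was wonderful", "the film was great"]
dummy_classifier = lambda text: 1 if ("wonderful" in text or "great" in text) else 0
print(post_transfer_accuracy(outputs, target_label=1, classify=dummy_classifier))
print([round(content_overlap(s, o), 2) for s, o in zip(sources, outputs)])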
Author:
Richard Yuanzhe Pang
Published in:
W-NUT@EMNLP
Regarding the problem of automatically generating paraphrases with modified styles or attributes, the difficulty lies in the lack of parallel corpora. Numerous advances have been proposed for the generation. However, significant problems remain with…