Showing 1 - 10 of 40 for the search: "Stoyanov, Veselin"
Large and sparse feed-forward layers (S-FFN) such as Mixture-of-Experts (MoE) have proven effective in scaling up Transformer model size for pretraining large language models. By only activating part of the FFN parameters conditioned on the input…
External link:
http://arxiv.org/abs/2305.13999
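The mechanism this abstract describes, activating only a subset of FFN parameters per input, is commonly implemented as top-k expert routing. Below is a minimal PyTorch sketch of such a layer; the expert count, top-2 routing, and dimensions are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Sparse FFN: each token is routed to top_k of n_experts expert FFNs."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router over experts
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):  # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)
        weights, idx = scores.topk(self.top_k, dim=-1)       # pick top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize gate weights
        out = torch.zeros_like(x)
        # Only the selected experts run for each token: conditional compute.
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

layer = MoEFeedForward()
print(layer(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```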
Author:
Chen, Mingda, Du, Jingfei, Pasunuru, Ramakanth, Mihaylov, Todor, Iyer, Srini, Stoyanov, Veselin, Kozareva, Zornitsa
Self-supervised pretraining has made few-shot learning possible for many NLP tasks. But the pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage…
External link:
http://arxiv.org/abs/2205.01703
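The snippet cuts off before the method details, but the general recipe of an intermediate self-supervised stage can be illustrated by packaging a self-supervised task into the input/output format used at few-shot inference time. The concrete task below (predicting a held-out final word) and the prompt template are illustrative assumptions, not the paper's objectives.

```python
import random

def make_episode(sentences, k=3, seed=0):
    """Build one k-shot 'in-context' training string from raw sentences."""
    rng = random.Random(seed)
    demos = rng.sample(sentences, k + 1)
    parts = []
    for s in demos:
        *ctx, target = s.split()  # self-supervised label: the held-out last word
        parts.append(f"Input: {' '.join(ctx)}\nOutput: {target}")
    # The model is trained to continue the last 'Output:'; the earlier pairs
    # act as in-context demonstrations, mirroring few-shot inference.
    return "\n\n".join(parts)

corpus = [
    "the cat sat on the mat",
    "language models learn from context",
    "pretraining objectives shape downstream behavior",
    "few shot learning needs good priors",
]
print(make_episode(corpus))
```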
Author:
Mahabadi, Rabeeh Karimi, Zettlemoyer, Luke, Henderson, James, Saeidi, Marzieh, Mathias, Lambert, Stoyanov, Veselin, Yazdani, Majid
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze format that the PLM can score. In this work, we propose PERFECT…
External link:
http://arxiv.org/abs/2204.01172
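For reference, the prompt-and-verbalizer cloze scoring that this abstract says PERFECT aims to remove looks roughly like the sketch below, using Hugging Face Transformers; the model choice, prompt template, and verbalizer words are illustrative assumptions.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base")

def cloze_score(text, verbalizers=(" great", " terrible")):
    """Score each verbalizer word at the mask position of a hand-written prompt."""
    prompt = f"{text} It was {tok.mask_token}."
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**inputs).logits
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    ids = [tok.encode(v, add_special_tokens=False)[0] for v in verbalizers]
    return {v: logits[0, mask_pos, i].item() for v, i in zip(verbalizers, ids)}

print(cloze_score("A gripping, beautifully shot film."))
```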
Author:
Lin, Xi Victoria, Mihaylov, Todor, Artetxe, Mikel, Wang, Tianlu, Chen, Shuohui, Simig, Daniel, Ott, Myle, Goyal, Naman, Bhosale, Shruti, Du, Jingfei, Pasunuru, Ramakanth, Shleifer, Sam, Koura, Punit Singh, Chaudhary, Vishrav, O'Horo, Brian, Wang, Jeff, Zettlemoyer, Luke, Kozareva, Zornitsa, Diab, Mona, Stoyanov, Veselin, Li, Xian
Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting their cross-lingual generalization…
External link:
http://arxiv.org/abs/2112.10668
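The few-shot evaluation setup implied here, k labeled demonstrations concatenated ahead of a test input, amounts to plain prompt construction; the template and the mixed-language examples below are illustrative assumptions, not taken from the paper.

```python
def few_shot_prompt(demos, query, template="{text} => {label}"):
    """Concatenate k demonstrations, then leave the query's label to be generated."""
    shots = [template.format(text=t, label=l) for t, l in demos]
    return "\n".join(shots + [template.format(text=query, label="").rstrip()])

demos = [
    ("This movie was fantastic.", "positive"),
    ("Ce film était ennuyeux.", "negative"),  # non-English demo in the context
]
print(few_shot_prompt(demos, "La película fue maravillosa."))
```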
Author:
Hase, Peter, Diab, Mona, Celikyilmaz, Asli, Li, Xian, Kozareva, Zornitsa, Stoyanov, Veselin, Bansal, Mohit, Iyer, Srinivasan
Do language models have beliefs about the world? Dennett (1995) famously argues that even thermostats have beliefs, on the view that a belief is simply an informational state decoupled from any motivational state. In this paper, we discuss approaches to detecting when models have beliefs about the world…
External link:
http://arxiv.org/abs/2111.13654
Author:
Maillard, Jean, Karpukhin, Vladimir, Petroni, Fabio, Yih, Wen-tau, Oğuz, Barlas, Stoyanov, Veselin, Ghosh, Gargi
Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking. Although neural retrieval outperforms traditional methods like tf-idf and BM25, its performance degrades considerably…
External link:
http://arxiv.org/abs/2101.00117
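As a point of reference for the comparison drawn in the abstract, BM25, one of the traditional baselines it names, fits in a few lines; k1 and b below are the usual defaults, and the toy corpus is illustrative.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the standard BM25 formula."""
    docs_tok = [d.lower().split() for d in docs]
    avgdl = sum(map(len, docs_tok)) / len(docs_tok)   # average document length
    df = Counter(t for d in docs_tok for t in set(d))  # document frequency per term
    N = len(docs_tok)
    scores = []
    for d in docs_tok:
        tf = Counter(d)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = ["open domain question answering", "fact checking over a large corpus"]
print(bm25_scores("large corpus retrieval", docs))
```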
The state of the art on many NLP tasks is currently achieved by large pre-trained language models, which require a considerable amount of computation. We explore a setting where many different predictions are made on a single piece of text. In that case…
External link:
http://arxiv.org/abs/2004.14287
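The amortization idea in this abstract, paying the encoder cost once per text and reusing the representation for many predictions, can be sketched as follows; the stand-in encoder and the task heads are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

d = 256
encoder = nn.Sequential(nn.Linear(300, d), nn.Tanh())  # stand-in for a frozen PLM encoder
heads = {task: nn.Linear(d, n_classes) for task, n_classes in
         [("sentiment", 2), ("topic", 20), ("toxicity", 2)]}  # cheap per-task heads

features = torch.randn(1, 300)  # stand-in for the features of one piece of text
with torch.no_grad():
    z = encoder(features)       # expensive step, done once per text
    preds = {task: head(z).argmax(-1).item() for task, head in heads.items()}
print(preds)                    # many predictions from a single encoder pass
```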
Recent breakthroughs in pretrained language models have shown the effectiveness of self-supervised learning for a wide range of natural language processing (NLP) tasks. In addition to standard syntactic and semantic NLP tasks, pretrained models achieve…
External link:
http://arxiv.org/abs/1912.09637
Author:
Nakov, Preslav, Kozareva, Zornitsa, Ritter, Alan, Rosenthal, Sara, Stoyanov, Veselin, Wilson, Theresa
Published in:
SemEval-2013
In recent years, sentiment analysis in social media has attracted a lot of research interest and has been used for a number of applications. Unfortunately, research has been hindered by the lack of suitable datasets, complicating the comparison between approaches…
External link:
http://arxiv.org/abs/1912.06806
Published in:
SemEval-2014
We describe the Sentiment Analysis in Twitter task, run as part of SemEval-2014. It is a continuation of last year's task, which ran successfully as part of SemEval-2013. As in 2013, this was the most popular SemEval task; a total of 46 teams contributed…
External link:
http://arxiv.org/abs/1912.02990