Showing 1 - 10 of 727 results for search: '"A. Kazienko"'
Large language models (LLMs) often necessitate extensive labeled datasets and training compute to achieve impressive performance across downstream tasks. This paper explores a self-training paradigm, where the LLM autonomously curates its own labels
External link:
http://arxiv.org/abs/2406.11275
Author:
Peng, Bo, Goldstein, Daniel, Anthony, Quentin, Albalak, Alon, Alcaide, Eric, Biderman, Stella, Cheah, Eugene, Du, Xingjian, Ferdinan, Teddy, Hou, Haowen, Kazienko, Przemysław, GV, Kranthi Kiran, Kocoń, Jan, Koptyra, Bartłomiej, Krishna, Satyapriya, McClelland Jr., Ronald, Lin, Jiaju, Muennighoff, Niklas, Obeid, Fares, Saito, Atsushi, Song, Guangyu, Tu, Haoqin, Wirawan, Cahya, Woźniak, Stanisław, Zhang, Ruichong, Zhao, Bingchen, Zhao, Qihang, Zhou, Peng, Zhu, Jian, Zhu, Rui-Jie
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity
External link:
http://arxiv.org/abs/2404.05892
Large language models (LLMs) have significantly advanced Natural Language Processing (NLP) tasks in recent years. However, their universal nature poses limitations in scenarios requiring personalized responses, such as recommendation systems and chat
External link:
http://arxiv.org/abs/2402.09269
We address the main problem of self-learning LLM: the question of what to learn. We propose a self-learning LLM framework that enables an LLM to independently learn previously unknown knowledge through self-assessment of their own hallucinations. We
External link:
http://arxiv.org/abs/2402.09147
The vast area of subjectivity in Natural Language Processing (NLP) poses a challenge to the solutions typically used in generalized tasks. As exploration in the scope of generalized NLP is much more advanced, it implies the tremendous gap that is sti
External link:
http://arxiv.org/abs/2312.11296
Author:
Kanclerz, Kamil, Bielaniewicz, Julita, Gruza, Marcin, Kocoń, Jan, Woźniak, Stanisław, Kazienko, Przemysław
Data annotated by humans is a source of knowledge by describing the peculiarities of the problem and therefore fueling the decision process of the trained model. Unfortunately, the annotation process for subjective natural language processing (NLP) p
External link:
http://arxiv.org/abs/2312.08198
Author:
Miłkowski, Piotr, Karanowski, Konrad, Wielopolski, Patryk, Kocoń, Jan, Kazienko, Przemysław, Zięba, Maciej
Designing predictive models for subjective problems in natural language processing (NLP) remains challenging. This is mainly due to its non-deterministic nature and different perceptions of the content by different humans. It may be solved by Persona
External link:
http://arxiv.org/abs/2312.06034
Author:
Avramidis, Kleanthis, Kunc, Dominika, Perz, Bartosz, Adsul, Kranti, Feng, Tiantian, Kazienko, Przemysław, Saganowski, Stanisław, Narayanan, Shrikanth
Ubiquitous sensing from wearable devices in the wild holds promise for enhancing human well-being, from diagnosing clinical conditions and measuring stress to building adaptive health promoting scaffolds. But the large volumes of data therein across
External link:
http://arxiv.org/abs/2309.15292
Author:
Peng, Bo, Alcaide, Eric, Anthony, Quentin, Albalak, Alon, Arcadinho, Samuel, Biderman, Stella, Cao, Huanqi, Cheng, Xin, Chung, Michael, Grella, Matteo, GV, Kranthi Kiran, He, Xuzheng, Hou, Haowen, Lin, Jiaju, Kazienko, Przemysław, Kocoń, Jan, Kong, Jiaming, Koptyra, Bartłomiej, Lau, Hayden, Mantri, Krishna Sri Ipsit, Mom, Ferdinand, Saito, Atsushi, Song, Guangyu, Tang, Xiangru, Wang, Bolun, Wind, Johan S., Woźniak, Stanisław, Zhang, Ruichong, Zhang, Zhenyuan, Zhao, Qihang, Zhou, Peng, Zhou, Qinghua, Zhu, Jian, Zhu, Rui-Jie
Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scali
External link:
http://arxiv.org/abs/2305.13048
Author:
Kocoń, Jan, Cichecki, Igor, Kaszyca, Oliwier, Kochanek, Mateusz, Szydło, Dominika, Baran, Joanna, Bielaniewicz, Julita, Gruza, Marcin, Janz, Arkadiusz, Kanclerz, Kamil, Kocoń, Anna, Koptyra, Bartłomiej, Mieleszczenko-Kowszewicz, Wiktoria, Miłkowski, Piotr, Oleksy, Marcin, Piasecki, Maciej, Radliński, Łukasz, Wojtasik, Konrad, Woźniak, Stanisław, Kazienko, Przemysław
Published in:
Information Fusion 101861 (2023)
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. Several publications on ChatGPT evaluation test its effectiveness on well-known natural l
External link:
http://arxiv.org/abs/2302.10724