Showing 1 - 6 of 6 for search: '"Pióro, Maciej"'
A new breed of gated-linear recurrent neural networks has reached state-of-the-art performance on a range of sequence modeling problems. Such models naturally handle long sequences efficiently, as the cost of processing a new input is independent of
External link:
http://arxiv.org/abs/2406.08423
Author:
Krajewski, Jakub, Ludziejewski, Jan, Adamczewski, Kamil, Pióro, Maciej, Krutul, Michał, Antoniak, Szymon, Ciebiera, Kamil, Król, Krystian, Odrzygóźdź, Tomasz, Sankowski, Piotr, Cygan, Marek, Jaszczur, Sebastian
Mixture of Experts (MoE) models have emerged as a primary solution for reducing the computational cost of Large Language Models. In this work, we analyze their scaling properties, incorporating an expanded range of variables. Specifically, we introdu
External link:
http://arxiv.org/abs/2402.07871
Author:
Pióro, Maciej, Ciebiera, Kamil, Król, Krystian, Ludziejewski, Jan, Krutul, Michał, Krajewski, Jakub, Antoniak, Szymon, Miłoś, Piotr, Cygan, Marek, Jaszczur, Sebastian
State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based Large Language Models, in
External link:
http://arxiv.org/abs/2401.04081
Author:
Antoniak, Szymon, Krutul, Michał, Pióro, Maciej, Krajewski, Jakub, Ludziejewski, Jan, Ciebiera, Kamil, Król, Krystian, Odrzygóźdź, Tomasz, Cygan, Marek, Jaszczur, Sebastian
Mixture of Experts (MoE) models based on Transformer architecture are pushing the boundaries of language and vision tasks. The allure of these models lies in their ability to substantially increase the parameter count without a corresponding increase
External link:
http://arxiv.org/abs/2310.15961
Author:
Ignatov, Andrey, Malivenko, Grigory, Timofte, Radu, Treszczotko, Lukasz, Chang, Xin, Ksiazek, Piotr, Lopuszynski, Michal, Pioro, Maciej, Rudnicki, Rafal, Smyl, Maciej, Ma, Yujie, Li, Zhenyu, Chen, Zehui, Xu, Jialei, Liu, Xianming, Jiang, Junjun, Shi, XueChao, Xu, Difan, Li, Yanan, Wang, Xiaotao, Lei, Lei, Zhang, Ziyu, Wang, Yicheng, Huang, Zilong, Luo, Guozhong, Yu, Gang, Fu, Bin, Li, Jiaqi, Wang, Yiran, Huang, Zihao, Cao, Zhiguo, Conde, Marcos V., Sapozhnikov, Denis, Lee, Byeong Hyun, Park, Dongwon, Hong, Seongmin, Lee, Joonhee, Lee, Seunggyu, Chun, Se Young
Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking and many other mobile tasks. Thus, it is very crucial to have efficient and accurate depth estimation m
External link:
http://arxiv.org/abs/2211.04470
Academic article
This result cannot be displayed to users who are not logged in.