Search results - "Pióro, Maciej"

Report

State Soup: In-Context Skill Learning, Retrieval and Mixing

Authors: Pióro, Maciej; Wołczyk, Maciej; Pascanu, Razvan; von Oswald, Johannes; Sacramento, João

A new breed of gated-linear recurrent neural networks has reached state-of-the-art performance on a range of sequence modeling problems. Such models naturally handle long sequences efficiently, as the cost of processing a new input is independent of

External link: http://arxiv.org/abs/2406.08423

Report

Scaling Laws for Fine-Grained Mixture of Experts

Authors: Krajewski, Jakub; Ludziejewski, Jan; Adamczewski, Kamil; Pióro, Maciej; Krutul, Michał; Antoniak, Szymon; Ciebiera, Kamil; Król, Krystian; Odrzygóźdź, Tomasz; Sankowski, Piotr; Cygan, Marek; Jaszczur, Sebastian

Mixture of Experts (MoE) models have emerged as a primary solution for reducing the computational cost of Large Language Models. In this work, we analyze their scaling properties, incorporating an expanded range of variables. Specifically, we introdu

External link: http://arxiv.org/abs/2402.07871

Report

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Authors: Pióro, Maciej; Ciebiera, Kamil; Król, Krystian; Ludziejewski, Jan; Krutul, Michał; Krajewski, Jakub; Antoniak, Szymon; Miłoś, Piotr; Cygan, Marek; Jaszczur, Sebastian

State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based Large Language Models, in

External link: http://arxiv.org/abs/2401.04081

Report

Mixture of Tokens: Continuous MoE through Cross-Example Aggregation

Authors: Antoniak, Szymon; Krutul, Michał; Pióro, Maciej; Krajewski, Jakub; Ludziejewski, Jan; Ciebiera, Kamil; Król, Krystian; Odrzygóźdź, Tomasz; Cygan, Marek; Jaszczur, Sebastian

Mixture of Experts (MoE) models based on Transformer architecture are pushing the boundaries of language and vision tasks. The allure of these models lies in their ability to substantially increase the parameter count without a corresponding increase

External link: http://arxiv.org/abs/2310.15961

Report

Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report

Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking and many other mobile tasks. Thus, it is very crucial to have efficient and accurate depth estimation m

External link: http://arxiv.org/abs/2211.04470
