Showing 1 - 2 of 2 for search: '"Chowdhury, Mohammed Nowaz Rabbani"'
Author:
Chowdhury, Mohammed Nowaz Rabbani, Wang, Meng, Maghraoui, Kaoutar El, Wang, Naigang, Chen, Pin-Yu, Carothers, Christopher
Published in:
The 41st International Conference on Machine Learning, ICML 2024
The sparsely gated mixture of experts (MoE) architecture sends different inputs to different subnetworks, i.e., experts, through trainable routers. MoE reduces the training computation significantly for large models, but its deployment can still be…
External link:
http://arxiv.org/abs/2405.16646
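To illustrate the routing mechanism this abstract describes, here is a minimal sketch of a sparsely gated MoE layer with a trainable top-k router. The names (SparseMoE, d_model, n_experts) and the feed-forward expert architecture are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Sketch of a sparsely gated MoE layer: a trainable router sends each
    input token to its top-k experts, so only k of n_experts subnetworks
    run per token (the source of the compute savings the abstract notes)."""

    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # trainable router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); pick top-k experts per token
        gate_logits = self.router(x)
        weights, idx = torch.topk(gate_logits, self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue  # expert receives no tokens: no compute spent
            out[token_ids] += (weights[token_ids, slot].unsqueeze(-1)
                               * expert(x[token_ids]))
        return out

moe = SparseMoE(d_model=64, n_experts=8, k=2)
y = moe(torch.randn(16, 64))  # each of the 16 tokens visits only 2 experts
```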
Published in:
The 40th International Conference on Machine Learning (ICML), 2023
In deep learning, mixture-of-experts (MoE) activates one or a few experts (sub-networks) on a per-sample or per-token basis, resulting in significant computation reduction. The recently proposed patch-level routing in MoE (pMoE)…
External link:
http://arxiv.org/abs/2306.04073
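As a companion to this abstract, here is a minimal sketch of patch-level routing: the input is split into patches, and each expert processes only its top-l highest-scoring patches. The class name PatchMoE, the linear experts, and the mean-pooling aggregation are assumptions for illustration, not the authors' exact pMoE formulation.

```python
import torch
import torch.nn as nn

class PatchMoE(nn.Module):
    """Sketch of patch-level routing: a trainable router scores every
    patch for every expert, and each expert runs on only its top-l
    patches, so most patch-expert pairs incur no computation."""

    def __init__(self, d_patch: int, n_experts: int, l: int):
        super().__init__()
        self.l = l                                    # patches kept per expert
        self.router = nn.Linear(d_patch, n_experts)   # trainable patch router
        self.experts = nn.ModuleList([
            nn.Linear(d_patch, d_patch) for _ in range(n_experts)
        ])

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, n_patches, d_patch)
        scores = self.router(patches).softmax(dim=-1)   # (B, P, E)
        top, idx = scores.topk(self.l, dim=1)           # top-l patches per expert
        outs = []
        for e, expert in enumerate(self.experts):
            # gather the l patches routed to expert e
            sel = torch.gather(
                patches, 1,
                idx[:, :, e].unsqueeze(-1).expand(-1, -1, patches.size(-1)))
            outs.append(expert(sel) * top[:, :, e].unsqueeze(-1))
        # aggregate expert outputs, here by mean pooling over patches
        return torch.cat(outs, dim=1).mean(dim=1)       # (B, d_patch)

pmoe = PatchMoE(d_patch=32, n_experts=4, l=2)
out = pmoe(torch.randn(8, 49, 32))  # e.g., a 7x7 grid of image patches
```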