Showing 1 - 1 of 1 for search: '"Cohn, Gabrielle"'
We introduce EELBERT, an approach for compression of transformer-based models (e.g., BERT), with minimal impact on the accuracy of downstream tasks. This is achieved by replacing the input embedding layer of the model with dynamic, i.e. on-the-fly, e
External link:
http://arxiv.org/abs/2310.20144