Showing 1 - 1 of 1 for search: '"Bozic, Vukasin"'
This work presents an analysis of the effectiveness of using standard shallow feed-forward networks to mimic the behavior of the attention mechanism in the original Transformer model, a state-of-the-art architecture for sequence-to-sequence tasks. We …
External link:
http://arxiv.org/abs/2311.10642
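The abstract describes substituting the attention mechanism with a standard shallow feed-forward network. As a rough illustration of that idea only (not the paper's actual architecture or hyperparameters), the sketch below replaces self-attention with a two-layer feed-forward network operating on the flattened token sequence; the class name, `seq_len` limit, and `hidden_dim` are hypothetical choices for the example.

```python
# Minimal sketch: a shallow feed-forward block standing in for self-attention.
# Assumes a fixed maximum sequence length; all sizes here are illustrative.
import torch
import torch.nn as nn


class FeedForwardAttentionSubstitute(nn.Module):
    """Shallow FFN that maps a fixed-length token sequence to its
    "attended" representation, mimicking a self-attention block."""

    def __init__(self, seq_len: int, d_model: int, hidden_dim: int = 1024):
        super().__init__()
        self.seq_len = seq_len
        self.d_model = d_model
        self.net = nn.Sequential(
            nn.Linear(seq_len * d_model, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, seq_len * d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten, transform, reshape back
        batch = x.size(0)
        out = self.net(x.reshape(batch, -1))
        return out.reshape(batch, self.seq_len, self.d_model)


if __name__ == "__main__":
    block = FeedForwardAttentionSubstitute(seq_len=32, d_model=64)
    tokens = torch.randn(8, 32, 64)
    print(block(tokens).shape)  # torch.Size([8, 32, 64])
```

One consequence of this design, visible in the sketch, is that the input length must be fixed in advance, unlike attention, which handles variable-length sequences natively.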