A Lightweight Transformer with Convolutional Attention
Author: Kungan Zeng, Incheon Paik
Year of publication: 2020
Subject: Structure (mathematical logic), Machine translation, Computer science, Deep learning, Networking & telecommunications, Construct (Python library), Convolution, Recurrent neural network, Computer engineering, Electrical engineering, Electronic engineering, Information engineering, Artificial intelligence, Decoding methods, Transformer (machine learning model)
Source: iCAST
DOI: 10.1109/icast51195.2020.9319489
Abstract: Neural machine translation (NMT) has developed rapidly thanks to the application of various deep learning techniques, and how to construct more effective NMT architectures attracts growing attention. The Transformer is a state-of-the-art NMT architecture: it relies entirely on the self-attention mechanism instead of recurrent neural networks (RNNs). Multi-head attention is the crucial component that implements self-attention, and it also strongly affects the size of the model. In this paper, we present a new multi-head attention that incorporates convolution operations. Compared with the base Transformer, our approach reduces the number of parameters effectively. Experiments show that the performance of the new model is similar to that of the base model.
Database: OpenAIRE
External link:
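
The abstract above does not spell out how the convolution operation is combined with multi-head attention, so the following is only an illustrative sketch, not the authors' method. It assumes (hypothetically) that the dense Q/K/V projections of standard multi-head attention are replaced by depthwise 1D convolutions, which is one common way to cut parameters; the class name `ConvMultiHeadAttention` and parameters such as `kernel_size` are invented for this example.

```python
# Sketch of a parameter-light multi-head attention using depthwise Conv1d
# projections. ASSUMPTION: the paper's exact construction is not given in the
# abstract; depthwise convolutions are used here purely for illustration.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvMultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 512, num_heads: int = 8, kernel_size: int = 3):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads

        def depthwise_conv():
            # groups=d_model makes the conv depthwise: d_model * kernel_size weights,
            # versus d_model * d_model for a dense nn.Linear projection.
            return nn.Conv1d(d_model, d_model, kernel_size,
                             padding=kernel_size // 2, groups=d_model)

        self.q_proj = depthwise_conv()
        self.k_proj = depthwise_conv()
        self.v_proj = depthwise_conv()
        # Output projection kept dense, as in the base Transformer.
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        xc = x.transpose(1, 2)  # (batch, d_model, seq_len) for Conv1d

        def split_heads(z):
            # (batch, d_model, seq_len) -> (batch, heads, seq_len, d_head)
            return z.view(b, self.num_heads, self.d_head, t).transpose(2, 3)

        q = split_heads(self.q_proj(xc))
        k = split_heads(self.k_proj(xc))
        v = split_heads(self.v_proj(xc))

        # Scaled dot-product attention, unchanged from the base Transformer.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)

        out = attn @ v                                   # (batch, heads, seq_len, d_head)
        out = out.transpose(2, 3).reshape(b, d, t).transpose(1, 2)
        return self.out_proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)
    block = ConvMultiHeadAttention()
    print(block(x).shape)  # torch.Size([2, 16, 512])
    print(sum(p.numel() for p in block.parameters()))  # far fewer than 4 * 512 * 512
```

Under these assumptions, each dense projection in the base model costs d_model² ≈ 262k parameters at d_model = 512, while a depthwise convolution with kernel size 3 costs only 3 · 512 ≈ 1.5k, which illustrates how folding convolution into the attention block can shrink the model while leaving the attention computation itself unchanged.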