Improving Universal Language Model Fine-Tuning using Attention Mechanism
Author: | David Macêdo, Cleber Zanchettin, Flávio Arthur O. Santos, K. L. Ponce-Guevara |
Year of publication: | 2019 |
Subject: | Sequence; Pooling; Machine learning; Inductive transfer; Embedding; Language model; Artificial intelligence; Transfer of learning; Classifier; Word |
Source: | IJCNN |
DOI: | 10.1109/ijcnn.2019.8852398 |
Description: | Inductive transfer learning is widespread in computer vision applications. However, it remains under-explored in natural language processing (NLP), where the most common transfer learning method is the use of pre-trained word embeddings. Universal Language Model Fine-Tuning (ULMFiT) is a recent approach that trains a language model and transfers its knowledge to a final classifier. During the classification step, ULMFiT uses max and average pooling layers to select the useful information from an embedding sequence. We propose to replace the max and average pooling layers with a soft attention mechanism. The goal is to learn which information in the embedding sequence is most important, rather than assuming that the max and average values are the most informative. We evaluate the proposed approach on six datasets and achieve the best performance on all of them compared with approaches from the literature. |
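A minimal PyTorch sketch of the soft attention pooling the abstract describes, assuming a single linear scoring layer over the language model's hidden states; the paper's exact parameterization, layer sizes, and names here are illustrative, not taken from the source:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttentionPooling(nn.Module):
    """Collapse a sequence of embeddings into one vector with learned
    soft attention weights, instead of max/average pooling."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Hypothetical scorer: one scalar relevance score per time step.
        self.scorer = nn.Linear(embed_dim, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, embed_dim) hidden states from the LM encoder
        scores = self.scorer(h).squeeze(-1)      # (batch, seq_len)
        weights = F.softmax(scores, dim=-1)      # attention distribution
        # Weighted sum over the sequence dimension -> (batch, embed_dim)
        return torch.bmm(weights.unsqueeze(1), h).squeeze(1)

# Usage with made-up dimensions (batch of 4, length 10, 400-dim states):
pool = SoftAttentionPooling(400)
pooled = pool(torch.randn(4, 10, 400))  # shape: (4, 400)
```

The weighted sum lets the classifier learn, per sequence, which positions carry the signal, whereas max/average pooling fixes that choice in advance.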
Database: | OpenAIRE |
External link: |