Improving Universal Language Model Fine-Tuning using Attention Mechanism
Author: | David Macêdo, Cleber Zanchettin, Flávio Arthur O. Santos, K. L. Ponce-Guevara |
Year of publication: | 2019 |
Subject: | Sequence; Pooling; Machine learning; Inductive transfer; Embedding; Language model; Artificial intelligence; Transfer of learning; Classifier; Word |
Source: | IJCNN |
DOI: | 10.1109/ijcnn.2019.8852398 |
Description: | Inductive transfer learning is widespread in computer vision applications. However, it remains under-explored in natural language processing (NLP), where the most common transfer learning method is the use of pre-trained word embeddings. Universal Language Model Fine-Tuning (ULMFiT) is a recent approach that trains a language model and transfers its knowledge to a final classifier. During the classification step, ULMFiT uses max and average pooling layers to select the useful information from an embedding sequence. We propose to replace the max and average pooling layers with a soft attention mechanism. The goal is to learn which information in the embedding sequence is most important, rather than assuming that the max and average values are the most informative. We evaluate the proposed approach on six datasets and achieve the best performance on all of them compared with approaches from the literature. |
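A minimal PyTorch sketch of the soft attention pooling the abstract describes, assuming a single linear scoring layer over the language model's hidden states; the paper's exact parameterization, layer sizes, and names here are illustrative, not taken from the source:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttentionPooling(nn.Module):
    """Collapse a sequence of embeddings into one vector with learned
    soft attention weights, instead of max/average pooling."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Hypothetical scorer: one scalar relevance score per time step.
        self.scorer = nn.Linear(embed_dim, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, embed_dim) hidden states from the LM encoder
        scores = self.scorer(h).squeeze(-1)      # (batch, seq_len)
        weights = F.softmax(scores, dim=-1)      # attention distribution
        # Weighted sum over the sequence dimension -> (batch, embed_dim)
        return torch.bmm(weights.unsqueeze(1), h).squeeze(1)

# Usage with made-up dimensions (batch of 4, length 10, 400-dim states):
pool = SoftAttentionPooling(400)
pooled = pool(torch.randn(4, 10, 400))  # shape: (4, 400)
```

The weighted sum lets the classifier learn, per sequence, which positions carry the signal, whereas max/average pooling fixes that choice in advance.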
Database: | OpenAIRE |
External link: |