Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

Author:	Campos, Víctor; Jou, Brendan; Giró Nieto, Xavier; Torres Viñals, Jordi; Chang, Shih-Fu
Contributors:	Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions; Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. GPI - Grup de Processament d'Imatge i Vídeo; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions; Barcelona Supercomputing Center
Language:	English
Publication year:	2017
Subject:	FOS: Computer and information sciences; Matemàtiques i estadística::Anàlisi numèrica::Modelització matemàtica [Àrees temàtiques de la UPC]; Grafs, Teoria de; Computer Science - Artificial Intelligence; Computer Vision and Pattern Recognition (cs.CV); Adaptive Computation; Computer Science - Computer Vision and Pattern Recognition; Knowledge representation (Information theory); Graph theory; Neural networks (Computer science); Artificial Intelligence (cs.AI); Deep Learning; Natural language processing (Computer science); Representació del coneixement (Teoria de la informació); Xarxes neuronals (Informàtica); dynamic learning; conditional computation; High performance computing; Tractament del llenguatge natural (Informàtica); Informàtica::Intel·ligència artificial::Llenguatge natural [Àrees temàtiques de la UPC]; Càlcul intensiu (Informàtica); Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]; Recurrent Neural Networks
Source:	UPCommons. Portal del coneixement obert de la UPC; Universitat Politècnica de Catalunya (UPC); Recercat. Dipósit de la Recerca de Catalunya
Description:	Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks. However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty capturing long-term dependencies. In backpropagation-through-time settings, these issues are tightly coupled with the large, sequential computational graph that results from unfolding the RNN in time. We introduce the Skip RNN model, which extends existing RNN models by learning to skip state updates, thereby shortening the effective size of the computational graph. This model can also be encouraged to perform fewer state updates through a budget constraint. We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of the baseline RNN models. Source code is publicly available at https://imatge-upc.github.io/skiprnn-2017-telecombcn/ . Accepted as a conference paper at ICLR 2018.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f3333f824d6de6d6a7b097bc4438bdf8 http://arxiv.org/abs/1708.06834