Compression of Neural Machine Translation Models via Pruning

Authors:	Abigail See, Christopher D. Manning, Minh-Thang Luong
Publication year:	2016
Subject:	FOS: Computer and information sciences; Computer Science - Computation and Language; Machine translation; Computer science; Computer Science - Artificial Intelligence; Deep learning; Computer Science - Neural and Evolutionary Computing; 020206 networking & telecommunications; Pattern recognition; 02 engineering and technology; Translation (geometry); 020202 computer hardware & architecture; Task (computing); Artificial Intelligence (cs.AI); Compression (functional analysis); 0202 electrical engineering, electronic engineering, information engineering; Redundancy (engineering); Pruning (decision trees); Artificial intelligence; Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL)
Source:	CoNLL
DOI:	10.48550/arxiv.1606.09274
Description:	Neural Machine Translation (NMT), like many other deep learning domains, typically suffers from over-parameterization, resulting in large storage sizes. This paper examines three simple magnitude-based pruning schemes to compress NMT models, namely class-blind, class-uniform, and class-distribution, which differ in terms of how pruning thresholds are computed for the different classes of weights in the NMT architecture. We demonstrate the efficacy of weight pruning as a compression technique for a state-of-the-art NMT system. We show that an NMT model with over 200 million parameters can be pruned by 40% with very little performance loss as measured on the WMT'14 English-German translation task. This sheds light on the distribution of redundancy in the NMT architecture. Our main result is that with retraining, we can recover and even surpass the original performance with an 80%-pruned model. Comment: Accepted to CoNLL 2016. 9 pages plus references.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::56189167356fdc4c068d0ad236ae3139
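
The description says the pruning schemes differ in how thresholds are computed for the different classes of weights. The following is a minimal sketch of that distinction for two of the schemes, not the authors' implementation. It assumes that class-blind pruning ranks all weights together and that class-uniform pruning applies the same pruning fraction inside each class. The class names `embedding` and `recurrent`, the toy weights, and all function names are hypothetical. The class-distribution scheme is not sketched here.

```python
def prune_smallest(weights, k):
    """Return a copy of `weights` with the k smallest-magnitude entries set to 0.0."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


def class_uniform(classes, fraction):
    """Assumed class-uniform scheme: prune the same fraction inside every class."""
    return {
        name: prune_smallest(ws, int(fraction * len(ws)))
        for name, ws in classes.items()
    }


def class_blind(classes, fraction):
    """Assumed class-blind scheme: rank all weights together, ignoring class."""
    flat = [
        (abs(w), name, i)
        for name, ws in classes.items()
        for i, w in enumerate(ws)
    ]
    flat.sort(key=lambda t: t[0])
    k = int(fraction * len(flat))
    pruned = {name: list(ws) for name, ws in classes.items()}
    for _, name, i in flat[:k]:
        pruned[name][i] = 0.0
    return pruned


# Hypothetical toy weight classes; real NMT classes hold millions of weights.
classes = {
    "embedding": [0.9, -0.8, 0.7, -0.6],
    "recurrent": [0.05, -0.04, 0.5, 0.03],
}

blind = class_blind(classes, 0.5)
uniform = class_uniform(classes, 0.5)
```

On this toy input, class-blind pruning at 50% zeroes every `recurrent` weight and leaves `embedding` untouched, while class-uniform pruning zeroes two weights in each class. This illustrates why the choice of scheme affects where in the architecture the pruning falls.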