Fine-grained analysis of the transformer model for efficient pruning
Authors: Leila Ben Letaifa, Jean-Luc Rouas
Contributors: Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS), European Project: 101016776
Language: English
Year of publication: 2022
Subject:
Source: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Dec 2022, Nassau, Bahamas. pp. 897-902, ⟨10.1109/ICMLA55696.2022.00149⟩
DOI: 10.1109/ICMLA55696.2022.00149
Description: In automatic speech recognition, deep learning models such as transformers are increasingly used for their high performance. However, their large size makes them very difficult to deploy in real-world settings, hence the idea of pruning them. Conventional pruning methods are suboptimal and sometimes inefficient, since they operate blindly without taking into account the nature of the layers, their number of parameters, or their distribution. In this work, we propose a fine-grained analysis of the transformer model's layers in order to determine the most efficient pruning approach. We show that some layers are more appropriate to prune than others, and we underline the importance of knowing the behavior of the layers when choosing a pruning approach (see the code sketch after this record).
Database: OpenAIRE
External link:
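The abstract argues that pruning should account for the nature of each layer and where the parameters actually sit, rather than applying one global rate. Below is a minimal, hypothetical sketch of such layer-wise magnitude pruning in PyTorch; the toy model, the `rates` values, and the layer-name matching are illustrative assumptions, not the configuration or results reported in the paper.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-in for a transformer encoder; any nn.Module works the same way.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, dim_feedforward=1024),
    num_layers=6,
)

# Hypothetical per-layer-type sparsity targets: a single global rate is
# blind to layer type, so attention projections and feed-forward linears
# get different amounts of pruning here.
rates = {"self_attn": 0.2, "linear": 0.5}

for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        # Feed-forward sublayers hold most of the parameters, so they are
        # pruned harder than the attention output projections.
        amount = rates["self_attn"] if "self_attn" in name else rates["linear"]
        prune.l1_unstructured(module, name="weight", amount=amount)
        prune.remove(module, "weight")  # make the sparsity permanent

# Report the resulting sparsity of each pruned layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        w = module.weight
        print(f"{name}: {100.0 * (w == 0).sum().item() / w.numel():.1f}% zeros")
```

Unstructured L1 pruning zeroes individual small-magnitude weights; the per-module loop makes the paper's broader point concrete, namely that both the amount of pruning and the layers it is applied to are choices that should follow from an analysis of the layers themselves.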