Resource Efficient AI: Exploring Neural Network Pruning for Task Specialization

Authors:	Dieter Balemans, Philippe Reiter, Jan Steckel, Peter Hellinckx
Publication year:	2022
Subject:	Computer. Automation; History; Polymers and Plastics; Industrial and Manufacturing Engineering; Computer Science Applications; Artificial Intelligence; Hardware and Architecture; Management of Technology and Innovation; Mass communications; Computer Science (miscellaneous); Business and International Management; Engineering (miscellaneous); Software; Information Systems
Source:	Internet of Things
ISSN:	1556-5068 2542-6605
DOI:	10.2139/ssrn.4158433
Description:	This paper explores the use of neural network pruning in transfer learning applications to achieve more resource-efficient inference. The goal is to focus and optimize a neural network on a smaller, specialized target task. With the advent of IoT, we have seen an immense increase in AI-based applications on mobile and embedded devices, such as wearables and other smart appliances. However, with the ever-increasing complexity and capabilities of machine learning algorithms, this push to the edge has led to new challenges due to the limited resources available on these devices. Some form of compression is needed to run state-of-the-art convolutional neural networks on edge devices. In this work, we adapt existing neural network pruning methods so that they specialize networks to focus on only a subset of what they were originally trained for. This is a transfer learning use case in which we optimize large pre-trained networks. It differs from standard optimization techniques in that the network is allowed to forget certain concepts, which makes its footprint even smaller. We compare different pruning criteria, including one from the field of Explainable AI (XAI), to determine which technique yields the smallest possible network while maintaining high performance on the target task. Our results show the benefits of network specialization when executing neural networks on embedded devices, both with and without GPU acceleration.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ec1ddfc3c10882572669e1b2eb120acd https://doi.org/10.2139/ssrn.4158433