Mayo

Authors: Yiren Zhao, Robert Mullins, Cheng-Zhong Xu, Xitong Gao
Publication year: 2018
Source: EMDL@MobiSys
DOI: 10.1145/3212725.3212726
Description: Deep Neural Networks (DNNs) have proved to be a convenient and powerful tool for a wide range of problems. However, their extensive computational and memory resource requirements hinder the adoption of DNNs in resource-constrained scenarios. Existing compression methods have been shown to significantly reduce the computation and memory requirements of many popular DNNs. These methods, however, remain elusive to non-experts, as they demand extensive manual tuning of hyperparameters. The effects of combining various compression techniques also lack exploration because of the large design space. To alleviate these challenges, this paper proposes an automated framework, Mayo, which is built on top of TensorFlow and can compress DNNs with minimal human intervention. First, we present overriders, which are recursively compositional and can be configured to effectively compress individual components (e.g. weights, biases, layer computations and gradients) in a DNN. Second, we introduce novel heuristics and a global search algorithm to efficiently optimize hyperparameters. We demonstrate that without any manual tuning, Mayo generates a sparse ResNet-18 that is 5.13× smaller than the baseline with no loss in test accuracy. By composing multiple overriders, our tool produces a sparse 6-bit CIFAR-10 classifier with only 0.16% top-1 accuracy loss and a 34× compression rate. Mayo and all compressed models are publicly available. To our knowledge, Mayo is the first framework that supports overlapping multiple compression techniques and automatically optimizes hyperparameters in them.
Database: OpenAIRE
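
The description above centers on recursively-compositional overriders that can be stacked, e.g. pruning plus low-bit quantization, to compress a layer's weights. The following is a minimal sketch of that idea under stated assumptions: the class names (`Overrider`, `PruneOverrider`, `QuantizeOverrider`, `ComposedOverrider`) and the `apply` interface are illustrative inventions, not Mayo's actual API, and the transforms shown are generic magnitude pruning and uniform quantization rather than the paper's specific methods.

```python
# Illustrative sketch only; does not reproduce Mayo's real interface or algorithms.
import numpy as np


class Overrider:
    """Transforms a tensor (e.g. a layer's weights) before it is used."""

    def apply(self, tensor):
        raise NotImplementedError


class PruneOverrider(Overrider):
    """Zeroes out weights whose magnitude falls below a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def apply(self, tensor):
        mask = np.abs(tensor) >= self.threshold
        return tensor * mask


class QuantizeOverrider(Overrider):
    """Uniformly quantizes values onto a fixed-bit-width grid."""

    def __init__(self, bits):
        self.levels = 2 ** bits

    def apply(self, tensor):
        scale = float(np.max(np.abs(tensor))) or 1.0
        step = 2 * scale / (self.levels - 1)
        return np.round(tensor / step) * step


class ComposedOverrider(Overrider):
    """Chains overriders so their effects overlap (e.g. sparse + low-bit)."""

    def __init__(self, *overriders):
        self.overriders = overriders

    def apply(self, tensor):
        for overrider in self.overriders:
            tensor = overrider.apply(tensor)
        return tensor


# Example: a sparse, 6-bit representation of a random weight matrix.
weights = np.random.randn(64, 64).astype(np.float32)
compressed = ComposedOverrider(PruneOverrider(0.5), QuantizeOverrider(6)).apply(weights)
```

Because a composed overrider is itself an overrider, compositions nest arbitrarily, which is the property that lets multiple compression techniques overlap on the same component; the thresholds and bit widths would be the hyperparameters that Mayo's search automates rather than values chosen by hand as here.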