Cooperative Convolutional Neural Network Deployment over Mobile Networks
Authors: | Chung-Kai Yang, Jang-Ping Sheu, Wen-Tsuen Chen, Jian-Jhih Kuo, Chia-Chun Hsu |
---|---|
Year of publication: | 2020 |
Subject: | Optimization problem; Computer science; Distributed computing; 05 social sciences; Inference; 050801 communication & media studies; Thread (computing); Load balancing (computing); Partition (database); Convolutional neural network; 0508 media and communications; Server; 0502 economics and business; 050211 marketing; Enhanced Data Rates for GSM Evolution |
Source: | ICC |
Description: | Inference acceleration has drawn much attention as a way to meet the real-time requirements of artificial intelligence (AI) applications. To this end, model partitioning for Deep Neural Networks (DNNs) has been proposed to exploit parallel and distributed computing units. However, previous works focus on load balancing among servers and may overlook the interplay between computing and communication. This makes existing approaches less efficient, especially in mobile edge networks, where smart devices, which usually have limited computing capacity, must offload tasks to nearby servers over limited bandwidth. In this paper, we therefore design a new system and formulate a new optimization problem, CONVENE, to minimize the inference completion time for smart devices with one or more antennas. To explore its intrinsic properties, we first study CONVENE with a Single Antenna and derive an algorithm, termed THREAD-SA, to find the optimum solution. Then, an extension, THREAD, is proposed to carefully exploit multiple antennas to further reduce the completion time. Simulation results show that our algorithm outperforms others by 100%. |
Database: | OpenAIRE |