TY - GEN
T1 - Dynamic Split Computing for Efficient Deep Edge Intelligence
AU - Bakhtiarnia, Arian
AU - Milošević, Nemanja
AU - Zhang, Qi
AU - Bajović, Dragana
AU - Iosifidis, Alexandros
PY - 2023
N2 - Deploying deep neural networks (DNNs) on IoT and mobile devices is challenging due to their limited computational resources. Demanding tasks are therefore often offloaded entirely to edge servers, which can accelerate inference; however, this incurs communication costs and raises privacy concerns. In addition, this approach leaves the computational capacity of end devices unused. Split computing is a paradigm in which a DNN is split into two sections: the first section is executed on the end device, and its output is transmitted to the edge server, where the final section is executed. Here, we introduce dynamic split computing, where the optimal split location is selected dynamically based on the state of the communication channel. By using natural bottlenecks that already exist in modern DNN architectures, dynamic split computing avoids retraining and hyperparameter optimization, and has no negative impact on the final accuracy of DNNs. Through extensive experiments, we show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
UR - http://www.scopus.com/inward/record.url?scp=85167645970&partnerID=8YFLogxK
DO - 10.1109/ICASSP49357.2023.10096914
M3 - Article in proceedings
T3 - IEEE International Conference on Acoustics, Speech and Signal Processing. Proceedings
BT - ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
PB - IEEE
ER -