Distributed Deep Learning Inference Acceleration using Seamless Collaboration in Edge Computing

Nan Li*, Alexandros Iosifidis, Qi Zhang

*Corresponding author for this work

Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

Abstract

This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing. To preserve inference accuracy when partitioning an inference task, we take the receptive field into account when performing segment-based partitioning. To maximize the parallelization between the communication and computing processes, thereby minimizing the total inference time of an inference task, we design a novel task collaboration scheme, named HALP, in which the overlapping zone of the sub-tasks on secondary edge servers (ESs) is executed on the host ES. We further extend HALP to the scenario of multiple tasks. Experimental results show that HALP can accelerate CNN inference in VGG-16 by 1.7-2.0x for a single task and 1.7-1.8x for 4 tasks per batch on GTX 1080 Ti and Jetson AGX Xavier, which outperforms the state-of-the-art work MoDNN. Moreover, we evaluate the service reliability under a time-variant channel, which shows that HALP is an effective solution to ensure high service reliability under a strict service deadline.
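
The receptive-field consideration mentioned in the abstract can be illustrated with a small sketch. This is not the HALP scheme itself, only a generic back-propagation of row ranges through a stack of layers under stated assumptions: layers are described by a hypothetical `(kernel_size, stride, padding)` tuple along the height axis, and the function name `input_rows_for_segment` and the toy layer stack are invented for illustration.

```python
from typing import List, Tuple

# Assumption: each layer is (kernel_size, stride, padding) along the height axis.
Layer = Tuple[int, int, int]


def input_rows_for_segment(layers: List[Layer], out_start: int, out_end: int,
                           input_height: int) -> Tuple[int, int]:
    """Inclusive input-row range needed to compute output rows
    [out_start, out_end] of the last layer."""
    # Height of the input to every layer, computed front to back.
    heights = [input_height]
    for k, s, p in layers:
        heights.append((heights[-1] + 2 * p - k) // s + 1)

    lo, hi = out_start, out_end
    # Walk backwards, widening the range by each layer's receptive field
    # and clamping to the valid rows of that layer's input.
    for (k, s, p), h_in in zip(reversed(layers), reversed(heights[:-1])):
        lo = max(lo * s - p, 0)
        hi = min(hi * s - p + k - 1, h_in - 1)
    return lo, hi


# Hypothetical toy stack: two 3x3 convolutions (stride 1, padding 1)
# followed by a 2x2 pooling layer (stride 2, no padding).
TOY_LAYERS = [(3, 1, 1), (3, 1, 1), (2, 2, 0)]
top = input_rows_for_segment(TOY_LAYERS, 0, 1, 8)     # upper half of the output
bottom = input_rows_for_segment(TOY_LAYERS, 2, 3, 8)  # lower half of the output
```

With an 8-row input, the two halves of the 4-row output need input rows 0-5 and 2-7 respectively, so rows 2-5 are required by both segments. A shared region of this kind is presumably what the abstract refers to as the overlapping zone between sub-tasks.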

Original language: English
Title of host publication: ICC 2022 - IEEE International Conference on Communications
Number of pages: 6
Publisher: IEEE
Publication date: May 2022
Pages: 3667-3672
ISBN (Electronic): 978-1-5386-8347-7
Publication status: Published - May 2022
Event: IEEE International Conference on Communications - Coex, Seoul, Korea, Republic of
Duration: 16 May 2022 to 20 May 2022
https://icc2022.ieee-icc.org/

Conference

Conference: IEEE International Conference on Communications
Location: Coex
Country/Territory: Korea, Republic of
City: Seoul
Period: 16/05/2022 to 20/05/2022

Keywords

  • Delay constraint
  • Distributed CNNs
  • Edge computing
  • Inference acceleration
  • Receptive-field
  • Service reliability
