Publikation: Bidrag til bog/antologi/rapport/proceeding › Konferencebidrag i proceedings › Forskning › peer review
Publikation: Bidrag til bog/antologi/rapport/proceeding › Konferencebidrag i proceedings › Forskning › peer review
}
TY - GEN
T1 - Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing
AU - Li, Nan
AU - Iosifidis, Alexandros
AU - Zhang, Qi
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing network. To avoid inference accuracy loss in inference task partitioning, we propose receptive field-based segmentation (RFS). To reduce the computation time and communication overhead, we propose a novel collaborative edge computing using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers. In this scheme, the collaborative edge servers (ESs) only need to exchange small fraction of the sub-outputs after computing each fused block. In addition, to find the optimal solution of partitioning a CNN model into multiple blocks, we use dynamic programming, named as dynamic programming for fused-layer parallelization (DPFP). The experimental results show that DPFP can accelerate inference of VGG-16 up to 73% compared with the pre-trained model, which outperforms the existing work MoDNN in all tested scenarios. Moreover, we evaluate the service reliability of DPFP under time-variant channel, which shows that DPFP is an effective solution to ensure high service reliability with strict service deadline.
AB - This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing network. To avoid inference accuracy loss in inference task partitioning, we propose receptive field-based segmentation (RFS). To reduce the computation time and communication overhead, we propose a novel collaborative edge computing using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers. In this scheme, the collaborative edge servers (ESs) only need to exchange small fraction of the sub-outputs after computing each fused block. In addition, to find the optimal solution of partitioning a CNN model into multiple blocks, we use dynamic programming, named as dynamic programming for fused-layer parallelization (DPFP). The experimental results show that DPFP can accelerate inference of VGG-16 up to 73% compared with the pre-trained model, which outperforms the existing work MoDNN in all tested scenarios. Moreover, we evaluate the service reliability of DPFP under time-variant channel, which shows that DPFP is an effective solution to ensure high service reliability with strict service deadline.
KW - Distributed CNNs
KW - Edge computing
KW - Receptive field
KW - Service reliability
KW - Time-critical IoT
U2 - 10.1109/ICC45855.2022.9838547
DO - 10.1109/ICC45855.2022.9838547
M3 - Article in proceedings
SN - 978-1-5386-8348-4
SP - 4281
EP - 4286
BT - ICC 2022 - IEEE International Conference on Communications
PB - IEEE
T2 - IEEE International Conference on Communications
Y2 - 16 May 2022 through 20 May 2022
ER -