TY - JOUR
T1 - Dynamic Semantic Compression for CNN Inference in Multi-Access Edge Computing
T2 - A Graph Reinforcement Learning-Based Autoencoder
AU - Li, Nan
AU - Iosifidis, Alexandros
AU - Zhang, Qi
N1 - Publisher Copyright:
© 2002-2012 IEEE. All rights reserved.
PY - 2025/3
Y1 - 2025/3
N2 - This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and edge servers’ available capacity, we propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading. In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs to compress the intermediate data by selecting the most informative features. To further reduce communication overhead, we leverage entropy encoding to remove the statistical redundancy in the compressed data. In the semantic decoder, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, thereby improving accuracy. To effectively trade off communication, computation, and inference accuracy, we design a reward function and formulate the offloading of CNN inference as a maximization problem, with the goal of maximizing the long-term average inference accuracy and throughput. To solve this maximization problem, we propose a graph reinforcement learning-based AECNN (GRL-AECNN) method, which outperforms the existing methods DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC under different dynamic scenarios. This highlights the advantages of GRL-AECNN in offloading decision-making for CNN inference tasks in dynamic MEC.
AB - This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and edge servers’ available capacity, we propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading. In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs to compress the intermediate data by selecting the most informative features. To further reduce communication overhead, we leverage entropy encoding to remove the statistical redundancy in the compressed data. In the semantic decoder, we design a lightweight decoder that reconstructs the intermediate data by learning from the received compressed data, thereby improving accuracy. To effectively trade off communication, computation, and inference accuracy, we design a reward function and formulate the offloading of CNN inference as a maximization problem, with the goal of maximizing the long-term average inference accuracy and throughput. To solve this maximization problem, we propose a graph reinforcement learning-based AECNN (GRL-AECNN) method, which outperforms the existing methods DROO-AECNN, GRL-BottleNet++, and GRL-DeepJSCC under different dynamic scenarios. This highlights the advantages of GRL-AECNN in offloading decision-making for CNN inference tasks in dynamic MEC.
KW - CNN inference
KW - edge computing
KW - feature compression
KW - graph reinforcement learning
KW - semantic communication
KW - service reliability
UR - http://www.scopus.com/inward/record.url?scp=86000727169&partnerID=8YFLogxK
U2 - 10.1109/TWC.2024.3518399
DO - 10.1109/TWC.2024.3518399
M3 - Journal article
AN - SCOPUS:86000727169
SN - 1536-1276
VL - 24
SP - 2157
EP - 2172
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 3
ER -