TY - JOUR
T1 - ESTMST-ST
T2 - An End-to-End Soft Threshold and Multiloss Self-Distillation Based Swin Transformer for Underwater Acoustic Signal Recognition
AU - Wu, Fan
AU - Yao, Haiyang
AU - Zhao, Zhongda
AU - Zhao, Xiaobo
AU - Zang, Yuzhang
AU - Wang, Haiyan
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Underwater acoustic signal recognition (UASR) is important for protecting marine life and the ecological environment. However, 2-D fixed-parameter inputs are inadequate for adapting to the variable underwater acoustic environment, and learnable-parameter inputs with inductive bias priors in transformers lead to difficulties in model convergence. Additionally, the differing optimization objectives between noise reduction and recognition methods can cause signal distortion, hindering recognition accuracy. To address these issues, this article proposes an innovative end-to-end soft threshold Swin Transformer model based on a multiloss self-distillation training strategy (ESTMST-ST) for robust recognition of weak underwater acoustic targets. Building on the previously proposed time-frequency Swin Transformer (TFST), we design a learnable dual filter module (LDFM) that decomposes underwater acoustic signals in the frequency direction, with parameters obtained through model training. To improve the model's antinoise performance, we incorporate a soft threshold strategy within TFST to reduce nonstationary interference in underwater acoustic signals. For enhanced robustness and training efficiency, we introduce a self-distillation training strategy with four specific loss functions in selected stages of TFST. Using two publicly available datasets, ShipsEar and DeepShip, we conduct three experiments: fixed signal-to-noise ratio (SNR) UASR, multi-SNR UASR, and model generalization ability tests. The experimental results demonstrate that ESTMST-ST achieves superior performance (at least a 1.6 improvement in F scores and a 2.2 improvement in kappa coefficients) compared to five state-of-the-art methods across two open-source datasets.
AB - Underwater acoustic signal recognition (UASR) is important for protecting marine life and the ecological environment. However, 2-D fixed-parameter inputs are inadequate for adapting to the variable underwater acoustic environment, and learnable-parameter inputs with inductive bias priors in transformers lead to difficulties in model convergence. Additionally, the differing optimization objectives between noise reduction and recognition methods can cause signal distortion, hindering recognition accuracy. To address these issues, this article proposes an innovative end-to-end soft threshold Swin Transformer model based on a multiloss self-distillation training strategy (ESTMST-ST) for robust recognition of weak underwater acoustic targets. Building on the previously proposed time-frequency Swin Transformer (TFST), we design a learnable dual filter module (LDFM) that decomposes underwater acoustic signals in the frequency direction, with parameters obtained through model training. To improve the model's antinoise performance, we incorporate a soft threshold strategy within TFST to reduce nonstationary interference in underwater acoustic signals. For enhanced robustness and training efficiency, we introduce a self-distillation training strategy with four specific loss functions in selected stages of TFST. Using two publicly available datasets, ShipsEar and DeepShip, we conduct three experiments: fixed signal-to-noise ratio (SNR) UASR, multi-SNR UASR, and model generalization ability tests. The experimental results demonstrate that ESTMST-ST achieves superior performance (at least a 1.6 improvement in F scores and a 2.2 improvement in kappa coefficients) compared to five state-of-the-art methods across two open-source datasets.
KW - Learnable dual filter
KW - self-knowledge distillation (SKD)
KW - soft threshold
KW - Swin Transformer
UR - http://www.scopus.com/inward/record.url?scp=85212845347&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2024.3520860
DO - 10.1109/TGRS.2024.3520860
M3 - Journal article
AN - SCOPUS:85212845347
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 4200813
ER -