TY - JOUR
T1 - Comparing different CT, PET and MRI multi-modality image combinations for deep learning-based head and neck tumor segmentation
AU - Ren, Jintao
AU - Eriksen, Jesper Grau
AU - Nijkamp, Jasper
AU - Korreman, Stine Sofia
PY - 2021/11
Y1 - 2021/11
N2 - Background: Manual delineation of gross tumor volume (GTV) is essential for radiotherapy treatment planning, but it is time-consuming and suffers from inter-observer variability (IOV). In the clinic, CT, PET, and MRI are used to inform delineation due to their complementary characteristics. This study aimed to investigate deep learning to assist GTV delineation in head and neck squamous cell carcinoma (HNSCC) by comparing various modality combinations. Materials and methods: This retrospective study included 153 patients with HNSCC at multiple sites, each with planning CT, PET, and MRI (T1-weighted and T2-weighted). Clinical delineations of the gross tumor volume (GTV-T) and involved lymph nodes (GTV-N) were collected as the ground truth. The dataset was randomly divided into 92 patients for training, 31 for validation, and 30 for testing. We applied a residual 3D UNet as the deep learning architecture and independently trained it with four different modality combinations (CT-PET-MRI, CT-MRI, CT-PET, and PET-MRI). Additionally, analogous to post-processing, an average fusion of the three bi-modality combinations (CT-PET, CT-MRI, and PET-MRI) was produced as an ensemble. Segmentation accuracy was evaluated on the test set using the Dice similarity coefficient (Dice), 95th percentile Hausdorff Distance (HD95), and Mean Surface Distance (MSD). Results: All imaging combinations that included PET provided similar average scores, with Dice: 0.72-0.74, HD95: 8.8-9.5 mm, and MSD: 2.6-2.8 mm. Only CT-MRI scored lower, with Dice: 0.58, HD95: 12.9 mm, and MSD: 3.7 mm. The average of the three bi-modality combinations reached Dice: 0.74, HD95: 7.9 mm, and MSD: 2.4 mm. Conclusion: Multimodal deep learning-based auto-segmentation of HNSCC GTV was demonstrated, and inclusion of the PET image was shown to be crucial. Training on combined MRI, PET, and CT data provided limited improvement over CT-PET and PET-MRI. However, combining the three bi-modality-trained networks into an ensemble yielded promising improvements.
AB - Background: Manual delineation of gross tumor volume (GTV) is essential for radiotherapy treatment planning, but it is time-consuming and suffers from inter-observer variability (IOV). In the clinic, CT, PET, and MRI are used to inform delineation due to their complementary characteristics. This study aimed to investigate deep learning to assist GTV delineation in head and neck squamous cell carcinoma (HNSCC) by comparing various modality combinations. Materials and methods: This retrospective study included 153 patients with HNSCC at multiple sites, each with planning CT, PET, and MRI (T1-weighted and T2-weighted). Clinical delineations of the gross tumor volume (GTV-T) and involved lymph nodes (GTV-N) were collected as the ground truth. The dataset was randomly divided into 92 patients for training, 31 for validation, and 30 for testing. We applied a residual 3D UNet as the deep learning architecture and independently trained it with four different modality combinations (CT-PET-MRI, CT-MRI, CT-PET, and PET-MRI). Additionally, analogous to post-processing, an average fusion of the three bi-modality combinations (CT-PET, CT-MRI, and PET-MRI) was produced as an ensemble. Segmentation accuracy was evaluated on the test set using the Dice similarity coefficient (Dice), 95th percentile Hausdorff Distance (HD95), and Mean Surface Distance (MSD). Results: All imaging combinations that included PET provided similar average scores, with Dice: 0.72-0.74, HD95: 8.8-9.5 mm, and MSD: 2.6-2.8 mm. Only CT-MRI scored lower, with Dice: 0.58, HD95: 12.9 mm, and MSD: 3.7 mm. The average of the three bi-modality combinations reached Dice: 0.74, HD95: 7.9 mm, and MSD: 2.4 mm. Conclusion: Multimodal deep learning-based auto-segmentation of HNSCC GTV was demonstrated, and inclusion of the PET image was shown to be crucial. Training on combined MRI, PET, and CT data provided limited improvement over CT-PET and PET-MRI. However, combining the three bi-modality-trained networks into an ensemble yielded promising improvements.
KW - auto-segmentation
KW - CNN
KW - Deep learning
KW - GTV
KW - head and neck cancer
KW - UNet
UR - http://www.scopus.com/inward/record.url?scp=85110860773&partnerID=8YFLogxK
U2 - 10.1080/0284186X.2021.1949034
DO - 10.1080/0284186X.2021.1949034
M3 - Journal article
C2 - 34264157
AN - SCOPUS:85110860773
SN - 0284-186X
VL - 60
SP - 1399
EP - 1406
JO - Acta Oncologica
JF - Acta Oncologica
IS - 11
ER -