Comparing different CT, PET and MRI multi-modality image combinations for deep learning-based head and neck tumor segmentation

Jintao Ren, Jesper Grau Eriksen, Jasper Nijkamp, Stine Sofia Korreman*

*Corresponding author for this work

Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer review

Abstract

Background: Manual delineation of gross tumor volume (GTV) is essential for radiotherapy treatment planning, but it is time-consuming and suffers from inter-observer variability (IOV). In the clinic, CT, PET, and MRI are used together to inform delineation because of their complementary characteristics. This study aimed to investigate deep learning to assist GTV delineation in head and neck squamous cell carcinoma (HNSCC) by comparing various modality combinations.

Materials and methods: This retrospective study included 153 patients with HNSCC at multiple sites, each with planning CT, PET, and MRI (T1-weighted and T2-weighted). Clinical delineations of the gross tumor volume (GTV-T) and involved lymph nodes (GTV-N) were collected as the ground truth. The dataset was randomly divided into 92 patients for training, 31 for validation, and 30 for testing. We applied a residual 3D UNet as the deep learning architecture. We independently trained the UNet with four different modality combinations (CT-PET-MRI, CT-MRI, CT-PET, and PET-MRI). Additionally, analogous to a post-processing step, an average fusion of the three bi-modality combinations (CT-PET, CT-MRI, and PET-MRI) was produced as an ensemble. Segmentation accuracy was evaluated on the test set using the Dice similarity coefficient (Dice), the 95th percentile Hausdorff distance (HD95), and the mean surface distance (MSD).

Results: All imaging combinations that included PET achieved similar average scores, in the ranges Dice: 0.72-0.74, HD95: 8.8-9.5 mm, MSD: 2.6-2.8 mm. Only CT-MRI scored lower, with Dice: 0.58, HD95: 12.9 mm, MSD: 3.7 mm. The average of the three bi-modality combinations reached Dice: 0.74, HD95: 7.9 mm, MSD: 2.4 mm.

Conclusion: Multimodal deep learning-based auto-segmentation of the HNSCC GTV was demonstrated, and inclusion of the PET image was shown to be crucial. Training on combined MRI, PET, and CT data provided limited improvements over CT-PET and PET-MRI. However, combining the three bimodally trained networks into an ensemble showed promising improvements.
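The average-fusion ensemble and the Dice score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the averaging was applied to probability maps or how the result was binarized, so averaging per-voxel foreground probabilities and thresholding at 0.5 are assumptions here. The function names `average_fusion` and `dice` and the toy arrays are hypothetical, not the authors' code or data.

```python
import numpy as np

def average_fusion(prob_maps, threshold=0.5):
    """Average per-voxel foreground probabilities from several models,
    then binarize. The 0.5 threshold is an assumption."""
    stacked = np.stack(prob_maps, axis=0)
    return stacked.mean(axis=0) >= threshold

def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy 1x2x2 "volumes" standing in for the outputs of three bi-modality models.
ct_pet = np.array([[[0.9, 0.2], [0.6, 0.1]]])
ct_mri = np.array([[[0.8, 0.7], [0.3, 0.2]]])
pet_mri = np.array([[[0.7, 0.3], [0.7, 0.0]]])
truth = np.array([[[1, 0], [1, 1]]], dtype=bool)

fused = average_fusion([ct_pet, ct_mri, pet_mri])
score = dice(fused, truth)
print(fused.tolist(), round(score, 3))
```

In this toy case the CT-MRI stand-in alone would mark one voxel as foreground that the other two models reject, and the averaging suppresses it. That is the general motivation for fusing models, not a reproduction of the reported results.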

Original language: English
Journal: Acta Oncologica
Volume: 60
Issue number: 11
Pages (from-to): 1399-1406
Number of pages: 8
ISSN: 0284-186X
DOI
Status: Published - Nov 2021
