TY - JOUR
T1 - Artificial intelligence uncertainty quantification in radiotherapy applications − A scoping review
AU - Wahid, Kareem A.
AU - Kaffey, Zaphanlene Y.
AU - Farris, David P.
AU - Humbert-Vidan, Laia
AU - Moreno, Amy C.
AU - Rasmussen, Mathis
AU - Ren, Jintao
AU - Naser, Mohamed A.
AU - Netherton, Tucker J.
AU - Korreman, Stine
AU - Balakrishnan, Guha
AU - Fuller, Clifton D.
AU - Fuentes, David
AU - Dohopolski, Michael J.
N1 - Publisher Copyright:
© 2024 The Author(s)
PY - 2024/12
Y1 - 2024/12
N2 - Background/purpose: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. Methods: We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction was performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. Results: We identified 56 articles published from 2015 to 2024. 10 domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image-synthesis (13 %), and multiple applications simultaneously (11 %). 12 disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data was used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %) followed by ensembling (16 %). 55 % of studies did not share code or datasets. Conclusion: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
AB - Background/purpose: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions. Methods: We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction was performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics. Results: We identified 56 articles published from 2015 to 2024. 10 domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image-synthesis (13 %), and multiple applications simultaneously (11 %). 12 disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data was used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %) followed by ensembling (16 %). 55 % of studies did not share code or datasets. Conclusion: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
UR - http://www.scopus.com/inward/record.url?scp=85204457214&partnerID=8YFLogxK
U2 - 10.1016/j.radonc.2024.110542
DO - 10.1016/j.radonc.2024.110542
M3 - Review
C2 - 39299574
AN - SCOPUS:85204457214
SN - 0167-8140
VL - 201
JO - Radiotherapy and Oncology
JF - Radiotherapy and Oncology
M1 - 110542
ER -