TY - JOUR
T1 - A study on optimal input images for rice yield prediction models using CNN with UAV imagery and its reasoning using explainable AI
AU - Yamaguchi, Tomoaki
AU - Takamura, Taiga
AU - Tanaka, Takashi
AU - Ookawa, Taiichiro
AU - Katsura, Keisuke
PY - 2025/3
Y1 - 2025/3
N2 - Rice is the world's most consumed staple food crop and there is a need to increase its yield in terms of food security. Understanding rice yields is important for farmers and national decision-making, and is critical for increasing yields. Remote sensing and machine learning have improved the accuracy and efficiency of yield monitoring. In particular, the combination of unmanned aerial vehicles (UAV) and convolutional neural networks (CNN), which is a type of deep learning, has been studied in recent years owing to its flexibility in data acquisition and high accuracy. Rice yield predictions using UAV and CNN have been reported to build more robust models after the mid-ripening stage. However, optimal input image conditions, such as the growth stage of image acquisition, spectral bands, and image cut-out areas, have not been studied, and there is room for improvement in this respect. In addition, recent efforts to find clues to improve the reliability and accuracy of advanced machine learning models have focused on explainable artificial intelligence (XAI), which attempts to reveal the basis of model inferences. However, there are almost no examples of using XAI for regression tasks with CNN in the research field of agricultural sciences. Therefore, in this study, the optimal input image conditions were investigated for the prediction of rice yield using a CNN based on UAV aerial images collected after the mid-ripening stage. An attempt was made to provide a rationale for the results by visualizing the region of interest in the CNN model. First, using red edge spectral bands at the maturity stage was more effective than at the mid-ripening stage. In addition, higher accuracy was achieved by allowing feature extraction from a slightly wider area than the actual harvested area, especially at the maturity stage. Furthermore, visualization of the region of interest showed that yield prediction was more focused on panicles at the maturity stage. This provided a relevant rationale for optimal input image conditions. In summary, this study identified the optimal input image conditions that enabled yield prediction with higher accuracy. Additionally, using XAI, which visualizes the region of interest, increases the trustworthiness of the model outputs. The results of this study will improve the accuracy and reliability of yield prediction models.
AB - Rice is the world's most consumed staple food crop and there is a need to increase its yield in terms of food security. Understanding rice yields is important for farmers and national decision-making, and is critical for increasing yields. Remote sensing and machine learning have improved the accuracy and efficiency of yield monitoring. In particular, the combination of unmanned aerial vehicles (UAV) and convolutional neural networks (CNN), which is a type of deep learning, has been studied in recent years owing to its flexibility in data acquisition and high accuracy. Rice yield predictions using UAV and CNN have been reported to build more robust models after the mid-ripening stage. However, optimal input image conditions, such as the growth stage of image acquisition, spectral bands, and image cut-out areas, have not been studied, and there is room for improvement in this respect. In addition, recent efforts to find clues to improve the reliability and accuracy of advanced machine learning models have focused on explainable artificial intelligence (XAI), which attempts to reveal the basis of model inferences. However, there are almost no examples of using XAI for regression tasks with CNN in the research field of agricultural sciences. Therefore, in this study, the optimal input image conditions were investigated for the prediction of rice yield using a CNN based on UAV aerial images collected after the mid-ripening stage. An attempt was made to provide a rationale for the results by visualizing the region of interest in the CNN model. First, using red edge spectral bands at the maturity stage was more effective than at the mid-ripening stage. In addition, higher accuracy was achieved by allowing feature extraction from a slightly wider area than the actual harvested area, especially at the maturity stage. Furthermore, visualization of the region of interest showed that yield prediction was more focused on panicles at the maturity stage. This provided a relevant rationale for optimal input image conditions. In summary, this study identified the optimal input image conditions that enabled yield prediction with higher accuracy. Additionally, using XAI, which visualizes the region of interest, increases the trustworthiness of the model outputs. The results of this study will improve the accuracy and reliability of yield prediction models.
KW - Convolutional neural network
KW - Explainable AI
KW - Growth estimation
KW - Image analysis
KW - Rice
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85215987847&partnerID=8YFLogxK
U2 - 10.1016/j.eja.2025.127512
DO - 10.1016/j.eja.2025.127512
M3 - Journal article
SN - 1161-0301
VL - 164
JO - European Journal of Agronomy
JF - European Journal of Agronomy
M1 - 127512
ER -