CloudCast: A Satellite-Based Dataset and Baseline for Forecasting Clouds

Andreas Holm Nielsen*, Alexandros Iosifidis, Henrik Karstoft

*Corresponding author for this work

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review


Forecasting the formation and development of clouds is a central element of modern weather forecasting systems. Incorrect cloud forecasts can lead to major uncertainty in the overall accuracy of weather forecasts due to their intrinsic role in the Earth's climate system. Few studies have tackled this challenging problem from a machine learning point-of-view due to a shortage of high-resolution datasets with many historical observations globally. In this article, we present a novel satellite-based dataset called “CloudCast.” It consists of 70 080 images with 10 different cloud types for multiple layers of the atmosphere annotated on a pixel level. The spatial resolution of the dataset is 928 × 1530 pixels (3 × 3 km per pixel) with 15-min intervals between frames for the period January 1, 2017 to December 31, 2018. All frames are centered and projected over Europe. To supplement the dataset, we conduct an evaluation study with current state-of-the-art video prediction methods such as convolutional long short-term memory networks, generative adversarial networks, and optical flow-based extrapolation methods. As the evaluation of video prediction is difficult in practice, we aim for a thorough evaluation in the spatial and temporal domain. Our benchmark models show promising results but with ample room for improvement. This is the first publicly available global-scale dataset with high-resolution cloud types on a high temporal granularity to the authors’ best knowledge.
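The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes a gap-free series of frames every 15 minutes over the stated period, a (rows, columns) reading of 928 × 1530, and a uniform nominal pixel size of 3 km. All variable names are hypothetical.

```python
from datetime import datetime, timedelta

# Figures quoted in the abstract.
START = datetime(2017, 1, 1)
END = datetime(2019, 1, 1)  # exclusive bound: the day after December 31, 2018
FRAME_INTERVAL = timedelta(minutes=15)
HEIGHT_PX, WIDTH_PX = 928, 1530  # assumed (rows, columns) order
KM_PER_PIXEL = 3  # nominal size, assumed uniform across the frame

# Number of 15-minute frames in the period, assuming no missing frames.
expected_frames = (END - START) // FRAME_INTERVAL

# Approximate ground extent of one frame in kilometres.
extent_km = (HEIGHT_PX * KM_PER_PIXEL, WIDTH_PX * KM_PER_PIXEL)

print(expected_frames)  # 70080
print(extent_km)        # (2784, 4590)
```

Under these assumptions the frame count matches the 70 080 images stated in the abstract, and each frame spans roughly 2784 km by 4590 km.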
Original language: English
Article number: 9366908
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Pages (from-to): 3485-3494
Number of pages: 10
Publication status: Published - Mar 2021


  • Atmospheric forecasting
  • remote sensing datasets
  • spatiotemporal deep learning
