cleanTS: Automated (AutoML) tool to clean univariate time series at microscales

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper; Journal article; Research; peer-review



  • Mayur Kishor Shende, Defence Institute of Advanced Technology (DIAT)
  • Andres E. Feijoo-Lorenzo, University of Vigo
  • Neeraj Dhanraj Bokde

Data cleaning is one of the most important tasks in data analysis processes. One of the perennial challenges in data analytics is the detection and handling of non-valid data. Failing to detect and handle such data can result in imbalanced observations that cause bias and influence estimates and, in extreme cases, can even lead to inaccurate analytics and unreliable decisions. Usually, the process of data cleaning is time-consuming due to the growing volume, velocity, and variety of data. Further, the complexity and difficulty of the cleaning process increase with the amount of data to be analyzed. It is rarely the case that any real-world data is clean and error-free. Thus, pre-processing the data before using it for analysis has become standard practice. This paper is intended to provide an easy-to-use and reliable system which automates the cleaning process for univariate time series data. Automating the process also reduces the time required for cleaning. Another issue that the proposed system aims to solve is making the visualization of a large amount of data more effective. To tackle these issues, an R package, cleanTS, is proposed. The proposed system provides a way to analyze data on different scales and resolutions. It also provides users with tools and a benchmark system for comparing various techniques used in data cleaning.

© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Original language: English
Pages (from-to): 155-176
Number of pages: 22
Publication status: Published - Aug 2022

    Research areas

  • Time series analysis, Time series cleaning, Data cleaning, AutoML, Machine learning, MISSING VALUE IMPUTATION, BUSINESS INTELLIGENCE, BIG DATA, R PACKAGE, ANALYTICS, STATE

ID: 272935696