Generalized Deduplication: Lossless Compression for Large Amounts of Small IoT Data

    Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review



    We show that a generalization of deduplication can enable compressed storage of sensor data. The method uses error-correcting codes in a non-traditional manner to identify similar elements, and then exploits this similarity for compression. Using Reed-Solomon codes, our method has the theoretical potential to reduce the cost of storing 16-byte chunks by a factor of up to 5, and the cost of storing 255-byte chunks by a factor of up to 65. We define a simple model for sensor data and show that our approach can compress data from the model, realizing its compression potential with much smaller data sets than classic deduplication requires. This demonstrates that generalized deduplication can be a viable solution for practical lossless compression of small IoT data in scenarios where classic deduplication is ineffective.
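    The idea of using an error-correcting code to group similar chunks can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Hamming(7,4) code as a simpler stand-in for the Reed-Solomon codes named above, operates on 7-bit chunks stored as integers, and all function names are hypothetical. Each chunk is split into a basis (the nearest codeword) and a deviation (the syndrome, which identifies the flipped bit). Only distinct bases are stored, so chunks that differ in a single bit share one basis.

```python
# Sketch only: Hamming(7,4) replaces Reed-Solomon, and all names are hypothetical.

def syndrome(chunk: int) -> int:
    """XOR of the 1-based positions of the set bits in a 7-bit word.

    With the parity-check columns equal to the position numbers 1..7,
    a codeword has syndrome 0 and a single flipped bit at position p
    gives syndrome p.
    """
    s = 0
    for pos in range(1, 8):
        if (chunk >> (pos - 1)) & 1:
            s ^= pos
    return s


def split(chunk: int) -> tuple[int, int]:
    """Map a chunk to (basis, deviation): nearest codeword and syndrome."""
    dev = syndrome(chunk)
    basis = chunk ^ (1 << (dev - 1)) if dev else chunk
    return basis, dev


def merge(basis: int, dev: int) -> int:
    """Invert split(): re-apply the single-bit deviation to the basis."""
    return basis ^ (1 << (dev - 1)) if dev else basis


def compress(chunks: list[int]) -> tuple[list[int], list[tuple[int, int]]]:
    """Deduplicate bases; keep one (basis id, deviation) record per chunk."""
    basis_ids: dict[int, int] = {}
    records = []
    for chunk in chunks:
        basis, dev = split(chunk)
        # Reuse the id of a known basis, or assign the next free id.
        bid = basis_ids.setdefault(basis, len(basis_ids))
        records.append((bid, dev))
    # Dicts preserve insertion order, so list index equals basis id.
    return list(basis_ids), records


def decompress(bases: list[int], records: list[tuple[int, int]]) -> list[int]:
    """Rebuild the original chunks exactly (lossless)."""
    return [merge(bases[bid], dev) for bid, dev in records]


if __name__ == "__main__":
    chunks = [0b0000000, 0b0000001, 0b1000000, 0b0000111, 0b0001111]
    bases, records = compress(chunks)
    print(len(chunks), "chunks ->", len(bases), "stored bases")
    assert decompress(bases, records) == chunks
```

    In this toy setting, five distinct chunks collapse to two stored bases, whereas classic deduplication would store all five because no two chunks are identical.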
    Original language: English
    Title of host publication: European Wireless Conference
    Number of pages: 5
    Publisher: VDE Verlag GmbH
    Publication date: 2019
    ISBN (Print): 978-3-8007-4948-5
    Publication status: Published - 2019
    Event: European Wireless 2019 - 25th European Wireless Conference - Aarhus, Denmark
    Duration: 2 May 2019 to 5 May 2019



