Alexandria: A Proof-of-concept Implementation and Evaluation of Generalised Data Deduplication

Lars Nielsen, Rasmus Vestergaard, Niloofar Yazdani, Siva Rama Krishna Prasad Talasila, Daniel Enrique Lucani Rötter, Marton Sipos

    Publication: Contribution to book/anthology/report/proceeding › Conference contribution in proceedings › Research › peer-reviewed



    The amount of data generated worldwide is expected to grow from 33 to 175 ZB by 2025, driven in part by the growth of the Internet of Things (IoT) and cyber-physical systems (CPS). To cope with this enormous amount of data, new cloud storage techniques must be developed. Generalised Data Deduplication (GDD) is a new paradigm for reducing the cost of storage by systematically identifying near-identical data chunks, storing their common component once, and keeping, for each chunk, a compact representation of its deviation from that common component. This paper presents a system architecture for GDD and a proof-of-concept implementation. We evaluated the compression gain of Generalised Data Deduplication using three data sets of varying size and content and compared it to the performance of the EXT4 and ZFS file systems, where the latter employs classic deduplication. We show that Generalised Data Deduplication provides up to 16.75% compression gain compared to both EXT4 and ZFS on data sets with less than 5 GB of data.
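The idea of storing a shared component once plus a small per-chunk deviation can be illustrated with a minimal Python sketch. It is not the Alexandria implementation: the chunk size, the bit-masking transformation, and the names `split_chunk` and `GddStore` are all assumptions chosen for illustration, and the abstract does not say which transformation the system uses.

```python
import hashlib

CHUNK_SIZE = 8                   # bytes per chunk (illustrative assumption)
LOW_BITS = 2                     # low-order bits per byte kept as the deviation (assumption)
MASK = (1 << LOW_BITS) - 1       # 0b11 for LOW_BITS = 2


def split_chunk(chunk):
    """Split a chunk into (base, deviation).

    Hypothetical transformation: the base is the chunk with the low-order
    bits of every byte cleared, and the deviation is those cleared bits.
    """
    base = bytes(b & ~MASK & 0xFF for b in chunk)
    deviation = bytes(b & MASK for b in chunk)
    return base, deviation


class GddStore:
    """Toy store: each distinct base is kept once, each chunk keeps its deviation."""

    def __init__(self):
        self.bases = {}    # fingerprint -> base bytes, stored once
        self.recipe = []   # one (fingerprint, deviation) entry per chunk, in order

    def put(self, data):
        for i in range(0, len(data), CHUNK_SIZE):
            base, deviation = split_chunk(data[i:i + CHUNK_SIZE])
            fingerprint = hashlib.sha256(base).digest()
            self.bases.setdefault(fingerprint, base)   # deduplicate the base
            # A real system would pack the deviation bits tightly;
            # here they stay one byte each for readability.
            self.recipe.append((fingerprint, deviation))

    def get(self):
        out = bytearray()
        for fingerprint, deviation in self.recipe:
            base = self.bases[fingerprint]
            out.extend(b | d for b, d in zip(base, deviation))
        return bytes(out)


# Two chunks that differ only in their low-order bits.
data = bytes([16, 17, 18, 19, 20, 21, 22, 23]) + bytes([17, 16, 19, 18, 21, 20, 23, 22])
store = GddStore()
store.put(data)
```

In this example the two chunks are not byte-identical, so classic deduplication would store both in full, whereas the sketch stores a single base and two deviations and still reconstructs the input exactly with `store.get()`.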
    Title: 2019 IEEE Globecom Workshops, GC Wkshps 2019 - Proceedings
    ISBN (Electronic): 978-1-7281-0960-2
    Status: Published - 2019
    Event: 2019 IEEE Globecom Workshops -
    Duration: 9 Dec 2019 to 13 Dec 2019


    Conference: 2019 IEEE Globecom Workshops


    • Scale-IoT

      Lucani Rötter, D. E.


      Projects: Project › Research

    • Starting Grant

      Lucani Rötter, D. E.

      Starting Grant


      Projects: Project › Research