CRANBERRY: Memory-Effective Search in 100M High-Dimensional CLIP Vectors

Vladimir Mic*, Jan Sedmidubsky, Pavel Zezula

*Corresponding author for this work

Publication: Contribution to book/anthology/report/proceeding, Conference contribution in proceedings, Research, peer-reviewed

1 Citation (Scopus)

Abstract

Recent advances in cross-modal multimedia data analysis require efficient similarity search at the scale of hundreds of millions of high-dimensional vectors. We address this task by proposing the CRANBERRY algorithm, which combines and tunes several existing similarity search strategies. In particular, the algorithm: (1) employs Voronoi partitioning to obtain a query-relevant candidate set in constant time, (2) applies filtering techniques to prune the obtained candidates significantly, and (3) re-ranks the retained candidate vectors with respect to the query vector. Applied to a dataset of 100 million 768-dimensional vectors, the algorithm evaluates 10NN queries with 90% recall and an average query latency of 1.2 s, with a throughput of 15 queries per second on a server with a 56-core CPU and 4.7 queries per second on a PC.
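The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the squared Euclidean distance, the parameters `n_cells`, `keep` and `prefix`, and the prefix-based filter in stage (2) are all assumptions chosen for illustration, since the abstract does not say which filtering techniques are used. The sketch also makes no attempt to reproduce the constant-time candidate lookup.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (an assumed metric, for illustration only)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_voronoi_index(vectors, pivots):
    """Assign every vector id to the cell of its nearest pivot."""
    cells = {p: [] for p in range(len(pivots))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(pivots)), key=lambda p: sq_dist(vec, pivots[p]))
        cells[nearest].append(vid)
    return cells


def search(query, vectors, pivots, cells, k=10, n_cells=3, keep=50, prefix=8):
    """Approximate kNN: candidate cells, cheap filter, exact re-ranking."""
    # (1) Candidate set: ids stored in the cells whose pivots are closest to the query.
    order = sorted(range(len(pivots)), key=lambda p: sq_dist(query, pivots[p]))
    candidates = [vid for p in order[:n_cells] for vid in cells[p]]

    # (2) Filtering: keep the `keep` candidates that look best under a cheap
    #     distance on the first `prefix` coordinates. This is a hypothetical
    #     stand-in for the unspecified filtering techniques.
    candidates.sort(key=lambda vid: sq_dist(query[:prefix], vectors[vid][:prefix]))
    survivors = candidates[:keep]

    # (3) Re-ranking: order the survivors by the full distance to the query.
    survivors.sort(key=lambda vid: sq_dist(query, vectors[vid]))
    return survivors[:k]
```

Calling `build_voronoi_index` once and then `search(query, vectors, pivots, cells, k=10)` per query returns up to `k` vector ids. Raising `n_cells` or `keep` would be expected to improve recall at the cost of latency, which is the kind of trade-off the abstract's recall and throughput figures describe.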
Original language: English
Title: Similarity Search and Applications - 16th International Conference, SISAP 2023, Proceedings
Editors: Oscar Pedreira, Vladimir Estivill-Castro
Number of pages: 9
Place of publication: Cham
Publisher: Springer
Publication date: 2023
Pages: 300-308
ISBN (Print): 978-3-031-46993-0
ISBN (Electronic): 978-3-031-46994-7
DOI
Status: Published - 2023
Event: 16th International Conference on Similarity Search and Applications - Universidade da Coruña, A Coruña, Spain
Duration: 9 Oct 2023 to 11 Oct 2023
Conference number: 16

Conference

Conference: 16th International Conference on Similarity Search and Applications
Number: 16
Location: Universidade da Coruña
Country/Territory: Spain
City: A Coruña
Period: 09/10/2023 to 11/10/2023
Name: Lecture Notes in Computer Science
Volume: 14289
ISSN: 0302-9743
