Abstract
Recent advances in cross-modal multimedia data analysis require efficient similarity search on the scale of hundreds of millions of high-dimensional vectors. We address this task by proposing the CRANBERRY algorithm, which combines and tunes several existing similarity search strategies. In particular, the algorithm: (1) employs Voronoi partitioning to obtain a query-relevant candidate set in constant time, (2) applies filtering techniques to prune the obtained candidates significantly, and (3) re-ranks the retained candidate vectors with respect to the query vector. Applied to a dataset of 100 million 768-dimensional vectors, the algorithm evaluates 10NN queries with 90% recall and an average query latency of 1.2 s, with a throughput of 15 queries per second on a server with a 56-core CPU and 4.7 queries per second on a PC.
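The three stages listed in the abstract can be illustrated with a small, generic partition-filter-re-rank sketch in Python with NumPy. This is not the CRANBERRY implementation: the abstract does not specify how cell centers are chosen, which filters are used, or how many cells are probed. The randomly sampled centers, the random-projection sign sketches compared by Hamming distance, and the names `build_index`, `knn_query`, `n_probe`, and `keep_fraction` are all assumptions made for illustration. The sketch also scans every cell center per query, so it does not reproduce the constant-time candidate lookup the abstract describes.

```python
import numpy as np


def sq_dists(a, b):
    """Squared Euclidean distances between every row of a and every row of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)


def build_index(data, n_cells=10, n_bits=32, seed=0):
    """Build a toy index: Voronoi cells plus one binary sketch per vector."""
    rng = np.random.default_rng(seed)
    # Assumption: cell centers are randomly sampled data points.
    centers = data[rng.choice(len(data), size=n_cells, replace=False)]
    nearest = sq_dists(data, centers).argmin(axis=1)
    cells = [np.flatnonzero(nearest == c) for c in range(n_cells)]
    # Assumption: sketches are the sign bits of random projections.
    planes = rng.standard_normal((data.shape[1], n_bits))
    sketches = (data @ planes) > 0
    return {"data": data, "centers": centers, "cells": cells,
            "planes": planes, "sketches": sketches}


def knn_query(index, query, k=10, n_probe=3, keep_fraction=0.5):
    """Return (ids, distances) of approximate k nearest neighbours of query."""
    # Stage 1: candidate set = members of the n_probe closest Voronoi cells.
    center_d = sq_dists(query[None, :], index["centers"])[0]
    probe = np.argsort(center_d)[:n_probe]
    candidates = np.concatenate([index["cells"][c] for c in probe])

    # Stage 2: filtering = keep the candidates whose sketches are closest
    # to the query sketch in Hamming distance (cheap, approximate).
    q_sketch = (query @ index["planes"]) > 0
    hamming = (index["sketches"][candidates] != q_sketch).sum(axis=1)
    keep = max(k, int(len(candidates) * keep_fraction))
    survivors = candidates[np.argsort(hamming)[:keep]]

    # Stage 3: re-ranking = exact distances on the survivors only.
    exact = sq_dists(query[None, :], index["data"][survivors])[0]
    order = np.argsort(exact)[:k]
    return survivors[order], np.sqrt(exact[order])
```

A call such as `ids, dists = knn_query(build_index(vectors), vectors[0], k=10)` returns at most `k` identifiers ordered by exact distance. Raising `n_probe` or `keep_fraction` trades query time for recall, which is the kind of tuning the abstract refers to.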
Original language | English |
---|---|
Title | Similarity Search and Applications - 16th International Conference, SISAP 2023, Proceedings |
Editors | Oscar Pedreira, Vladimir Estivill-Castro |
Number of pages | 9 |
Place of publication | Cham |
Publisher | Springer |
Publication date | 2023 |
Pages | 300-308 |
ISBN (Print) | 978-3-031-46993-0 |
ISBN (Electronic) | 978-3-031-46994-7 |
DOI | |
Status | Published - 2023 |
Event | 16th International Conference on Similarity Search and Applications - Universidade da Coruña, A Coruña, Spain. Duration: 9 Oct 2023 → 11 Oct 2023. Conference number: 16 |
Conference
Conference | 16th International Conference on Similarity Search and Applications |
---|---|
Number | 16 |
Location | Universidade da Coruña |
Country/Territory | Spain |
City | A Coruña |
Period | 09/10/2023 → 11/10/2023 |
Name | Lecture Notes in Computer Science |
---|---|
Volume | 14289 |
ISSN | 0302-9743 |