Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/181650
Title: Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization
Authors: Gou, Yutong
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Gou, Y. (2024). Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181650
Abstract: Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space has a broad range of applications. Among existing ANN algorithms, graph-based methods currently show superior performance in terms of the time-accuracy trade-off. Yet, they have performance bottlenecks due to the random memory accesses caused by the searching process on the graph index and the cost of computing exact distances between high-dimensional vectors. To relieve the aforementioned issues, a recent method named NGT-QG makes an initial attempt by integrating quantization and graph. It (1) replicates and stores the quantization codes of a vertex's neighbors compactly so that they can be accessed sequentially; and (2) uses a SIMD-based implementation named FastScan to efficiently estimate distances based on the quantization codes in batch for guiding the searching process. While NGT-QG achieves promising improvements over the vanilla graph-based methods, it has not fully unleashed the potential of integrating quantization and graph. For instance, it entails a re-ranking step to compute exact distances at the end, which introduces extra random memory accesses; its graph structure is not jointly designed considering the in-batch nature of FastScan, which causes wastes of computation in searching. In this work, we present a new method named SymphonyQG, which targets more symphonious integration of quantization and graph (e.g., it avoids an explicit re-ranking step and refines the graph structure to be more aligned with FastScan). Based on our experimental results on several real-world datasets, SymphonyQG establishes a new state of the art in terms of the query performance.
URI: https://hdl.handle.net/10356/181650
DOI: 10.32657/10356/181650
Schools: College of Computing and Data Science 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:CCDS Theses

Files in This Item:
File Description SizeFormat 
MENG_Thesis.pdf5.27 MBAdobe PDFView/Open

Page view(s)

262
Updated on May 7, 2025

Download(s) 20

346
Updated on May 7, 2025

Google ScholarTM

Check

Altmetric


Plumx

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.