Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/152805
Title: Moving towards centers : re-ranking with attention and memory for re-identification
Authors: Zhou, Yunhao
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Zhou, Y. (2021). Moving towards centers : re-ranking with attention and memory for re-identification. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/152805
Abstract: Re-Identification (Re-ID) is a fundamental computer vision task that associates targets, such as humans or vehicles, captured by multiple non-overlapping cameras. After the initial re-ID result is obtained, re-ranking boosts retrieval performance using contextual information in the top-ranked samples. Current re-ranking approaches focus on hand-crafted rules, which generalize well on small re-ID benchmarks. However, they cannot handle complex relationships between the probe image and the retrieved samples. This inherent deficiency leads to unsatisfactory results when dealing with massive data, which is unavoidable in real-world scenarios. To eliminate the reliance on polishing hand-designed algorithms, this work proposes a deep learning-based re-ranking network that predicts the correlations between images and their local neighbors. Specifically, the feature embeddings of all query and gallery images are expanded and enhanced by a linear combination of their neighbors, with the predicted correlations serving as discriminative combination weights. The combination process is equivalent to moving independent embeddings toward the identity centers, improving cluster compactness. For correlation prediction, we first aggregate the contextual information for the probe's k-nearest neighbors via a Transformer encoder. Then, we distill and refine the probe-related features into the Contextual Memory cell via an attention mechanism. Like humans, who retrieve images by considering not only the probe images but also remembering the ones already retrieved, the Contextual Memory produces multi-view descriptions for each instance. Finally, the neighbors are reconstructed with features fetched from the Contextual Memory, and a binary classifier predicts their correlations with the probe. Experiments on six widely used person and vehicle re-ID benchmarks demonstrate the effectiveness of the proposed method. In particular, our method surpasses state-of-the-art re-ranking approaches on large-scale datasets by a significant margin, with average improvements of 3.08% in CMC@1 and 7.46% in mAP on the VERI-Wild, MSMT17, and VehicleID datasets.
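The feature-expansion step described in the abstract (each embedding replaced by a weighted combination of its neighbors) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `k` and `temperature` parameters, and above all the weights are assumptions. The thesis predicts the weights with a learned network (Transformer encoder, Contextual Memory, binary classifier); the sketch substitutes a plain softmax over cosine similarity purely to show how the combination pulls embeddings of the same identity closer together.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def move_towards_centers(embeddings, k=2, temperature=0.1):
    """Replace each embedding by a weighted combination of its nearest entries.

    ASSUMPTION: the weights are a softmax over cosine similarity. The thesis
    uses a learned correlation predictor instead, which is not reproduced here.
    """
    feats = [l2_normalize(e) for e in embeddings]
    enhanced = []
    for probe in feats:
        sims = [cosine(probe, other) for other in feats]
        # Top k + 1 entries by similarity; this normally includes the probe itself.
        order = sorted(range(len(feats)), key=lambda j: sims[j], reverse=True)
        neighbours = order[: k + 1]
        # Stand-in combination weights (hypothetical, see the docstring).
        scores = [math.exp(sims[j] / temperature) for j in neighbours]
        total = sum(scores)
        combined = [0.0] * len(probe)
        for j, score in zip(neighbours, scores):
            for d in range(len(probe)):
                combined[d] += (score / total) * feats[j][d]
        enhanced.append(l2_normalize(combined))
    return enhanced


# Toy data: two loose clusters standing in for two identities.
toy = [[1.0, 0.1], [1.0, -0.1], [0.9, 0.0], [0.1, 1.0], [-0.1, 1.0], [0.0, 0.9]]
moved = move_towards_centers(toy, k=2)
```

On the toy data, the similarity between the first two embeddings is higher after the update than before, which is the "improved cluster compactness" effect the abstract refers to. How well this works in practice depends entirely on the quality of the weights, which is the part the thesis learns.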
URI: https://hdl.handle.net/10356/152805
DOI: 10.32657/10356/152805
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: Moving Towards Centers.pdf
Size: 4.2 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.