Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/165975
Title: Visual localization at NTU campus
Authors: Abhinaya, Kesarimangalam Srinivasan
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2023
Publisher: Nanyang Technological University
Source: Abhinaya, K. S. (2023). Visual localization at NTU campus. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165975
Abstract: Visual localization is a key problem in computer vision applications such as augmented reality and autonomous driving. Its major challenges include varying weather conditions, dynamic foregrounds, and varying viewpoints, as seen in environments with dynamic objects such as the Nanyang Technological University (NTU) campus. Some efficient methods for representing images in the Visual Place Recognition task, such as Fisher Vectors (FV), Scale-Invariant Feature Transform (SIFT), and Vector of Locally Aggregated Descriptors (VLAD), can handle some of these challenges. Although VLAD provides a rich and effective method for image storage and retrieval, it models a static function. NetVLAD modifies it into a trainable function that minimizes the Euclidean distance between the query and the correct positive image, and it is used as the baseline in this work. Soft assignment to clusters makes NetVLAD readily pluggable into Convolutional Neural Network architectures for end-to-end training. Instead of the uniform pooling used in NetVLAD, Attention Pyramid Pooling of Salient Visual Residuals (APPSVR) uses attention, generated from semantic segmentation, to de-prioritize task-irrelevant features. Three levels of attention, in the form of local integration, global integration, and parametric pooling, handle task-irrelevant features, contextual information, and weighting between clusters respectively. This paper studies the effect of semantic segmentation on visual localization and evaluates NetVLAD and APPSVR as potential solutions for visual localization in an indoor location like the NTU campus. Using semantic information to generate attention is shown to be helpful, increasing the Recall@1 rate from 0.8381 to 0.8563.
URI: https://hdl.handle.net/10356/165975
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
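
Illustrative note on the abstract: the soft assignment that makes VLAD-style aggregation differentiable can be sketched in a few lines. The following is a minimal sketch and not the report's implementation. It assumes fixed cluster centres and a hypothetical sharpness parameter `alpha`, whereas NetVLAD learns its assignment weights and also applies per-cluster normalization, which is omitted here. The function names `soft_assign` and `soft_vlad` are made up for this illustration.

```python
import math


def soft_assign(descriptor, centers, alpha=10.0):
    """Softmax over negative scaled squared distances to each centre.

    alpha is a hypothetical sharpness parameter; a large alpha approaches
    the hard (nearest-centre) assignment used by plain VLAD.
    """
    logits = [
        -alpha * sum((d - c) ** 2 for d, c in zip(descriptor, center))
        for center in centers
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vlad(descriptors, centers, alpha=10.0):
    """Aggregate local descriptors into one L2-normalized vector.

    For each cluster k, accumulate the residuals (x - c_k) weighted by the
    soft assignment of x to k, then flatten and L2-normalize the result.
    """
    num_clusters = len(centers)
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in range(num_clusters)]
    for x in descriptors:
        weights = soft_assign(x, centers, alpha)
        for k in range(num_clusters):
            for j in range(dim):
                residual_sums[k][j] += weights[k] * (x[j] - centers[k][j])
    flat = [value for row in residual_sums for value in row]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat


# Toy data: two 2-D local descriptors and two cluster centres.
toy_centers = [[0.0, 0.0], [1.0, 1.0]]
toy_descriptors = [[0.1, 0.0], [0.9, 1.0]]
toy_vector = soft_vlad(toy_descriptors, toy_centers)
```

Because every step is a smooth function of the descriptors and centres, the same computation can in principle sit on top of a CNN and be trained end to end. Retrieval then compares such vectors by Euclidean distance, and Recall@1 counts the queries whose nearest database vector is a correct match.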

Files in This Item:
File: Kesarimangalam-Srinivasan-Abhinaya_SCSE22-0277_Amended Report.pdf
Description: Restricted Access
Size: 13.11 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.