Geometric hashing for camera based localization
Tan, Wei Chian
Date of Issue: 2012
School of Computer Engineering
Centre for Multimedia and Network Technology
The most popular localization system today, the Global Positioning System (GPS), is well known for its limitations in high-rise urban areas, where it is difficult to establish line of sight to multiple GPS satellites. A vision- or camera-based localization system is an interesting alternative. A camera-based localization system has been established previously, assuming that the only available prior information is a two-dimensional (2D) plan view of a city region. Given a query image taken in the same city region, the basic approach is to establish correspondence between the query image and the 2D map, based on a new feature called the Vertical Corner Line Hypothesis (VCLH), a hypothesis of a vertical building corner in an image. A VCLH is characterized by the position of the vertical line (building corner) in the image and the orientations of the neighbouring plane normals. The set of VCLHs extracted from the input image is called the VCLH signature. Matching is performed using Random Sample Consensus (RANSAC) to identify the camera position on the 2D map whose VCLH signature is closest to that of the input image. However, the search for the best camera location is computationally expensive. Hence, this project aims to develop a speedup framework to solve the problem. Geometric Hashing, i.e. hashing based on geometric information such as keypoints invariant to translation and rotation, is employed in this project. A Geometric Hashing system based on VCLH is developed. The system is similar to the original framework. During pre-processing, the VCLH signature obtained from each camera pose in the 2D map is quantized into bins and inserted into a 2D hash table. During online retrieval, the query signature is quantized and a voting mechanism is used to identify good candidates (camera poses) for further verification.
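The quantize-and-vote idea behind the VCLH hash table can be illustrated with a minimal Python sketch. This is not the thesis implementation (which is in Matlab). It assumes, purely for illustration, that a VCLH is reduced to a pair (normalised image position in [0, 1), one plane-normal angle in radians), and the bin counts, function names and pose labels are all hypothetical.

```python
from collections import Counter, defaultdict
import math

N_POS_BINS = 16  # assumed number of bins along the image-position axis
N_ANG_BINS = 12  # assumed number of bins along the orientation axis


def quantize(vclh):
    """Map an assumed (position, angle) pair to a 2D bin index.

    position is taken to be normalised to [0, 1); the angle is wrapped
    into [0, pi) before binning.
    """
    pos, ang = vclh
    p = min(int(pos * N_POS_BINS), N_POS_BINS - 1)
    a = int((ang % math.pi) / math.pi * N_ANG_BINS) % N_ANG_BINS
    return p, a


def build_table(signatures):
    """Pre-processing: insert every pose's signature into the 2D table.

    signatures maps a pose id to a list of (position, angle) pairs.
    """
    table = defaultdict(set)
    for pose_id, signature in signatures.items():
        for vclh in signature:
            table[quantize(vclh)].add(pose_id)
    return table


def shortlist(table, query_signature, k=3):
    """Online retrieval: each occupied query bin votes for the poses in it."""
    votes = Counter()
    for key in {quantize(v) for v in query_signature}:
        for pose_id in table.get(key, ()):
            votes[pose_id] += 1
    return [pose_id for pose_id, _ in votes.most_common(k)]


# Toy data: three hypothetical camera poses and a noisy view from pose "A".
signatures = {
    "A": [(0.10, 0.2), (0.55, 1.0), (0.90, 2.5)],
    "B": [(0.30, 0.2), (0.70, 1.9)],
    "C": [(0.12, 0.25), (0.80, 3.0)],
}
table = build_table(signatures)
query = [(0.11, 0.21), (0.56, 1.02), (0.91, 2.55)]
candidates = shortlist(table, query, k=2)
```

Only the poses that share bins with the query are ever touched, which is where the speed gain over a direct search comes from. The shortlisted poses would still need a separate verification step, as the abstract notes.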
Experiments show that significant speed gains have been obtained through the system, from a direct search of ten minutes to the removal of a large pool of candidates within ten seconds in our MATLAB implementation. However, the accuracy is poor: none of the shortlists returned for the query locations includes the ground truth, either with or without Geometric Hashing. This is mainly due to poor detection results for the VCLH feature. Next, the concepts of connectivity and omnidirectional views are introduced to the system and incorporated into a feature known as Structural Fragments (SF). Connectivity refers to a plane facade between two VCLHs and corresponds to a straight line in the 2D plan view. An SF is characterized by a set of VCLHs (points in the 2D plan view) and the relevant connectivity information. During pre-processing, the outline of each building in the 2D map is quantized into bins by taking any two building corners as the basis to establish a coordinate system. The process is repeated for all possible basis pairs. During online retrieval, given an SF, a similar coordinate system is established and a voting mechanism is used to retrieve close candidates. This avoids the need to compare the query SF with every possible SF in the 2D map (an improvement from linear to sublinear search). In addition, SF provides a further speed gain because the framework no longer requires dense sampling of camera viewpoints during pre-processing. Experiments show a significant improvement in accuracy, with much greater speed gains. A uniqueness test was also carried out to investigate how discriminative the signatures are theoretically. Experiments reveal that SF signatures are surprisingly unique. In short, the aim of speeding up the search has been achieved. The Geometric Hashing system developed gives good performance. However, there are still some problems with both VCLH and SF detection.
A next step could be to look into improving feature detection, or into more advanced speedup techniques such as Locality Sensitive Hashing (LSH) and Randomized Trees or Forests, for better performance.
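The basis-pair procedure described for Structural Fragments follows the general pattern of classical geometric hashing, which can be sketched in Python as below. This is a simplified illustration rather than the thesis method: it hashes only corner points, ignores the connectivity information, assumes all corners are distinct, and assumes a basis frame that also normalises scale. The bin width, function names and building outlines are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

BIN = 0.25  # assumed bin width, in basis-normalised units


def to_basis(p, b0, b1):
    """Coordinates of p in the frame where b0 maps to (0, 0) and b1 to (1, 0).

    The result is unchanged by translating, rotating or uniformly scaling
    all three points together. b0 and b1 must be distinct.
    """
    ex, ey = b1[0] - b0[0], b1[1] - b0[1]
    n2 = ex * ex + ey * ey
    dx, dy = p[0] - b0[0], p[1] - b0[1]
    return ((dx * ex + dy * ey) / n2, (dy * ex - dx * ey) / n2)


def quantize(uv):
    """Round basis coordinates to the nearest bin index."""
    return (round(uv[0] / BIN), round(uv[1] / BIN))


def build_table(buildings):
    """Pre-processing: hash every outline under every ordered corner pair."""
    table = defaultdict(set)
    for name, corners in buildings.items():
        for i, j in permutations(range(len(corners)), 2):
            for k, p in enumerate(corners):
                if k not in (i, j):
                    key = quantize(to_basis(p, corners[i], corners[j]))
                    table[key].add((name, i, j))
    return table


def vote(table, fragment, basis=(0, 1)):
    """Online retrieval: one query basis, each remaining point casts votes."""
    i, j = basis
    votes = Counter()
    for k, p in enumerate(fragment):
        if k not in (i, j):
            key = quantize(to_basis(p, fragment[i], fragment[j]))
            votes.update(table.get(key, ()))
    return votes.most_common()


# Two hypothetical building outlines from a 2D plan view.
buildings = {
    "square": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "lshape": [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)],
}
table = build_table(buildings)

# The square rotated by 90 degrees, scaled by 1.5 and shifted by (10, 5).
fragment = [(10, 5), (10, 8), (7, 8), (7, 5)]
ranking = vote(table, fragment)
```

Because the query only looks up the bins its own points fall into, it never compares against every stored outline, which matches the linear-to-sublinear improvement claimed in the abstract. The cost is a pre-processing table that grows with the number of basis pairs per building.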
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision