      Distance learning between image and class for object recognition

      File
      main_thesis_final_soft.pdf (3.464Mb)
      Author
      Wang, Zhengxiang
      Date of Issue
      2013
      School
      School of Computer Engineering
      Research Centre
      Centre for Multimedia and Network Technology
      Abstract
      Object recognition is an active research topic in the computer vision community. The recently proposed Image-to-Class (I2C) distance addresses this problem by classifying images with a simple Naive-Bayes Nearest-Neighbor (NBNN) classifier, yet it performs surprisingly well. The I2C distance avoids feature quantization and generalizes better than the traditional Image-to-Image (I2I) distance. However, it is expensive to compute, because its performance relies heavily on searching for the nearest neighbor (NN) of each local feature among a large number of training features. It also does not fully use the label information of the training data, which limits its recognition performance. In this thesis, we aim to improve both the recognition performance and the efficiency of the I2C distance, and to extend its field of application. First, we add a training phase that improves recognition performance by learning a weighted I2C distance. We propose a large-margin optimization framework to learn the I2C distance function, which is modeled as a weighted combination of the distances from every local feature in an image to its NN in a candidate class. The weights, which are associated with the local features in the training set, are learned under the constraint that the I2C distance from an image to the class it belongs to should be smaller than its distance to any other class. To reduce the computation cost, we also propose two methods, based on spatial division and on a hubness score, that accelerate the NN search. They greatly reduce online testing time while preserving or even improving classification accuracy. Second, we propose a distance metric learning method that further improves the I2C distance by learning Per-Class Mahalanobis metrics. The resulting Mahalanobis I2C distance adapts to each class through the metric learned for that class. The Per-Class metrics are learned simultaneously by formulating a convex optimization problem, which is solved with an efficient subgradient descent method. For efficiency and scalability to large-scale problems, we also show how to simplify the method so that it learns a diagonal matrix for each class. Third, we extend object recognition to the multi-label problem and propose a Class-to-Image (C2I) distance, which performs better than the I2C distance for multi-label image classification. However, since a class contains far more local features than a single image, the C2I distance is more expensive to compute than the I2C distance. Moreover, the label information of the training images can help select the relevant local features for each class and thereby further improve recognition performance. Therefore, to make the C2I distance both faster and more accurate, we propose an optimization algorithm that learns the C2I distance using L_1-norm regularization and a large-margin constraint. It reduces the number of local features in the class feature set and, because it uses label information, also improves the performance of the C2I distance. We also use the C2I distance for object localization, so that it can tell not only whether a candidate class appears in a test image, but also where it is located. Together, these three contributions improve the recognition performance and efficiency of the I2C distance and make it applicable to the multi-label problem, so the learned distance between image and class becomes more practical for real-world object recognition applications.
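      As a rough illustration of the baseline the abstract builds on, the I2C distance can be sketched as the sum, over the local features of an image, of the squared distance from each feature to its nearest neighbor among the features of a class. The code below is a minimal brute-force sketch, not the thesis implementation: the function names (i2c_distance, nbnn_classify), the array layout, and the optional per-class metric argument are assumptions made for illustration. In particular, searching the nearest neighbor under the same metric that scores it is an assumption, and the learned weights, the NN-search acceleration, and the C2I distance are not shown.

```python
import numpy as np


def i2c_distance(image_feats, class_feats, metric=None):
    """Image-to-Class distance (illustrative sketch).

    image_feats: (n, d) array of local features from one image.
    class_feats: (m, d) array of local features pooled from a class.
    metric: optional (d, d) positive semidefinite matrix M; when given,
            squared distances are (x - y)^T M (x - y) instead of Euclidean.
    """
    # Pairwise differences between every image feature and class feature.
    diff = image_feats[:, None, :] - class_feats[None, :, :]  # (n, m, d)
    if metric is None:
        sq = np.einsum("nmd,nmd->nm", diff, diff)
    else:
        sq = np.einsum("nmd,de,nme->nm", diff, metric, diff)
    # Nearest neighbor in the class for each image feature, then sum.
    return float(sq.min(axis=1).sum())


def nbnn_classify(image_feats, class_feature_sets, metrics=None):
    """Assign the class with the smallest I2C distance.

    class_feature_sets: dict mapping class label -> (m, d) feature array.
    metrics: optional dict mapping class label -> (d, d) per-class metric.
    Returns the predicted label and the dict of distances per class.
    """
    dists = {}
    for label, feats in class_feature_sets.items():
        m = None if metrics is None else metrics[label]
        dists[label] = i2c_distance(image_feats, feats, m)
    return min(dists, key=dists.get), dists
```

      Passing a diagonal matrix per class in metrics corresponds to the simplified diagonal case mentioned in the abstract. The brute-force search builds an (n, m, d) array, so it only suits small feature sets.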
      Subject
      DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
      Type
      Thesis
      Collections
      • Theses and Dissertations (Submission before August 2018)
      https://doi.org/10.32657/10356/54819
      NTU Library, Nanyang Avenue, Singapore 639798 © 2011 Nanyang Technological University. All rights reserved.
      DSpace software copyright © 2002-2015  DuraSpace