Title: Hierarchical feature learning for image categorization
Authors: Zuo, Zhen
Keywords: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2015
Source: Zuo, Z. (2015). Hierarchical feature learning for image categorization. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Extracting informative, robust, and compact data representations (features) is considered one of the key factors for good performance in computer vision, and image categorization is one of the most fundamental computer vision problems. Traditionally, hand-crafted features such as SIFT and HOG have been widely used; however, these features cannot adapt to data and must be carefully designed. In contrast, feature learning methods encode data-adaptive information in the data representation; they have outperformed hand-crafted features by large margins and brought rapid progress in image categorization. The goal of this thesis is to present various feature learning architectures for the problem of object/scene image categorization. In the first part of the thesis, a discriminative hierarchical feature learning framework will be presented for object image categorization. This work aims to learn non-linear transformation matrices that transform image patches into local features. Features learned by current unsupervised learning methods can hardly capture the differences between classes, which are crucial for object categorization. To capture such information, a discriminative constraint is proposed that forces local feature patches extracted from the same category to be locally similar, while local feature patches from different classes are forced to be separable. In the second part, discriminative and shareable information will be encoded in features for scene image categorization. Unlike object images, scene images do not have a clear foreground/background. Some patterns are shared among several classes and some are class-specific, while others represent noisy data, which is not helpful and should be excluded.
In order to encode such information, an exemplar-based deep discriminative and shareable feature learning framework will be proposed to learn compact filter banks and hierarchically transform local image patches into features. In the third part, a class of end-to-end neural networks, called convolutional and hierarchical recurrent neural networks (C-HRNNs), will be presented for large-scale object/scene image categorization. In existing convolutional neural networks (CNNs), both convolution and pooling are performed locally on separate image regions, and no contextual dependencies between different image regions are taken into consideration. Such dependencies represent useful spatial structure information in images. In contrast, recurrent neural networks (RNNs) are well known for their capability to learn contextual dependencies in sequential data through recurrent (feedback) connections. In this work, C-HRNNs aim to encode both spatial and scale dependencies among different image regions to enhance the global discriminative power of the image representation. CNN layers are first applied to generate middle-level features; HRNN layers are then applied to learn spatial and scale dependencies.
DOI: 10.32657/10356/65659
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File | Description | Size | Format
main_thesis_ready.pdf | PhD thesis | 28.14 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.