Learning with multiple representations : algorithms and applications
Date of Issue: 2014
School of Computer Engineering
Centre for Multimedia and Network Technology
Many visual representations have recently been developed for computer vision applications. Because different types of visual representations capture different kinds of information about the original data, their discriminative ability can vary greatly. Most existing machine learning algorithms are based on a single data representation, so it is increasingly important to develop algorithms that can handle data with multiple representations. This thesis therefore studies the problem of learning with multiple representations. We develop several novel algorithms for data with multiple representations under three different learning scenarios, and we apply the proposed algorithms to several computer vision applications.

First, we study learning with multiple kernels in the fully supervised setting. Based on a hard margin perspective on the dual form of the traditional ℓ1-norm Multiple Kernel Learning (MKL), we introduce a new “kernel slack variable” and propose a Soft Margin framework for Multiple Kernel Learning (SMMKL). Incorporating the hinge loss for the kernel slack variables introduces a new box constraint on the kernel coefficients. The square hinge loss and the square loss soft margin MKLs naturally incorporate the family of elastic-net MKL and ℓ2MKL, respectively. We demonstrate the effectiveness of the proposed algorithms on benchmark data sets as well as several computer vision data sets.

Second, we study learning with multiple kernels for weakly labeled data. Based on “input-output kernels”, we propose a unified Input-output Kernel Learning (IOKL) framework for handling weakly labeled data with multiple representations. Under this framework, general data ambiguity problems such as semi-supervised learning (SSL), multiple instance learning (MIL) and clustering with multiple representations are solved in a unified manner. We formulate the learning problem as a group sparse MKL problem to incorporate the intrinsic group structure among the input-output kernels, and we further develop a group sparse soft margin regularization to improve performance. Promising experimental results on the challenging NUS-WIDE dataset for a computer vision application (text-based image retrieval), on SSL benchmark datasets and on MIL benchmark datasets demonstrate the effectiveness of the proposed IOKL framework.

Third, we study learning with privileged information for distance metric learning, where the distance metric is learnt with extra privileged information that is available in the training data but not in the test data. We propose a novel method called Information-theoretic Metric Learning with Privileged Information (ITML+) to model this learning scenario. We also develop an efficient cyclical projection method, based on analytical solutions for all the variables, to solve the new objective function. The proposed algorithm is applied to face verification and person re-identification in RGB images by learning from RGB-D data. Extensive experiments on the real-world EUROCOM, CurtinFaces and BIWI RGBD-ID datasets demonstrate the effectiveness of the proposed ITML+ algorithm.
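To make the idea of learning with multiple kernels concrete, the following is a minimal sketch of how base kernels computed from different representations might be combined with coefficients on the simplex under a box constraint. It is an illustration only, not the SMMKL algorithm of the thesis: the function names (`rbf_kernel`, `combine_kernels`), the upper bound `theta`, the exact form of the box constraint (0 ≤ μ_m ≤ θ), and the toy data are all assumptions introduced here, and the coefficients are fixed by hand rather than learnt.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_kernels(kernels, mu, theta=1.0):
    """Return sum_m mu[m] * kernels[m].

    Assumed constraints (hypothetical form of a box constraint):
    0 <= mu[m] <= theta for every m, and sum(mu) == 1.
    """
    mu = np.asarray(mu, dtype=float)
    if len(kernels) != len(mu):
        raise ValueError("need one coefficient per base kernel")
    if np.any(mu < 0.0) or np.any(mu > theta):
        raise ValueError("coefficients must lie in [0, theta]")
    if not np.isclose(mu.sum(), 1.0):
        raise ValueError("coefficients must sum to 1")
    return sum(m * K for m, K in zip(mu, kernels))


# Toy data: two hypothetical representations of the same 6 samples.
rng = np.random.default_rng(0)
X_rep1 = rng.normal(size=(6, 4))
X_rep2 = rng.normal(size=(6, 3))

base_kernels = [
    rbf_kernel(X_rep1, gamma=0.5),  # RBF kernel on representation 1
    X_rep2 @ X_rep2.T,              # linear kernel on representation 2
]

# Hand-picked coefficients; an MKL method would learn these from labels.
K = combine_kernels(base_kernels, mu=[0.6, 0.4], theta=0.7)
```

The combined matrix `K` is a nonnegative weighted sum of positive semidefinite matrices and is therefore itself a valid kernel, so it could be passed to any kernel method that accepts a precomputed kernel. With `theta=1.0` the check reduces to the plain simplex constraint; a smaller `theta` prevents any single base kernel from taking all of the weight.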
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition