Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/62186
Full metadata record
DC Field | Value | Language
dc.contributor.author | Li, Wen | en
dc.date.accessioned | 2015-02-25T03:15:39Z | en
dc.date.available | 2015-02-25T03:15:39Z | en
dc.date.copyright | 2014 | en
dc.date.issued | 2014 | en
dc.identifier.citation | Li, W. (2014). Visual recognition by learning from web data. Doctoral thesis, Nanyang Technological University, Singapore. | en
dc.identifier.uri | https://hdl.handle.net/10356/62186 | en
dc.description.abstract | With the rapid development of digital cameras, we have witnessed an explosive growth of digital images. Every day, a tremendous number of images, together with rich contextual information (e.g., tags, categories and captions), are posted to the Internet. There is an increasing interest in exploiting those web images for building intelligent visual recognition systems. While some works have proposed collecting large-scale image datasets by crawling web images from the Internet, considerable human effort is still required to annotate those images to train classifiers for visual recognition. In this thesis, we propose to develop novel learning algorithms for visual recognition by learning from web data, in which we aim to use as little human effort as possible for annotating the training data. First, considering that web images are usually associated with noisy surrounding textual descriptions, we treat the words in the surrounding text as weak labels and formulate the task of learning from web data as a multi-instance learning (MIL) problem. Observing that the relevant images usually contain many true positive images, we generalize the traditional MIL constraint on positive bags so that each positive bag contains at least a certain portion of positive instances. To effectively exploit such constraints on positive bags, we develop a new MIL algorithm called MIL with constrained positive bags (MIL-CPB) for web image retrieval. Observing that the constraints are not always satisfied in MIL-CPB, we propose a progressive scheme to further improve the retrieval performance, in which we iteratively partition the top-ranked training web images from the current MIL-CPB classifier to construct more confident positive bags and then use these new bags as training data to learn the subsequent MIL-CPB classifiers.
Second, when the web training data is represented with multiple views of features, we further propose a co-labeling approach to improve the classifiers learnt from web data by using multiple views of features. We model the learning problem on each view as a weakly labeled learning problem, and use the predicted training labels from the classifier trained on one view to help the classifier on another view. Our co-labeling approach can not only handle the traditional multi-view semi-supervised learning problem, but can also be applied to other multi-view weakly labeled learning problems such as multi-view MIL. Finally, we observe that there are intrinsic differences between the crawled web training data and the testing images in our daily lives, a situation known as the domain adaptation problem. In particular, we study the heterogeneous domain adaptation problem, in which the samples in the source and target domains have different feature representations. We build upon the recent Heterogeneous Feature Augmentation (HFA) method and propose a convex reformulation of HFA, which guarantees a globally optimal solution. We further extend the HFA method to semi-supervised HFA (SHFA), in which we improve the learnt classifiers by exploiting the additional unlabeled data from the target domain. For all our proposed approaches, we conduct extensive experiments on publicly available datasets to demonstrate their effectiveness. | en
dc.format.extent | 152 p. | en
dc.language.iso | en | en
dc.subject | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition | en
dc.title | Visual recognition by learning from web data | en
dc.type | Thesis | en
dc.contributor.supervisor | Xu Dong | en
dc.contributor.school | School of Computer Engineering | en
dc.description.degree | DOCTOR OF PHILOSOPHY (SCE) | en
dc.contributor.research | Centre for Multimedia and Network Technology | en
dc.identifier.doi | 10.32657/10356/62186 | en
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: SCSE Theses
Files in This Item:
File | Description | Size | Format
main_thesis.pdf |  | 4.71 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.