Visual understanding and personalization for an optimal recollection experience
Author
Garcia del Molino, Ana
Date of Issue
2019
School
School of Computer Science and Engineering
Research Centre
Centre for Computational Intelligence
Related Organization
A*STAR
Abstract
The affordability of wearable cameras such as the Narrative Clip and GoPro allows mass-market consumers to continuously record their lives, producing large amounts of unstructured visual data. Moreover, users tend to record more multimedia content with their smartphones than they can possibly share or review. We use each of these devices for a different purpose: action cameras for trips and adventures; smartphones to capture the spur of the moment; lifelogging devices to unobtrusively record all our daily activities. As a result, the few important shots end up buried among many repetitive images or long uninteresting segments, requiring hours of manual analysis to, say, select the highlights of a day or find the most aesthetic pictures.
Tackling challenges in end-to-end consumer video summarization, this thesis contributes to the state of the art in three major aspects: (i) Contextual Event Segmentation (CES), an episodic event segmentation method that detects boundaries between heterogeneous events while ignoring local occlusions and brief diversions. CES improves on the baselines by over 16% in F-measure and is competitive with manual annotations. (ii) Personalized Highlight Detection (PHD), a highlight detector that is personalized via its inputs. The experimental results show that using the user's history substantially improves prediction accuracy: PHD outperforms the user-agnostic baselines even with a single person-specific example. (iii) Active Video Summarization (AVS), an interactive approach to video exploration that gathers the user's preferences while creating a video summary. AVS achieves an excellent compromise between usability and quality, and the diverse and uniform nature of its summaries also makes it a valuable tool for browsing someone else's visual collection. Additionally, this thesis contributes two large-scale datasets for first-person-view video analysis, CSumm and R3, and a large-scale dataset for personalized video highlights, PHD2.
Subject
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Type
Thesis
Collections
Related items
Showing items related by title, author, creator and subject.
- Medical imaging algorithm research for diagnosis of ocular diseases
  Tan, Ngan Meng (2015): Color retinal fundus images provide visual documentation of the health of a person's retina. With the widespread adoption of higher quality medical imaging techniques and data, there are increasing demands for medical ...
- Writing style modelling based on grapheme distributions : application to on-line writer identification
  Tan, Guoxian (2013): The increasingly pervasive spread of mobile digital devices such as smartphones or digital tablets that use digital pens brought about the emergence of a new class of documents: online handwritten documents. The ...
- Efficient feature extraction and classification for staining patterns of HEp-2 cells
  Xu, Xiang (2016): The occurrence of antinuclear antibodies (ANAs) in patient serum has a significant relation to autoimmune diseases. ANA detection can be accomplished via the indirect immunofluorescence (IIF) technique using human epithelial ...