Title: Visual understanding and personalization for an optimal recollection experience
Authors: Ana, Garcia del Molino
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Issue Date: 2019
Source: Ana, G. d. M. (2019). Visual understanding and personalization for an optimal recollection experience. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: The affordability of wearable cameras such as the Narrative Clip and GoPro allows mass-market consumers to continuously record their lives, producing large amounts of unstructured visual data. Moreover, users tend to record more multimedia content with their smartphones than they can possibly share or review. We use each of these devices for different purposes: action cameras for travels and adventures; our smartphones to capture on the spur of the moment; a lifelogging device to record unobtrusively all our daily life activities. As a result, the few important shots end up buried among many repetitive images or uninteresting long segments, requiring hours of manual analysis in order to, say, select highlights in a day or find the most aesthetic pictures. Tackling challenges in end-to-end consumer video summarization, this thesis contributes to the state of the art in three major aspects: (i) Contextual Event Segmentation (CES), an episodic event segmentation method that is able to detect boundaries between heterogeneous events and ignore local occlusions and brief diversions. CES improves the performance of the baselines by over 16% in F-measure, and is competitive with manual annotations. (ii) Personalized Highlight Detection (PHD), a highlight detector that is personalized via its inputs. The experimental results show that using the user history substantially improves the prediction accuracy. PHD outperforms the user-agnostic baselines even with only a single person-specific example. (iii) Active Video Summarization (AVS), an interactive approach to video exploration that gathers the user’s preferences while creating a video summary. AVS achieves an excellent compromise between usability and quality. The diverse and uniform nature of AVS summaries also makes it a valuable tool for browsing someone else’s visual collection.
Additionally, this thesis contributes two large-scale datasets for First Person View video analysis, CSumm and R3, and a large-scale dataset for personalized video highlights, PHD2.
DOI: 10.32657/10356/82932
Schools: School of Computer Science and Engineering 
Organisations: A*STAR
Research Centres: Centre for Computational Intelligence 
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: thesis.pdf
Description: A thesis submitted to the Nanyang Technological University in partial fulfillment of the requirement for the degree of Doctor of Philosophy
Size: 23.75 MB
Format: Adobe PDF

Page view(s): 50
Download(s): 20
Updated on Jun 18, 2024

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.