Title: Deep learning methods for weakly supervised video temporal action localization
Authors: Adipraja Widjaja, Sergi
Keywords: Engineering::Computer science and engineering; Engineering::Electrical and electronic engineering
Issue Date: 2020
Publisher: Nanyang Technological University
Project: A3274-191
Abstract: Deep learning (DL) based methods for analysing dynamic graphical data have become a vital part of emerging technologies. Video and image-based recommendation systems, smart surveillance capabilities, and smart sensors are a few examples of technologies catalysed by DL. However, a growing concern is the increasingly complex annotation required by different DL-based tasks. One such task is video temporal action localization, which requires a multi-step approach to classifying and locating action instances in an untrimmed video. Building an effective video temporal action localization model requires not only video datasets with action labels but also more comprehensive temporal annotations. Unfortunately, this does not reflect how video information is presented on the web, where simple video tags may be used as action labels. Hence, weakly-supervised methods for temporal action localization have quickly gained traction because of their minimal annotation requirement: only class-level action labels are needed for training. In this project, a weakly-supervised temporal action localization method is proposed and developed by combining the merits of neural network modules from past research. The design rationale behind the different neural network components is discussed and justified. In addition, we study the effectiveness of different neural network architectures for the weakly-supervised temporal action localization task. A comprehensive ablation study compares the modules proposed by past works on weakly-supervised temporal action localization.
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
Description: Restricted Access
Size: 3.72 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.