Title: Temporal feature extraction for video-based activity recognition
Authors: Chen, Zhiyang
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Chen, Z. (2022). Temporal feature extraction for video-based activity recognition. Master's thesis, Nanyang Technological University, Singapore.
Abstract: With the development of modern media, video understanding has become an active research topic. Convolutional neural networks (CNNs) have proven very effective for image classification, but simply applying a traditional CNN to video action recognition is not feasible because it cannot learn motion information. In this dissertation, we study the two current mainstream temporal feature extraction methods, two-stream CNNs and 3D CNNs, together with their variants. Our work leads to the following conclusions: (i) 3D CNN models are more prone to overfitting, and a small video dataset is not sufficient to train a deep 3D CNN model; transferring and fine-tuning a pre-trained model helps to address this problem. (ii) The performance of a two-stream CNN can be improved by building interaction features between the two streams' features after a late convolutional layer. (iii) Factorizing 3D convolution into separate 2D and 1D convolutions can boost the performance of a 3D CNN. (iv) Using optical flow as input to a 3D CNN can also improve prediction accuracy.
Schools: School of Electrical and Electronic Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: [CHEN ZHIYANG]-Amended dissertation.pdf
Description: Restricted Access
Size: 3.49 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.