Title: Skeleton-based human activity understanding
Authors: Liu, Jun
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2019
Source: Liu, J. (2019). Skeleton-based human activity understanding. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Human activity understanding is an important research problem due to its relevance to a wide range of applications. Recently, 3D skeleton-based activity analysis has become popular due to its succinctness, robustness, and view-invariant representation. In this thesis, we focus on human activity understanding in 3D skeleton sequences. Recent works have attempted to use recurrent neural networks (RNNs) and long short-term memory (LSTM) networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in 3D skeletal data. As the first work of this thesis, we apply recurrent analysis to the spatial domain as well as the temporal domain, in order to better analyze the hidden sources of action-related information within human skeleton sequences in both domains simultaneously. Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure-based traversal framework is also proposed. To deal with the noise in the skeletal data, a new gating mechanism within the LSTM module is introduced, with which the network can learn the reliability of the sequential input data and accordingly adjust the effect of the input on the update of the long-term context representation stored in the unit's memory cell. Comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method. In skeleton-based action recognition, not all skeletal joints are informative for activity analysis, and the irrelevant joints often bring noise that can degrade performance. Therefore, we need to pay more attention to the informative joints. However, the original LSTM network does not have an explicit attention ability.
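The reliability-gating idea described above can be sketched as an extra multiplicative gate inside a standard LSTM step. This is a minimal illustrative sketch, not the thesis's actual equations: the gate formulation (confidence decaying with the squared error between the observed input and a context-derived prediction `x_pred`) and all names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step_with_trust_gate(x, h_prev, c_prev, params, x_pred):
    """One LSTM step with an extra reliability ('trust') gate.

    Hypothetical formulation: the gate scores how well the observed
    input x matches a prediction x_pred derived from context; a low
    score shrinks the input's contribution to the memory cell, so
    noisy skeletal joints perturb the long-term context less.
    """
    Wi, Wf, Wo, Wg = params["Wi"], params["Wf"], params["Wo"], params["Wg"]
    z = np.concatenate([x, h_prev])
    i = sigmoid(Wi @ z)            # input gate
    f = sigmoid(Wf @ z)            # forget gate
    o = sigmoid(Wo @ z)            # output gate
    g = np.tanh(Wg @ z)            # candidate memory content

    # Trust score in (0, 1]: decays with prediction error (assumption).
    err = np.sum((x - x_pred) ** 2)
    tau = np.exp(-err)

    c = f * c_prev + tau * i * g   # unreliable inputs update the cell less
    h = o * np.tanh(c)
    return h, c
```

With `x_pred == x` the step reduces to an ordinary LSTM update (`tau == 1`); the further the observation deviates from the prediction, the smaller its effect on the memory cell.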
In our second work, we propose a new class of LSTM network, the global context-aware attention LSTM, for skeleton-based action recognition, which is capable of selectively focusing on the informative joints in each frame by using a global context memory cell. The proposed method achieves state-of-the-art performance on five challenging datasets for skeleton-based action recognition. The aforementioned two works focus on action recognition in well-segmented skeleton sequences, in which each sequence contains one action sample whose class we need to recognize. In the third work, we focus on online action prediction in untrimmed streaming skeleton data, in which each sequence contains multiple action samples and we need to recognize the class label of the ongoing activity when only a part of it has been observed. A dilated convolutional network is introduced to model the motion dynamics in the temporal dimension via a sliding window over the temporal axis for online action prediction. As there are significant temporal scale variations in the observed part of the ongoing action at different time steps, a novel window scale selection method is proposed, which makes the network focus on the performed part of the ongoing action and suppress possible interference from previously performed actions. The proposed approach is evaluated on four challenging datasets, and the extensive experiments demonstrate the effectiveness of the proposed method for skeleton-based online action prediction.
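The online-prediction setup above combines two simple ingredients: a causal dilated convolution over time, and a sliding window whose scale can be shrunk to exclude frames belonging to a previous action. The sketch below illustrates both on a 1-D feature track; the function names and the single-channel setting are assumptions for illustration, not the thesis's actual architecture.

```python
import numpy as np

def dilated_causal_conv1d(seq, kernel, dilation):
    """Dilated causal 1-D convolution (minimal sketch).

    seq:    (T,) feature values over time
    kernel: (K,) filter taps
    Output y[t] depends only on frames t, t-d, t-2d, ... (causal), so it
    can be evaluated online as new frames stream in.
    """
    T, K = len(seq), len(kernel)
    y = np.zeros(T)
    for t in range(T):
        for k in range(K):
            idx = t - k * dilation
            if idx >= 0:
                y[t] += kernel[k] * seq[idx]
    return y

def online_window(seq, t, window):
    """Sliding window over the temporal axis, ending at current step t.

    Choosing a smaller `window` focuses the model on the ongoing action
    and drops frames from a previous action (the scale-selection idea).
    """
    start = max(0, t - window + 1)
    return seq[start:t + 1]
```

At each time step the observed stream is cropped with `online_window` at a selected scale and the convolution is run over that crop, so the receptive field covers the performed part of the current action only.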
DOI: 10.32657/10220/49510
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:EEE Theses

Files in This Item:
File: thesis_LiuJun.pdf (8.21 MB, Adobe PDF)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.