Mining actionlet ensemble for action recognition with depth cameras
Author
Wang, Jiang
Liu, Zicheng
Wu, Ying
Yuan, Junsong
Date of Issue
2012
Conference Name
IEEE Conference on Computer Vision and Pattern Recognition (2012 : Providence, Rhode Island, US)
School
School of Electrical and Electronic Engineering
Version
Accepted version
Abstract
Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities for dealing with this problem but also present some unique challenges. The depth maps captured by depth cameras are very noisy, and the 3D positions of the tracked joints may be completely wrong when serious occlusions occur, which increases the intra-class variation in the actions. In this paper, an actionlet ensemble model is learnt to represent each action and to capture the intra-class variance. In addition, novel features that are suitable for depth data are proposed. They are robust to noise, invariant to translational and temporal misalignments, and capable of characterizing both the human motion and the human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and on another dataset captured by a MoCap system. The experimental evaluations show that the proposed approach achieves superior performance to state-of-the-art algorithms.
Subject
DRNTU::Engineering::Electrical and electronic engineering
Type
Conference Paper
Rights
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/CVPR.2012.6247813].
DOI
http://dx.doi.org/10.1109/CVPR.2012.6247813