Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/173795
Field | Value |
---|---|
Title | Towards efficient video-based action recognition: context-aware memory attention network |
Authors | Koh, Thean Chun; Yeo, Chai Kiat; Jing, Xuan; Sivadas, Sunil |
Keywords | Computer and Information Science |
Issue Date | 2023 |
Source | Koh, T. C., Yeo, C. K., Jing, X. & Sivadas, S. (2023). Towards efficient video-based action recognition: context-aware memory attention network. SN Applied Sciences, 5(12). https://dx.doi.org/10.1007/s42452-023-05568-5 |
Journal | SN Applied Sciences |
Abstract | Given the prevalence of surveillance cameras in our daily lives, human action recognition from videos holds significant practical applications. A persistent challenge in this field is to develop more efficient models capable of real-time recognition with high accuracy for widespread implementation. In this research paper, we introduce a novel human action recognition model named Context-Aware Memory Attention Network (CAMA-Net), which eliminates the need for optical flow extraction and 3D convolution which are computationally intensive. By removing these components, CAMA-Net achieves superior efficiency compared to many existing approaches in terms of computation efficiency. A pivotal component of CAMA-Net is the Context-Aware Memory Attention Module, an attention module that computes the relevance score between key-value pairs obtained from the 2D ResNet backbone. This process establishes correspondences between video frames. To validate our method, we conduct experiments on four well-known action recognition datasets: ActivityNet, Diving48, HMDB51 and UCF101. The experimental results convincingly demonstrate the effectiveness of our proposed model, surpassing the performance of existing 2D-CNN based baseline models. Article Highlights: Recent human action recognition models are not yet ready for practical applications due to high computation needs. We propose a 2D CNN-based human action recognition method to reduce the computation load. The proposed method achieves competitive performance compared to most SOTA 2D CNN-based methods on public datasets. |
URI | https://hdl.handle.net/10356/173795 |
ISSN | 2523-3971 |
DOI | 10.1007/s42452-023-05568-5 |
Schools | School of Computer Science and Engineering |
Organisations | NCS Pte Ltd, Singapore |
Rights | © The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
Fulltext Permission | open |
Fulltext Availability | With Fulltext |
Appears in Collections | SCSE Journal Articles |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
s42452-023-05568-5.pdf | | 1.06 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.