Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/164107
Title: Entropy guided attention network for weakly-supervised action localization
Authors: Cheng, Yi
Sun, Ying
Fan, Hehe
Zhuo, Tao
Lim, Joo-Hwee
Kankanhalli, Mohan
Keywords: Engineering::Computer science and engineering
Issue Date: 2022
Source: Cheng, Y., Sun, Y., Fan, H., Zhuo, T., Lim, J. & Kankanhalli, M. (2022). Entropy guided attention network for weakly-supervised action localization. Pattern Recognition, 129, 108718. https://dx.doi.org/10.1016/j.patcog.2022.108718
Project: A18A2b0046
Journal: Pattern Recognition
Abstract: One major challenge of Weakly-supervised Temporal Action Localization (WTAL) is to handle diverse backgrounds in videos. To model background frames, most existing methods treat them as an additional action class. However, because background frames usually do not share common semantics, squeezing all the different background frames into a single class hinders network optimization. Moreover, the network becomes confused and tends to fail when tested on videos with unseen background frames. To address this problem, we propose an Entropy Guided Attention Network (EGA-Net) to treat background frames as out-of-domain samples. Specifically, we design a two-branch module, where a domain branch detects whether a frame belongs to an action by learning a class-agnostic attention map, and an action branch recognizes the action category of the frame by learning a class-specific attention map. By aggregating the two attention maps to model the joint domain-class distribution of frames, our EGA-Net can handle varying backgrounds. To train the class-agnostic attention map with only the video-level class labels, we propose an Entropy Guided Loss (EGL), which employs entropy as the supervision signal to distinguish action and background. Moreover, we propose a Global Similarity Loss (GSL) to enhance the action-specific attention map via action class centers. Extensive experiments on the THUMOS14, ActivityNet1.2 and ActivityNet1.3 datasets demonstrate the effectiveness of our EGA-Net.
URI: https://hdl.handle.net/10356/164107
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2022.108718
Schools: School of Computer Science and Engineering 
Organisations: Institute for Infocomm Research, A*STAR
Centre for Frontier AI Research, A*STAR
Rights: © 2022 Elsevier Ltd. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: SCSE Journal Articles
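
Illustrative note on the abstract: this record carries no full text and gives no formulas, so the following is only a rough sketch of the general idea the abstract describes, not the authors' EGA-Net, EGL, or GSL. It assumes per-frame class logits of shape (T, C) and per-frame domain logits of shape (T,). All function names, the 0.5 threshold, and the product used to combine the two branches are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def normalized_entropy(class_logits):
    # Per-frame entropy of the class distribution, scaled to [0, 1]
    # by dividing by log(C). Low values mean a confident class prediction.
    probs = softmax(class_logits, axis=-1)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    return entropy / np.log(class_logits.shape[-1])

def entropy_pseudo_labels(class_logits, threshold=0.5):
    # ASSUMPTION: frames with low class entropy are treated as action (1),
    # frames with high entropy as background (0). The threshold is hypothetical.
    return (normalized_entropy(class_logits) < threshold).astype(np.float32)

def joint_scores(class_logits, domain_logits):
    # ASSUMPTION: combine a class-agnostic score p(action) with a
    # class-specific distribution p(class | action) by multiplication.
    p_action = 1.0 / (1.0 + np.exp(-domain_logits))  # shape (T,)
    p_class = softmax(class_logits, axis=-1)         # shape (T, C)
    return p_action[:, None] * p_class               # shape (T, C)

# Two frames, three classes: one confident frame, one uniform frame.
class_logits = np.array([[10.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
domain_logits = np.array([0.0, 0.0])
labels = entropy_pseudo_labels(class_logits)
scores = joint_scores(class_logits, domain_logits)
```

In this toy input the confident frame gets a low normalized entropy and the uniform frame gets an entropy close to 1, which is the kind of contrast an entropy-based supervision signal could exploit when only video-level labels are available.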

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.