Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/146884
Title: Stacked attention networks for referring expressions comprehension
Authors: Li, Yugang; Sun, Haibo; Chen, Zhe; Ding, Yudan; Zhou, Siqi
Keywords: Engineering::Computer science and engineering
Issue Date: 2020
Source: Li, Y., Sun, H., Chen, Z., Ding, Y. & Zhou, S. (2020). Stacked attention networks for referring expressions comprehension. Computers, Materials and Continua, 65(3), 2529-2541. https://dx.doi.org/10.32604/cmc.2020.011886
Journal: Computers, Materials and Continua
Abstract: Referring expressions comprehension is the task of locating the image region described by a natural language expression, which refers to the properties of the region or to its relationships with other regions. Most previous work handles this problem by selecting the most relevant regions from a set of candidate regions; when the set contains many candidate regions, these methods are inefficient. Inspired by the recent success of deep learning methods in image captioning, in this paper we propose a framework that understands referring expressions through multiple steps of reasoning. We present a model for referring expressions comprehension that selects the most relevant region directly from the image. The core of our model is a recurrent attention network, which can be seen as an extension of the Memory Network. The proposed model is capable of improving its results through multiple computational hops. We evaluate the proposed model on two referring expression datasets: Visual Genome and Flickr30k Entities. The experimental results demonstrate that the proposed model outperforms previous state-of-the-art methods in both accuracy and efficiency. We also conduct an ablation experiment, which shows that the performance of the model does not improve as the number of attention layers increases.
URI: https://hdl.handle.net/10356/146884
ISSN: 1546-2218
DOI: 10.32604/cmc.2020.011886
Schools: School of Electrical and Electronic Engineering 
Rights: © 2020 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: EEE Journal Articles

Files in This Item:
File: TSP_CMC_40185.pdf
Size: 792.06 kB
Format: Adobe PDF

Scopus citations: 1 (updated on Mar 16, 2025)
Web of Science citations: 1 (updated on Oct 29, 2023)
Page view(s): 282 (updated on Mar 17, 2025)
Download(s): 91 (updated on Mar 17, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.