Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/151341
Full metadata record
DC Field | Value | Language
dc.contributor.author | Xu, Yuecong | en_US
dc.contributor.author | Yang, Jianfei | en_US
dc.contributor.author | Mao, Kezhi | en_US
dc.date.accessioned | 2021-07-09T01:29:56Z | -
dc.date.available | 2021-07-09T01:29:56Z | -
dc.date.issued | 2019 | -
dc.identifier.citation | Xu, Y., Yang, J. & Mao, K. (2019). Semantic-filtered Soft-Split-Aware video captioning with audio-augmented feature. Neurocomputing, 357, 24-35. https://dx.doi.org/10.1016/j.neucom.2019.05.027 | en_US
dc.identifier.issn | 0925-2312 | en_US
dc.identifier.other | 0000-0002-8075-0439 | -
dc.identifier.uri | https://hdl.handle.net/10356/151341 | -
dc.description.abstract | Automatic video description, or video captioning, is a challenging yet attractive task that aims to combine video with text. Many neural-network-based methods have been proposed, typically using Convolutional Neural Networks (CNNs) to extract features and Recurrent Neural Networks (RNNs) to encode and decode videos into descriptions. Earlier video captioning methods were often motivated by image captioning approaches. However, videos carry much more information than images, which makes the video captioning task harder. Current methods commonly fail to exploit the additional information videos provide, especially their semantic and structural information. To address this shortcoming, we propose a Semantic-Filtered Soft-Split-Aware-Gated LSTM (SF-SSAG-LSTM) model that improves captioning quality by combining semantic concepts with an audio-augmented feature extracted from the input video, while understanding the video's underlying structure. In experiments, we quantitatively evaluate our model, which matches other prominent methods on three benchmark datasets. We also qualitatively examine its output and show that the generated descriptions are more detailed and logical. | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | Neurocomputing | en_US
dc.rights | © 2019 Elsevier B.V. All rights reserved. | en_US
dc.subject | Engineering::Electrical and electronic engineering | en_US
dc.title | Semantic-filtered Soft-Split-Aware video captioning with audio-augmented feature | en_US
dc.type | Journal Article | en
dc.contributor.school | School of Electrical and Electronic Engineering | en_US
dc.identifier.doi | 10.1016/j.neucom.2019.05.027 | -
dc.identifier.scopus | 2-s2.0-85065823631 | -
dc.identifier.volume | 357 | en_US
dc.identifier.spage | 24 | en_US
dc.identifier.epage | 35 | en_US
dc.subject.keywords | Video Captioning | en_US
dc.subject.keywords | Long Short-term Memory | en_US
item.fulltext | No Fulltext | -
item.grantfulltext | none | -
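The abstract describes a gated LSTM decoder that filters semantic concepts into the recurrent state alongside video features. As a rough illustration of that gating idea only, here is a minimal pure-Python sketch of one LSTM step with an extra semantic-filter gate; the dimensions, weights, and gate layout are made up for demonstration and are not the paper's exact SF-SSAG-LSTM formulation.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def lstm_step_with_semantic_gate(x, h, c, s, W):
    """One LSTM step augmented with a gate that filters a semantic
    concept vector s into the cell update. W maps each gate name to a
    list of weight rows over the concatenated [x, h, s] input.
    Illustrative sketch only, not the paper's exact equations."""
    z = x + h + s                               # concatenate inputs
    i = [sigmoid(dot(w, z)) for w in W["i"]]    # input gate
    f = [sigmoid(dot(w, z)) for w in W["f"]]    # forget gate
    o = [sigmoid(dot(w, z)) for w in W["o"]]    # output gate
    g = [math.tanh(dot(w, z)) for w in W["g"]]  # candidate cell state
    m = [sigmoid(dot(w, z)) for w in W["m"]]    # semantic filter gate
    # standard LSTM cell update plus the semantically gated concepts
    c_new = [f_k * c_k + i_k * g_k + m_k * s_k
             for f_k, c_k, i_k, g_k, m_k, s_k in zip(f, c, i, g, m, s)]
    h_new = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_new)]
    return h_new, c_new

# Toy setup: video feature, hidden state, and semantic vector all length 3.
random.seed(0)
D = 3
W = {gate: [[random.uniform(-0.5, 0.5) for _ in range(3 * D)]
            for _ in range(D)] for gate in "ifogm"}
x = [0.2, -0.1, 0.4]   # hypothetical video feature at one time step
h = [0.0] * D          # previous hidden state
c = [0.0] * D          # previous cell state
s = [0.9, 0.1, 0.0]    # hypothetical detected semantic concept scores
h1, c1 = lstm_step_with_semantic_gate(x, h, c, s, W)
```

The point of the sketch is only the shape of the computation: the semantic vector enters both the gate inputs and, scaled by its own learned gate, the cell update, so concepts detected in the video can directly influence word generation.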
Appears in Collections: EEE Journal Articles

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.