Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/142314
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhang, Mingxing | en_US
dc.contributor.author | Yang, Yang | en_US
dc.contributor.author | Zhang, Hanwang | en_US
dc.contributor.author | Ji, Yanli | en_US
dc.contributor.author | Shen, Heng Tao | en_US
dc.contributor.author | Chua, Tat-Seng | en_US
dc.date.accessioned | 2020-06-19T02:54:56Z | -
dc.date.available | 2020-06-19T02:54:56Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Zhang, M., Yang, Y., Zhang, H., Ji, Y., Shen, H. T., & Chua, T.-S. (2019). More is better : precise and detailed image captioning using online positive recall and missing concepts mining. IEEE Transactions on Image Processing, 28(1), 32-44. doi:10.1109/TIP.2018.2855415 | en_US
dc.identifier.issn | 1057-7149 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/142314 | -
dc.description.abstract | Recently, a great progress in automatic image captioning has been achieved by using semantic concepts detected from the image. However, we argue that existing concepts-to-caption framework, in which the concept detector is trained using the image-caption pairs to minimize the vocabulary discrepancy, suffers from the deficiency of insufficient concepts. The reasons are two-fold: 1) the extreme imbalance between the number of occurrence positive and negative samples of the concept and 2) the incomplete labeling in training captions caused by the biased annotation and usage of synonyms. In this paper, we propose a method, termed online positive recall and missing concepts mining, to overcome those problems. Our method adaptively re-weights the loss of different samples according to their predictions for online positive recall and uses a two-stage optimization strategy for missing concepts mining. In this way, more semantic concepts can be detected and a high accuracy will be expected. On the caption generation stage, we explore an element-wise selection process to automatically choose the most suitable concepts at each time step. Thus, our method can generate more precise and detailed caption to describe the image. We conduct extensive experiments on the MSCOCO image captioning data set and the MSCOCO online test server, which shows that our method achieves superior image captioning performance compared with other competitive methods. | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | IEEE Transactions on Image Processing | en_US
dc.rights | © 2018 IEEE. All rights reserved. | en_US
dc.subject | Engineering::Computer science and engineering | en_US
dc.title | More is better : precise and detailed image captioning using online positive recall and missing concepts mining | en_US
dc.type | Journal Article | en
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.identifier.doi | 10.1109/TIP.2018.2855415 | -
dc.identifier.pmid | 30010565 | -
dc.identifier.scopus | 2-s2.0-85049964023 | -
dc.identifier.issue | 1 | en_US
dc.identifier.volume | 28 | en_US
dc.identifier.spage | 32 | en_US
dc.identifier.epage | 44 | en_US
dc.subject.keywords | Precise and Detailed Image Captioning | en_US
dc.subject.keywords | Semantic Concepts | en_US
item.fulltext | No Fulltext | -
item.grantfulltext | none | -
Appears in Collections: SCSE Journal Articles

Scopus citations: 63 (updated on Mar 23, 2024)
Web of Science citations: 53 (updated on Oct 26, 2023)
Page views: 187 (updated on Mar 28, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.