Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/162629
Full metadata record
DC Field | Value | Language
dc.contributor.author | Yang, Xu | en_US
dc.contributor.author | Zhang, Hanwang | en_US
dc.contributor.author | Cai, Jianfei | en_US
dc.date.accessioned | 2022-11-01T06:51:01Z | -
dc.date.available | 2022-11-01T06:51:01Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Yang, X., Zhang, H. & Cai, J. (2021). Deconfounded image captioning: a causal retrospect. IEEE Transactions On Pattern Analysis and Machine Intelligence, 3121705-. https://dx.doi.org/10.1109/TPAMI.2021.3121705 | en_US
dc.identifier.issn | 0162-8828 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/162629 | -
dc.description.abstract | Dataset bias in vision-language tasks is becoming one of the main problems which hinders the progress of our community. Existing solutions lack a principled analysis about why modern image captioners easily collapse into dataset bias. In this paper, we present a novel perspective: Deconfounded Image Captioning (DIC), to find out the answer of this question, then retrospect modern neural image captioners, and finally propose a DIC framework: DICv1.0 to alleviate the negative effects brought by dataset bias. DIC is based on causal inference, whose two principles: the backdoor and front-door adjustments, help us review previous studies and design new effective models. In particular, we showcase that DICv1.0 can strengthen two prevailing captioning models and can achieve a single-model 131.1 CIDEr-D and 128.4 c40 CIDEr-D on Karpathy split and online split of the challenging MS COCO dataset, respectively. Interestingly, DICv1.0 is a natural derivation from our causal retrospect, which opens promising directions for image captioning. | en_US
dc.language.iso | en | en_US
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | en_US
dc.rights | © 2021 IEEE. All rights reserved. | en_US
dc.subject | Engineering::Computer science and engineering | en_US
dc.title | Deconfounded image captioning: a causal retrospect | en_US
dc.type | Journal Article | en
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.identifier.doi | 10.1109/TPAMI.2021.3121705 | -
dc.identifier.pmid | 34673483 | -
dc.identifier.scopus | 2-s2.0-85123727842 | -
dc.identifier.spage | 3121705 | en_US
dc.subject.keywords | Image Captioning | en_US
dc.subject.keywords | Causality | en_US
item.grantfulltext | none | -
item.fulltext | No Fulltext | -
Appears in Collections: SCSE Journal Articles

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.