Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/137906
Title: Debiasing visual question and answering with answer preference
Authors: Zhang, Xinye
Keywords: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2020
Publisher: Nanyang Technological University
Project: SCSE19-0193
Abstract: Visual Question Answering (VQA) requires a model to generate a reasonable answer given an image and a corresponding question. This demands strong reasoning over two kinds of input features, namely image and question. However, most state-of-the-art results rely heavily on superficial correlations in the dataset, since delicately balancing the dataset is almost impossible. In this paper, we propose a simple method that uses answer preference to reduce the impact of data bias and improve the robustness of VQA models against prior changes. Two pipelines for using answer preference, one at the training stage and one at the inference stage, are evaluated and achieve genuine improvement on the VQA-CP dataset, which is designed to test the performance of VQA models under domain shift.
URI: https://hdl.handle.net/10356/137906
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP_REPORT_ZHANG_XINYE (1).pdf
Description: Restricted Access
Size: 3.45 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.