Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/182918
Title: Mitigating style-image hallucination in large vision language models
Authors: He, Guoshun
Keywords: Engineering
Issue Date: 2025
Publisher: Nanyang Technological University
Source: He, G. (2025). Mitigating style-image hallucination in large vision language models. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182918
Abstract: LLMs are widely applied across various domains, yet a significant challenge remains: their performance deteriorates sharply in out-of-domain scenarios, often leading to increased hallucinations. Despite its importance, this phenomenon has received limited attention in academic research. To address this, we first construct a benchmark dataset using style transfer techniques and employ it to evaluate the out-of-domain performance of several popular large-scale models. Building upon these findings, we introduce CopeCap, a lightweight image captioning model that leverages collaborative prompting to achieve strong out-of-domain performance without requiring additional training.
URI: https://hdl.handle.net/10356/182918
Schools: School of Electrical and Electronic Engineering 
Fulltext Permission: embargo_restricted_20270311
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: NTU_EEE_MSc_Dissertation_He Guoshun(1).pdf
Description: Until 2027-03-11
Size: 6.09 MB
Format: Adobe PDF
Under embargo until Mar 11, 2027

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.