Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/184037
Title: Ethnic diversity in consistent text-to-image generation
Authors: Mishra, Apurva
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Mishra, A. (2025). Ethnic diversity in consistent text-to-image generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184037
Abstract: Current generative AI models produce a wide variety of high-quality images but still face challenges in personalized and consistent visual story generation. This study evaluates the training-free ConsiStory approach for generating image sets with and without ethnic diversity modifiers in the text prompts. It does so by comparing two datasets curated for this study: a Benchmark Dataset without diversity keywords and a Diverse Dataset with ethnic groups specified in the prompts. The datasets contain 100 corresponding sets of prompts, along with the images generated from them. DreamSim was used to evaluate subject consistency within each set of images, while CLIPScore measured how well the images aligned with the text prompts. Quantitative experiments tested whether there is a statistically significant difference in performance between corresponding sets in the Diverse and Benchmark Datasets. Furthermore, a qualitative evaluation investigated the composition of the generated image sets in relation to the prompts, the results of which can act as heuristics for downstream tasks. The findings contribute to advancing practical and socially aware text-to-image generative AI applications.
URI: https://hdl.handle.net/10356/184037
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Mishra_Apurva_FYP.pdf (Restricted Access)
Size: 2.03 MB
Format: Adobe PDF

Page view(s): 25 (updated on May 7, 2025)
Download(s): 4 (updated on May 7, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.