Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/184037
Title: | Ethnic diversity in consistent text-to-image generation |
Authors: | Mishra, Apurva |
Keywords: | Computer and Information Science |
Issue Date: | 2025 |
Publisher: | Nanyang Technological University |
Source: | Mishra, A. (2025). Ethnic diversity in consistent text-to-image generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184037 |
Abstract: | Current generative AI models can create a wide variety of high-quality images but still face challenges in personalized and consistent visual story generation. This study evaluates the training-free ConsiStory approach for generating image sets with and without ethnic diversity modifiers in the text prompts. It does so by comparing two datasets curated for this study: a Benchmark Dataset without diversity keywords and a Diverse Dataset with ethnic groups specified in the prompts. The two datasets contain 100 corresponding sets of prompts, along with the images generated from them. DreamSim was used to evaluate subject consistency within each set of images, while CLIPScore measured how well each image aligns with its text prompt. Quantitative experiments tested whether there is a statistically significant difference between the performance of corresponding sets in the Diverse and Benchmark Datasets. Furthermore, a qualitative evaluation was conducted to investigate the composition of the generated image sets in relation to the prompts, yielding observations that can act as heuristics for downstream tasks. The findings contribute to advancing practical and socially aware text-to-image generative AI applications. |
URI: | https://hdl.handle.net/10356/184037 |
Schools: | College of Computing and Data Science |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | CCDS Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Mishra_Apurva_FYP.pdf (Restricted Access) | | 2.03 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.