Title: Multi-domain anime image generation and editing
Authors: Aravind S/O Sivakumaran
Keywords: Engineering::Computer science and engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Aravind S/O Sivakumaran (2022). Multi-domain anime image generation and editing. Final Year Project (FYP), Nanyang Technological University, Singapore.
Project: SCSE21-0664
Abstract: Generative models for text-to-image and image-to-image synthesis have been very successful to date. Notable examples include OpenAI's DALL-E 2 and Google's Imagen and Parti. However, these state-of-the-art (SOTA) diffusion models are hard to train, and fine-tuning them requires resources that many may not have, unlike GANs. GANs also have a faster inference process than diffusion models and could integrate better into production workflows with tight deadlines. Therefore, we propose training GAN models using our end-to-end framework while extending existing GAN networks to multiple domains to enable integration into existing training workflows. We aim to introduce text-to-image multimodal generation for existing StyleGAN2 networks that can be used for editing, while allowing extension to different style domains using StyleGAN-NADA. Additionally, as part of our model editing workflow, the outputs of existing StyleGAN2 networks can be passed to a diffusion model such as Stable Diffusion for image-to-image translation for editing purposes. Finally, we can explore slimming down the StyleGAN2 network for faster inference on edge devices, as StyleGAN2 is computationally intensive for edge devices to handle. Keywords: Anime, StyleGAN, Generative Adversarial Networks, Image-to-Image Translation, Text-to-Image Translation, Image Editing, Model Compression, Multi-Domain, Diffusion Models, CLIP
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Restricted Access
Size: 4.52 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.