Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/156034
Title: | Cycle-consistent inverse GAN for text-to-image synthesis |
Authors: | Wang, Hao; Lin, Guosheng; Hoi, Steven C. H.; Miao, Chunyan |
Keywords: | Engineering::Computer science and engineering |
Issue Date: | 2021 |
Source: | Wang, H., Lin, G., Hoi, S. C. H. & Miao, C. (2021). Cycle-consistent inverse GAN for text-to-image synthesis. 29th ACM International Conference on Multimedia (MM '21), 630-638. https://dx.doi.org/10.1145/3474085.3475226 |
Project: | AISG-GC-2019-003; NRF-NRFI05-2019-0002; MOH/NIC/COG04/2017; MOH/NIC/HAIG03/2017; RG28/18 (S); RG22/19 (S) |
Conference: | 29th ACM International Conference on Multimedia (MM '21) |
Abstract: | This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly take the textual descriptions as the conditional input for the GAN generation, and need to train different models for the text-guided image generation and manipulation tasks. In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GAN) for both text-to-image generation and text-guided image manipulation tasks. Specifically, we first train a GAN model without text input, aiming to generate images with high diversity and quality. Then we learn a GAN inversion model to convert the images back to the GAN latent space and obtain the inverted latent codes for each image, where we introduce the cycle-consistency training to learn more robust and consistent inverted latent codes. We further uncover the semantics of the latent space of the trained GAN model, by learning a similarity model between text representations and the latent codes. In the text-guided optimization module, we can generate images with the desired semantic attributes through optimization on the inverted latent codes. Extensive experiments on the Recipe1M and CUB datasets validate the efficacy of our proposed framework. |
URI: | https://hdl.handle.net/10356/156034 |
ISBN: | 9781450386517 |
DOI: | 10.1145/3474085.3475226 |
Schools: | School of Computer Science and Engineering |
Research Centres: | Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly (LILY) |
Rights: | © 2021 Association for Computing Machinery. All rights reserved. This paper was published in Proceedings of the 29th ACM International Conference on Multimedia (MM '21) and is made available with permission of Association for Computing Machinery. |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Conference Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Cycle-Consistent Inverse GAN for Text-to-Image Synthesis.pdf | | 3.24 MB | Adobe PDF | View/Open |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.