Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/175157
Title: Instruction-guided image editing empowered by large language models
Authors: Wang, Yiying
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Wang, Y. (2024). Instruction-guided image editing empowered by large language models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175157
Abstract: This final year project focuses on developing a compositional framework that enables a user to edit user-provided photos using natural language instructions. The proposed approach avoids a resource-demanding training process by leveraging the impressive reasoning ability of large language models (LLMs) as well as off-the-shelf visual models, which have demonstrated remarkable zero-shot performance in diverse scenarios. Because the framework is highly modularized, its functionality is expected to be extended in the future as cutting-edge computer vision models advance. The experimental results show that the framework is able to produce pleasing outcomes. Furthermore, a web demo provides a straightforward and user-friendly graphical interface, enhancing the framework’s interactivity.
URI: https://hdl.handle.net/10356/175157
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: NTU_SCSE_FYP_WANG_YIYING.pdf
Description: Restricted Access
Size: 19.21 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.