Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/178460
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chen, Chaofeng | en_US
dc.contributor.author | Zhou, Shangchen | en_US
dc.contributor.author | Liao, Liang | en_US
dc.contributor.author | Wu, Haoning | en_US
dc.contributor.author | Sun, Wenxiu | en_US
dc.contributor.author | Yan, Qiong | en_US
dc.contributor.author | Lin, Weisi | en_US
dc.date.accessioned | 2024-06-21T02:04:13Z | -
dc.date.available | 2024-06-21T02:04:13Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | Chen, C., Zhou, S., Liao, L., Wu, H., Sun, W., Yan, Q. & Lin, W. (2024). Iterative token evaluation and refinement for real-world super-resolution. 38th AAAI Conference on Artificial Intelligence (2024), 38, 1010-1018. https://dx.doi.org/10.1609/aaai.v38i2.27861 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/178460 | -
dc.description.abstract | Real-world image super-resolution (RWSR) is a longstanding problem, as low-quality (LQ) images often have complex and unidentified degradations. Existing methods such as Generative Adversarial Networks (GANs) and continuous diffusion models present their own issues: GANs are difficult to train, while continuous diffusion models require numerous inference steps. In this paper, we propose an Iterative Token Evaluation and Refinement (ITER) framework for RWSR, which utilizes a discrete diffusion model operating in the discrete token representation space, i.e., indexes of features extracted from a VQGAN codebook pre-trained with high-quality (HQ) images. We show that ITER is easier to train than GANs and more efficient than continuous diffusion models. Specifically, we divide RWSR into two sub-tasks, i.e., distortion removal and texture generation. Distortion removal involves simple HQ token prediction with LQ images, while texture generation uses a discrete diffusion model to iteratively refine the distortion removal output with a token refinement network. In particular, we propose to include a token evaluation network in the discrete diffusion process. It learns to evaluate which tokens are good restorations and helps to improve the iterative refinement results. Moreover, the evaluation network can first check the status of the distortion removal output and then adaptively select the total number of refinement steps needed, thereby maintaining a good balance between distortion removal and texture generation. Extensive experimental results show that ITER is easy to train and performs well within just 8 iterative steps. | en_US
dc.language.iso | en | en_US
dc.rights | © 2024 Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. | en_US
dc.subject | Computer and Information Science | en_US
dc.title | Iterative token evaluation and refinement for real-world super-resolution | en_US
dc.type | Conference Paper | en
dc.contributor.school | College of Computing and Data Science | en_US
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.contributor.conference | 38th AAAI Conference on Artificial Intelligence (2024) | en_US
dc.contributor.research | S-Lab | en_US
dc.identifier.doi | 10.1609/aaai.v38i2.27861 | -
dc.identifier.scopus | 2-s2.0-85189536364 | -
dc.identifier.url | https://ojs.aaai.org/index.php/AAAI/article/view/27861 | -
dc.identifier.volume | 38 | en_US
dc.identifier.spage | 1010 | en_US
dc.identifier.epage | 1018 | en_US
dc.subject.keywords | Computational photography | en_US
dc.subject.keywords | Image & video synthesis | en_US
dc.citation.conferencelocation | Vancouver, Canada | en_US
dc.description.acknowledgement | This study is supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s). | en_US
item.grantfulltext | none | -
item.fulltext | No Fulltext | -
Appears in Collections: CCDS Conference Papers

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.