Please use this identifier to cite or link to this item:
|Title:||Decomposing generation networks with structure prediction for recipe generation|
|Authors:||Wang, Hao; Lin, G.; Hoi, Steven C. H.; Miao, C.|
|Keywords:||Engineering::Computer science and engineering|
|Issue Date:||2022|
|Source:||Wang, H., Lin, G., Hoi, S. C. H. & Miao, C. (2022). Decomposing generation networks with structure prediction for recipe generation. Pattern Recognition, 126, 108578. https://dx.doi.org/10.1016/j.patcog.2022.108578|
|Project:||AISG-GC-2019-003|
|Journal:||Pattern Recognition|
|Abstract:||Recipe generation from food images and ingredients is a challenging task, which requires the interpretation of the information from another modality. Different from the image captioning task, where the captions usually have one sentence, cooking instructions contain multiple sentences and have obvious structures. To help the model capture the recipe structure and avoid missing some cooking details, we propose a novel framework: Decomposing Generation Networks (DGN) with structure prediction, to get more structured and complete recipe generation outputs. Specifically, we split each cooking instruction into several phases, and assign different sub-generators to each phase. Our approach includes two novel ideas: (i) learning the recipe structures with the global structure prediction component and (ii) producing recipe phases in the sub-generator output component based on the predicted structure. Extensive experiments on the challenging large-scale Recipe1M dataset validate the effectiveness of our proposed model, which improves the performance over the state-of-the-art results.|
|URI:||https://hdl.handle.net/10356/156089|
|ISSN:||0031-3203|
|DOI:||10.1016/j.patcog.2022.108578|
|Schools:||School of Computer Science and Engineering|
|Research Centres:||Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly (LILY)|
|Rights:||© 2022 Elsevier Ltd. All rights reserved. This paper was published in Pattern Recognition and is made available with permission of Elsevier Ltd.|
|Fulltext Permission:||embargo_20240707|
|Fulltext Availability:||With Fulltext|
|Appears in Collections:||SCSE Journal Articles|
Files in This Item:
|3.72 MB||Adobe PDF||Under embargo until Jul 07, 2024|
Updated on Nov 30, 2023
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.