Please use this identifier to cite or link to this item:
Title: Video frame synthesis via plug-and-play deep locally temporal embedding
Authors: Nguyen, Anh-Duc
Kim, Woojae
Kim, Jongyoo
Lin, Weisi
Lee, Sanghoon
Keywords: Engineering::Computer science and engineering
Issue Date: 2019
Source: Nguyen, A.-D., Kim, W., Kim, J., Lin, W., & Lee, S. (2019). Video frame synthesis via plug-and-play deep locally temporal embedding. IEEE Access, 7, 179304-179319. doi:10.1109/ACCESS.2019.2959019
Journal: IEEE Access
Abstract: We propose a generative framework that tackles video frame interpolation. Conventionally, optical flow methods can solve the problem, but their perceptual quality depends on the accuracy of flow estimation. Nevertheless, traditional methods have the merit of remarkable generalization ability. Recently, deep convolutional neural networks (CNNs) have achieved good performance at the price of computation. However, to deploy a CNN, it is necessary to train it on a large-scale dataset beforehand, not to mention the subsequent fine-tuning and adaptation. Also, despite their sharp motion results, their perceptual quality does not correlate well with their performance on pixel-to-pixel difference metrics, owing to the various artifacts created by erroneous warping. In this paper, we take advantage of both conventional and deep-learning models and tackle the problem from a different perspective. The framework, which we call deep locally temporal embedding (DeepLTE), is powered by a deep CNN and can be used instantly, like conventional models. DeepLTE fits an auto-encoding CNN to several consecutive frames and imposes constraints on the latent representations so that new frames can be generated by interpolating new latent codes. Unlike the current deep-learning paradigm, which requires training on large datasets, DeepLTE works in a plug-and-play and unsupervised manner and can generate an arbitrary number of frames from multiple given consecutive frames. We demonstrate that, without bells and whistles, DeepLTE outperforms existing state-of-the-art models in terms of perceptual quality.
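The abstract describes fitting an auto-encoder to a handful of consecutive frames and generating in-between frames by interpolating latent codes. The sketch below illustrates that idea on toy data only: a linear auto-encoder (truncated SVD/PCA) stands in for the paper's deep CNN, and the synthetic frame generator, the latent dimension `k`, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def frame(shift):
    """Synthetic 8x8 'frame': a bright block translating horizontally."""
    img = np.zeros((8, 8))
    img[2:5, 1 + shift:4 + shift] = 1.0
    return img.ravel()

# Three given consecutive frames at times t = 0, 1, 2.
frames = np.stack([frame(s) for s in (0, 2, 4)])

# Linear "encoder"/"decoder" fitted only to these frames (no pretraining,
# mirroring the plug-and-play setting described in the abstract).
mean = frames.mean(axis=0)
_, _, Vt = np.linalg.svd(frames - mean, full_matrices=False)
k = 2  # latent dimension (illustrative choice)

def encode(x):
    return (x - mean) @ Vt[:k].T  # latent code of shape (k,)

def decode(z):
    return z @ Vt[:k] + mean      # back to pixel space

# Interpolate a new latent code halfway between t = 0 and t = 2,
# then decode it into a synthesized middle frame.
z_mid = 0.5 * (encode(frames[0]) + encode(frames[2]))
synth = decode(z_mid).reshape(8, 8)
```

Because the three centered frames span at most a two-dimensional subspace, this toy auto-encoder reconstructs the given frames exactly; in DeepLTE the CNN plays the same role with a nonlinear embedding and explicit constraints on the latent codes.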
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2959019
Rights: © 2019 IEEE. This journal is 100% open access, which means that all content is freely available without charge to users or their institutions. All articles accepted after 12 June 2019 are published under a CC BY 4.0 license, and the author retains copyright. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, as long as proper attribution is given.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Journal Articles

Files in This Item:
File: 08931794.pdf (4.17 MB, Adobe PDF)

Page view(s)

Updated on Jan 23, 2022






Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.