Title: Visual signal coding and quality evaluation
Authors: Liu, Anmin
Keywords: DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2011
Source: Liu, A. M. (2011). Visual signal coding and quality evaluation. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Visual signal (i.e., image and video) coding compresses digital visual data to as small a size as possible, so as to make use of limited network bandwidth and to allow compact storage, by exploiting various kinds of data redundancy. It exploits the redundancy in the signal itself (statistical redundancy, i.e., spatial-temporal redundancy and spectral/color redundancy). Since the human visual system (HVS) is the ultimate receiver and appreciator of most processed visual signals, the redundancy due to the properties of human vision (i.e., perceptual/psycho-visual redundancy) should also be considered in the course of coding. The effectiveness of image and video coding methods is traditionally evaluated by their rate-distortion (RD) performance, where rate is the number of bits required for the compressed visual signal (or a variant such as bits per pixel (bpp) or bits per second) and distortion is usually measured as peak signal-to-noise ratio (PSNR). However, PSNR has been found not to always agree with human judgment, and the measurement of perceptual distortion is therefore an active research area. Firstly, in this work, we discuss the statistical redundancy of video and then propose a novel optimal compression plane (OCP) based video coding scheme. As a data structure, video is nothing more than a three-dimensional data matrix, and the distinction among X (a spatial dimension), Y (the other spatial dimension), and T (the temporal dimension) is not absolutely necessary. We ignore the physical meaning of the X, Y, and T axes during the video coding process; frames are allowed to be formed in the TX (or TY) plane rather than the traditional XY plane to exploit the redundancy more effectively, and better coding performance is therefore achieved.
Secondly, the model reflecting the masking characteristics of the HVS is studied, as it is fundamental to exploiting perceptual redundancy and to measuring visual distortion (quality). Just noticeable difference (JND) accounts for various masking effects of the HVS. We improve the pixel-domain JND model through better contrast masking (CM) evaluation and by appropriately accounting for the difference in CM between textural and edge regions. We also investigate the application of perceptual models (i.e., the visual attention model and the JND model) to adaptive-sampling-based low-bit-rate image coding and to JND-based histogram adjustment for visually lossless image coding. Lastly, an effective and efficient metric for visual quality/distortion evaluation is proposed. The metric is based on the similarity between the gradient profiles of the reference and distorted signals, which accounts for both the high-level premise of the HVS (i.e., high sensitivity to image edges and structure) and the masking property. The new metric is simple to compute and highly accurate (verified with extensive cross-database tests); it is robust to various distortion types and can easily be embedded in coding systems (as well as other visual signal processing algorithms).
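Illustration (not part of the thesis record): the abstract's idea of treating a video as a three-dimensional matrix and forming frames in the TX or TY plane can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. The function names (`reslice`, `interframe_energy`, `psnr`), the nested-list layout `video[t][y][x]`, and the use of mean squared inter-frame difference as a crude stand-in for "redundancy" are all hypothetical choices made here for illustration; the thesis's actual OCP selection criterion and codec are not described in the abstract and are not reproduced.

```python
import math

def reslice(video, plane):
    """Re-slice a video stored as video[t][y][x] into frames lying in `plane`.

    "XY": frames indexed by t, each frame has Y rows and X columns (usual layout).
    "TX": frames indexed by y, each frame has T rows and X columns.
    "TY": frames indexed by x, each frame has T rows and Y columns.
    """
    T, Y, X = len(video), len(video[0]), len(video[0][0])
    if plane == "XY":
        return [[row[:] for row in frame] for frame in video]
    if plane == "TX":
        return [[video[t][y][:] for t in range(T)] for y in range(Y)]
    if plane == "TY":
        return [[[video[t][y][x] for y in range(Y)] for t in range(T)]
                for x in range(X)]
    raise ValueError("plane must be 'XY', 'TX' or 'TY'")

def interframe_energy(frames):
    """Mean squared difference between consecutive frames.

    Hypothetical proxy for how well inter-frame prediction might work;
    it is NOT the selection rule used in the thesis.
    """
    total, count = 0, 0
    for a, b in zip(frames, frames[1:]):
        for row_a, row_b in zip(a, b):
            for pa, pb in zip(row_a, row_b):
                total += (pa - pb) ** 2
                count += 1
    return total / count if count else 0.0

def psnr(ref, dist, peak=255):
    """PSNR in dB between two equally sized 2-D frames (lists of rows)."""
    n = sum(len(row) for row in ref)
    mse = sum((a - b) ** 2
              for row_a, row_b in zip(ref, dist)
              for a, b in zip(row_a, row_b)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy video (T=4, Y=3, X=5) whose content does not change along Y.
video = [[[10 * t + x for x in range(5)] for y in range(3)] for t in range(4)]

for plane in ("XY", "TX", "TY"):
    print(plane, interframe_energy(reslice(video, plane)))
# XY 100.0
# TX 0.0
# TY 1.0
```

In this toy case the frames formed in the TX plane are identical to one another, so the proxy reports no inter-frame change there, while the conventional XY frames differ the most. Real content will behave differently, and a real coder would compare actual rate-distortion cost rather than this simple measure.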
DOI: 10.32657/10356/47587
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: TsceG0701825G.pdf
Description: Main article
Size: 3.98 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.