On scalable to lossless audio coding
Date of Issue: 2008
School of Electrical and Electronic Engineering
A*STAR Institute for Infocomm Research
With advances in broadband network and storage technologies, more and more digital audio applications are ready to deliver lossless audio at high sampling rates and high resolution. On the other hand, some applications, such as those found in wireless communications, require highly compressed audio. To meet these differing demands, a scalable audio coding technology that supports both lossy and lossless compression is desirable. MPEG-4 audio scalable lossless (SLS) coding was published as an international standard in June 2006. It allows a perceptually coded audio signal to be scaled up to fully lossless audio, with a wide range of intermediate bitrate representations. The main technologies adopted in SLS include the latest integer transform, namely the integer modified discrete cosine transform (IntMDCT), and a new entropy coder based on the bit-plane Golomb code (BPGC). As a relatively new coding structure, SLS still leaves room for improvement. Several critical issues directly affect the wide adoption and application of the SLS codec, and this dissertation aims to address them. Firstly, the effect of the rounding errors introduced by IntMDCT in the perceptual (lossy) audio coding scenario is studied. Based on intensive test results, it is concluded that the MDCT and IntMDCT filterbanks are interchangeable in a lossy coding scenario. This finding justifies the use of the low-complexity SLS structure. Secondly, perceptually enhanced prioritized bit-plane audio coding algorithms are proposed for the non-core and low-core-bitrate modes of SLS, based on the energy distributions in different frequency regions. By using only a single bit in each frame to indicate which of the two coding models is used, considerable perceptual quality enhancement is achieved over a wide range of bitrates.
Thirdly, efficient bit allocation schemes for stereo channels are proposed for both the SLS encoder and the truncator. By allocating bits according to the energy level, the proposed algorithm achieves a significant improvement in quality for signals (such as speech) whose left and right channels are highly correlated. Lastly, a "smart" function is designed for SLS. Given a low-quality audio format and its original input, the proposed smart enhancing process enables a scalable encoder to automatically encode the minimum amount of enhancement needed for the low-quality audio to attain a "transparent quality" equivalent to CD quality. This function facilitates the application of SLS in multi-quality online music sales. With these proposed solutions, the MPEG-4 SLS coder is enhanced, resulting in much better perceptual quality and more robust features. Users can benefit from the universality of this codec, as well as from its excellent performance in terms of both quality and compression. Finally, several interesting topics in scalable lossless coding are recommended for future research.
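As an illustration of the rounding errors mentioned in the abstract, the following minimal Python sketch shows the general lifting idea on which integer transforms such as IntMDCT are commonly built: a plane rotation is split into three lifting steps, and each step is rounded to an integer. The result is exactly invertible, yet it differs slightly from the floating-point rotation. This is not the normative SLS transform; the function names `lift_rotate` and `lift_rotate_inverse` and the chosen angle are assumptions made only for this example.

```python
import math

def lift_rotate(x, y, theta):
    """Integer approximation of a plane rotation by theta using three
    rounded lifting steps. theta must not be a multiple of pi."""
    a = (math.cos(theta) - 1.0) / math.sin(theta)
    b = math.sin(theta)
    x = x + round(a * y)   # lifting step 1
    y = y + round(b * x)   # lifting step 2
    x = x + round(a * y)   # lifting step 3
    return x, y

def lift_rotate_inverse(x, y, theta):
    """Exact inverse: undo the three lifting steps in reverse order,
    subtracting the same rounded values that were added."""
    a = (math.cos(theta) - 1.0) / math.sin(theta)
    b = math.sin(theta)
    x = x - round(a * y)   # undo step 3
    y = y - round(b * x)   # undo step 2
    x = x - round(a * y)   # undo step 1
    return x, y

if __name__ == "__main__":
    theta = 0.3            # hypothetical rotation angle
    x, y = 1000, -250      # hypothetical integer samples
    u, v = lift_rotate(x, y, theta)
    exact_u = x * math.cos(theta) - y * math.sin(theta)
    exact_v = x * math.sin(theta) + y * math.cos(theta)
    print("integer rotation:", (u, v))
    print("rounding error  :", (u - exact_u, v - exact_v))
    print("reconstructed   :", lift_rotate_inverse(u, v, theta))
```

Because each lifting step changes only one value by a rounded function of the other, the inverse can recompute and subtract exactly the same integer, which is what makes lossless reconstruction possible. The small gap between the integer and floating-point outputs is the kind of rounding error whose perceptual effect the first study examines.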
DRNTU::Engineering::Computer science and engineering::Data::Coding and information theory