Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/98409
Title: Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech
Authors: Xiao, Xiong
Chng, Eng Siong
Li, Haizhou
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2012
Source: Xiao, X., Chng, E. S., & Li, H. (2012). Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4325-4328.
Conference: IEEE International Conference on Acoustics, Speech and Signal Processing (2012 : Kyoto, Japan)
Abstract: In this paper, we propose a framework for joint normalization of spectral and temporal statistics of speech features for robust speech recognition. Current feature normalization approaches normalize the spectral and temporal aspects of feature statistics separately to overcome noise and reverberation. As a result, the interaction between spectral normalization (e.g., mean and variance normalization, MVN) and temporal normalization (e.g., temporal structure normalization, TSN) is ignored. We propose a joint spectral and temporal normalization (JSTN) framework to simultaneously normalize these two aspects of feature statistics. In JSTN, feature trajectories are filtered by linear filters, and the filters' coefficients are optimized by maximizing a likelihood-based objective function. Experimental results on the Aurora-5 benchmark task show that JSTN consistently outperforms the cascade of MVN and TSN on test data corrupted by both additive noise and reverberation, which validates our proposal. Specifically, JSTN reduces the average word error rate by 8-9% relative to the cascade of MVN and TSN for both artificial and real noisy data.
URI: https://hdl.handle.net/10356/98409
http://hdl.handle.net/10220/13398
DOI: 10.1109/ICASSP.2012.6288876
Schools: School of Computer Engineering 
Research Centres: Temasek Laboratories 
Rights: © 2012 IEEE.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: TL Conference Papers

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.