Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/82372
Full metadata record
DC Field | Value | Language
dc.contributor.author | Xiao, Xiong | en
dc.contributor.author | Zhao, Shengkui | en
dc.contributor.author | Nguyen, Duc Hoang Ha | en
dc.contributor.author | Zhong, Xionghu | en
dc.contributor.author | Jones, Douglas L. | en
dc.contributor.author | Chng, Eng Siong | en
dc.contributor.author | Li, Haizhou | en
dc.date.accessioned | 2016-02-03T08:22:27Z | en
dc.date.accessioned | 2019-12-06T14:54:21Z | -
dc.date.available | 2016-02-03T08:22:27Z | en
dc.date.available | 2019-12-06T14:54:21Z | -
dc.date.issued | 2016 | en
dc.identifier.citation | Xiao, X., Zhao, S., Nguyen, D. H. H., Zhong, X., Jones, D. L., Chng, E. S., et al. (2016). Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation. EURASIP Journal on Advances in Signal Processing, 2016, 4-. | en
dc.identifier.issn | 1687-6172 | en
dc.identifier.uri | https://hdl.handle.net/10356/82372 | -
dc.description.abstract | This paper investigates deep neural networks (DNN) based on nonlinear feature mapping and statistical linear feature adaptation approaches for reducing reverberation in speech signals. In the nonlinear feature mapping approach, a DNN is trained from a parallel clean/distorted speech corpus to map reverberant and noisy speech coefficients (such as the log magnitude spectrum) to the underlying clean speech coefficients. The constraint imposed by dynamic features (i.e., the time derivatives of the speech coefficients) is used to enhance the smoothness of predicted coefficient trajectories in two ways. One is to obtain the enhanced speech coefficients with a least squares estimation from the coefficients and dynamic features predicted by the DNN. The other is to incorporate the constraint of dynamic features directly into the DNN training process using a sequential cost function. In the linear feature adaptation approach, a sparse linear transform, called the cross transform, is used to transform multiple frames of speech coefficients to a new feature space. The transform is estimated to maximize the likelihood of the transformed coefficients given a model of clean speech coefficients. Unlike the DNN approach, no parallel corpus is used and no assumption on distortion types is made. The two approaches are evaluated on the REVERB Challenge 2014 tasks. Both speech enhancement and automatic speech recognition (ASR) results show that the DNN-based mappings significantly reduce the reverberation in speech and improve both speech quality and ASR performance. For the speech enhancement task, the proposed dynamic feature constraint helps to improve the cepstral distance, frequency-weighted segmental signal-to-noise ratio (SNR), and log likelihood ratio metrics while moderately degrading the speech-to-reverberation modulation energy ratio. In addition, the cross transform feature adaptation improves the ASR performance significantly for clean-condition trained acoustic models. | en
dc.format.extent | 18 p. | en
dc.language.iso | en | en
dc.relation.ispartofseries | EURASIP Journal on Advances in Signal Processing | en
dc.rights | © 2016 Xiao et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. | en
dc.subject | Speech enhancement | en
dc.subject | Deep neural networks | en
dc.subject | Dynamic features | en
dc.subject | Feature adaptation | en
dc.subject | Robust speech recognition | en
dc.subject | Reverberation challenge | en
dc.subject | Beamforming | en
dc.title | Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation | en
dc.type | Journal Article | en
dc.contributor.school | School of Computer Engineering | en
dc.contributor.research | Temasek Laboratories | en
dc.identifier.doi | 10.1186/s13634-015-0300-4 | en
dc.description.version | Published version | en
item.grantfulltext | open | -
item.fulltext | With Fulltext | -
Appears in Collections: SCSE Journal Articles
TL Journal Articles

Scopus citations: 35 (updated on Jul 16, 2024)
Web of Science citations: 31 (updated on Oct 26, 2023)
Page views: 706 (updated on Jul 23, 2024)
Downloads: 251 (updated on Jul 23, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.