Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/98873
Title: Combined articulatory and auditory processing for improved speech recognition
Authors: Huang, Guangpu
Er, Meng Joo
Keywords: DRNTU::Engineering::Electrical and electronic engineering
Issue Date: 2011
Abstract: In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech, and we reported the performance of the combined auditory and articulatory processing methods in inversion and recognition experiments. First, at the front end, the auditory-based Bark-frequency cepstral coefficients (BFCCs) obtained equivalent or higher accuracy compared with the mel-frequency cepstral coefficients (MFCCs). Second, the use of APFs significantly altered the phoneme error patterns compared with purely acoustic features, and the APFs displayed advantages over the canonical pseudo-articulatory features (PAFs), which are manually derived from phonological rules. These observations support our view that the combined use of auditory and articulatory cues is beneficial for speech pattern classification. The proposed neural-based API model qualifies as a competitive candidate for profound phoneme recognition, with salient features such as generality and portability.
URI: https://hdl.handle.net/10356/98873
http://hdl.handle.net/10220/12782
DOI: http://dx.doi.org/10.1109/ICIEA.2012.6360864
Fulltext: No Fulltext
Appears in Collections: EEE Conference Papers

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.