|Title:||Combined articulatory and auditory processing for improved speech recognition|
|Authors:||Huang, Guangpu; Er, Meng Joo|
|Keywords:||DRNTU::Engineering::Electrical and electronic engineering|
|Issue Date:||2011|
|Abstract:||In this paper, we examine the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We introduce an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech, and we report the performance of the combined auditory and articulatory processing methods in inversion and recognition experiments. First, at the front end, the auditory-based bark-frequency cepstral coefficient (BFCC) obtained equivalent or higher accuracy compared to the mel-frequency cepstral coefficient (MFCC). Second, the use of APFs significantly altered the phoneme error patterns compared to purely acoustic features, and the APFs displayed advantages over the canonical pseudo-articulatory features (PAFs), which are manually derived from phonological rules. These observations support our view that the combined use of auditory and articulatory cues is beneficial for speech pattern classification. The proposed neural-based API model qualifies as a competitive candidate for profound phoneme recognition, with salient features such as generality and portability.|
|URI:||https://hdl.handle.net/10356/98873|
|DOI:||10.1109/ICIEA.2012.6360864|
|Fulltext Permission:||none|
|Fulltext Availability:||No Fulltext|
|Appears in Collections:||EEE Conference Papers|
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.