Title: The NNI Vietnamese speech recognition system for MediaEval 2016
Authors: Xiao, Xiong
Nwe, Tin Lay
Chng, Eng Siong
Ma, Bin
Li, Haizhou
Wang, Lei
Ni, Chongjia
Leung, Cheung-Chi
You, Changhuai
Xie, Lei
Xu, Haihua
Keywords: Vietnamese
DRNTU::Engineering::Computer science and engineering
Issue Date: 2016
Source: Wang, L., Ni, C., Leung, C.-C., You, C., Xie, L., Xu, H., . . . Li, H. (2016). The NNI Vietnamese speech recognition system for MediaEval 2016. Multimedia Benchmark Workshop, 1739.
Abstract: This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three subsystems and adopted several deep neural network-based techniques, including fMLLR-transformed bottleneck features and sequence training. Besides these acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.
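Note on the metric: the abstract reports results as token error rate (TER). The sketch below shows one common way such a rate is computed, as the token-level edit distance between a reference and a hypothesis transcript divided by the number of reference tokens. It assumes whitespace tokenization and a percentage scale; the function name `token_error_rate` is hypothetical, and the official MediaEval scoring tool may tokenize or normalize text differently.

```python
def token_error_rate(reference: str, hypothesis: str) -> float:
    """Token-level edit distance divided by reference length, as a percentage.

    Assumption: tokens are whitespace-separated; no text normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the reference prefix seen so far
    # (excluding the current token) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference tokens is missing from the hypothesis.
    print(token_error_rate("xin chào các bạn", "xin chào bạn"))
```

Under these assumptions a lower value is better, and the rate can exceed 100 when the hypothesis contains many inserted tokens.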
Rights: © 2016 The Author(s).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: The NNI Vietnamese Speech Recognition System.pdf
Size: 182.16 kB
Format: Adobe PDF
