Title: The NNI Vietnamese speech recognition system for MediaEval 2016
Authors: Xiao, Xiong
Nwe, Tin Lay
Chng, Eng Siong
Ma, Bin
Li, Haizhou
Wang, Lei
Ni, Chongjia
Leung, Cheung-Chi
You, Changhuai
Xie, Lei
Xu, Haihua
Keywords: Vietnamese
DRNTU::Engineering::Computer science and engineering
Issue Date: 2016
Source: Wang, L., Ni, C., Leung, C.-C., You, C., Xie, L., Xu, H., . . . Li, H. (2016). The NNI Vietnamese speech recognition system for MediaEval 2016. Multimedia Benchmark Workshop, 1739.
Conference: Multimedia Benchmark Workshop
Abstract: This paper provides an overall description of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three subsystems and adopted several deep neural network-based techniques, including fMLLR-transformed bottleneck features and sequence training. Besides the acoustic modeling techniques, speech data augmentation was also examined to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. The web text crawled from the Internet was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.
Schools: School of Computer Science and Engineering 
Rights: © 2016 The Author(s).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: The NNI Vietnamese Speech Recognition System.pdf
Size: 182.16 kB
Format: Adobe PDF



Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.