Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/89639
Title: | Spoofing speech detection using temporal convolutional neural network |
Authors: | Xiao, Xiong; Li, Haizhou; Tian, Xiaohai; Chng, Eng Siong |
Keywords: | DRNTU::Engineering::Computer science and engineering; Convolutional Neural Network (CNN); Speech Detection |
Issue Date: | 2016 |
Source: | Tian, X., Xiao, X., Chng, E. S., & Li, H. (2016). Spoofing speech detection using temporal convolutional neural network. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). doi:10.1109/APSIPA.2016.7820738 |
Conference: | 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) |
Abstract: | Spoofing speech detection aims to differentiate spoofing speech from natural speech. Most previous work uses frame-based features. Although multiple frames or dynamic features are used to form a super-vector to represent the temporal information, the time span covered by these features is not sufficient. Most of the systems failed to detect the non-vocoder or unit-selection-based spoofing attacks. In this work, we propose to use a temporal convolutional neural network (CNN) based classifier for spoofing speech detection. The temporal CNN first convolves the feature trajectories with a set of filters, then extracts the maximum responses of these filters within a time window using a max-pooling layer. Due to the use of max-pooling, we can extract useful information from a long temporal span without concatenating a large number of neighbouring frames, as in a feedforward deep neural network (DNN). Five types of features are employed to assess the performance of the proposed classifier. Experimental results on the ASVspoof 2015 corpus show that the temporal CNN based classifier is effective for synthetic speech detection. Specifically, the proposed method brings a significant performance boost for unit-selection-based spoofing speech detection. |
URI: | https://hdl.handle.net/10356/89639; http://hdl.handle.net/10220/47064 |
DOI: | 10.1109/APSIPA.2016.7820738 |
Schools: | School of Computer Science and Engineering |
Research Centres: | NTU-UBC Research Centre of Excellence in Active Living for the Elderly; Temasek Laboratories |
Rights: | © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/APSIPA.2016.7820738]. |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Conference Papers |
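
The abstract describes the temporal CNN in two steps: filters are convolved with the feature trajectories, then a max-pooling layer keeps the maximum response of each filter within a time window. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The function name `temporal_conv_maxpool`, the filter layout (each filter spans all feature dimensions), and the use of non-overlapping pooling windows are assumptions made for illustration; the record does not specify them.

```python
def temporal_conv_maxpool(features, filters, pool_size):
    """Temporal filtering followed by max-pooling over time (illustrative sketch).

    features:  list of T frames, each a list of D feature values.
    filters:   list of K filters, each a list of W rows of D weights
               (assumed layout: every filter spans all D dimensions).
    pool_size: number of consecutive filter responses pooled together
               (assumed non-overlapping windows).
    Returns a list of pooled vectors, each of length K.
    """
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    n_frames = len(features)
    dim = len(features[0])
    width = len(filters[0])
    n_steps = n_frames - width + 1  # valid positions along the time axis
    if n_steps < pool_size:
        raise ValueError("not enough frames for one pooling window")

    # Step 1: slide each filter along time (cross-correlation, as is usual in CNNs).
    responses = []
    for t in range(n_steps):
        row = []
        for filt in filters:
            acc = 0.0
            for w in range(width):
                for d in range(dim):
                    acc += filt[w][d] * features[t + w][d]
            row.append(acc)
        responses.append(row)

    # Step 2: keep the maximum response of each filter within each time window.
    pooled = []
    for start in range(0, n_steps - pool_size + 1, pool_size):
        block = responses[start:start + pool_size]
        pooled.append([max(r[k] for r in block) for k in range(len(filters))])
    return pooled


if __name__ == "__main__":
    # Six frames of two-dimensional toy features and one all-ones filter of width 2.
    toy_features = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11]]
    toy_filters = [[[1, 1], [1, 1]]]
    print(temporal_conv_maxpool(toy_features, toy_filters, pool_size=2))
    # prints [[14.0], [30.0]]
```

With a filter width of W frames and a pooling window of P responses, each pooled value summarises W + P - 1 input frames, which illustrates how pooling can cover a long temporal span without concatenating many neighbouring frames into one input vector.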
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
APSIPA_CNN_ASV.pdf | | 800.84 kB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.