Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/89598
Title: Correlation-based frequency warping for voice conversion
Authors: Tian, Xiaohai
Wu, Zhizheng
Lee, Siu-Wa
Chng, Eng Siong
Keywords: DRNTU::Engineering::Computer science and engineering
Speech Synthesis
Voice Conversion
Issue Date: 2014
Source: Tian, X., Wu, Z., Lee, S.-W., & Chng, E. S. (2014). Correlation-based frequency warping for voice conversion. The 9th International Symposium on Chinese Spoken Language Processing, 211-215. doi:10.1109/ISCSLP.2014.6936725
Conference: The 9th International Symposium on Chinese Spoken Language Processing
Abstract: Frequency warping (FW) based voice conversion aims to modify the frequency axis of source spectra towards that of the target. In previous work, the optimal warping function was calculated by minimizing the spectral distance between the converted and target spectra, without considering the spectral shape. However, speaker timbre and identity depend greatly on the vocal tract peaks and valleys of the spectrum. In this paper, we propose a method that defines the warping function by maximizing the correlation between the converted and target spectra. Unlike conventional warping methods, the correlation-based optimization is not determined by the magnitude of the spectra. Instead, both spectral peaks and valleys are considered in the optimization process, which also improves the performance of amplitude scaling. Experiments were conducted on the VOICES database, and the results show that, after amplitude scaling, our proposed method reduced the mel-spectral distortion from 5.85 dB to 5.60 dB. Subjective listening tests also confirmed the effectiveness of the proposed method.
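The idea in the abstract, choosing a warping function by maximizing correlation rather than minimizing distance, can be illustrated with a toy sketch. This record does not describe the paper's actual warping function or optimization procedure, so everything below is an assumption made for illustration: a single-parameter linear warp, a grid search over that parameter, and the hypothetical names `warp_spectrum`, `pearson`, and `best_alpha`.

```python
import numpy as np

def warp_spectrum(spec, alpha):
    """Resample a spectrum along a linearly scaled frequency axis.

    Output bin k takes the input value at position k / alpha (clamped to
    the last bin), so alpha > 1 moves spectral peaks to higher bins.
    This linear warp is an assumption, not the paper's warping function.
    """
    bins = np.arange(len(spec), dtype=float)
    src_pos = np.clip(bins / alpha, 0, len(spec) - 1)
    return np.interp(src_pos, bins, spec)

def pearson(a, b):
    """Pearson correlation: unchanged by positive scaling or offset of a or b."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_alpha(source, target, alphas):
    """Grid-search the warp factor that maximizes correlation with the target."""
    scores = [pearson(warp_spectrum(source, a), target) for a in alphas]
    return float(alphas[int(np.argmax(scores))])

# Synthetic one-peak "spectra": the target is the source stretched by 1.2,
# then scaled and offset, so its magnitude differs but its shape does not.
bins = np.arange(128)
source = np.exp(-0.5 * ((bins - 40) / 6.0) ** 2)
target = 3.0 * np.exp(-0.5 * ((bins - 48) / 7.2) ** 2) + 2.0

alphas = np.linspace(0.8, 1.3, 51)
alpha = best_alpha(source, target, alphas)
print(f"selected warp factor: {alpha:.2f}")
```

In this toy setup the search should settle near a warp factor of 1.2 even though the target has a different gain and offset, because the correlation score depends on where the peaks and valleys fall rather than on spectral magnitude. A distance-based criterion applied to the same data would also be influenced by the gain and offset mismatch.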
URI: https://hdl.handle.net/10356/89598
http://hdl.handle.net/10220/47053
DOI: 10.1109/ISCSLP.2014.6936725
Schools: School of Computer Science and Engineering 
Research Centres: NTU-UBC Research Centre of Excellence in Active Living for the Elderly 
Rights: © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/ISCSLP.2014.6936725].
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: 2014_CFW_Xiaohai.pdf
Size: 193.96 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.