Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/89598
Title: | Correlation-based frequency warping for voice conversion |
Authors: | Tian, Xiaohai; Wu, Zhizheng; Lee, Siu-Wa; Chng, Eng Siong |
Keywords: | DRNTU::Engineering::Computer science and engineering; Speech Synthesis; Voice Conversion |
Issue Date: | 2014 |
Source: | Tian, X., Wu, Z., Lee, S.-W., & Chng, E. S. (2014). Correlation-based frequency warping for voice conversion. The 9th International Symposium on Chinese Spoken Language Processing, 211-215. doi:10.1109/ISCSLP.2014.6936725 |
Conference: | The 9th International Symposium on Chinese Spoken Language Processing |
Abstract: | Frequency warping (FW) based voice conversion aims to modify the frequency axis of source spectra towards that of the target. In previous works, the optimal warping function was calculated by minimizing the spectral distance of converted and target spectra without considering the spectral shape. Nevertheless, speaker timbre and identity greatly depend on vocal tract peaks and valleys of spectrum. In this paper, we propose a method to define the warping function by maximizing the correlation between the converted and target spectra. Different from the conventional warping methods, the correlation-based optimization is not determined by the magnitude of the spectra. Instead, both spectral peaks and valleys are considered in the optimization process, which also improves the performance of amplitude scaling. Experiments were conducted on VOICES database, and the results show that after amplitude scaling our proposed method reduced the mel-spectral distortion from 5.85 dB to 5.60 dB. The subjective listening tests also confirmed the effectiveness of the proposed method. |
URI: | https://hdl.handle.net/10356/89598; http://hdl.handle.net/10220/47053 |
DOI: | 10.1109/ISCSLP.2014.6936725 |
Schools: | School of Computer Science and Engineering |
Research Centres: | NTU-UBC Research Centre of Excellence in Active Living for the Elderly |
Rights: | © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/ISCSLP.2014.6936725]. |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Conference Papers |
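The abstract's central idea, choosing a warping function by maximizing the correlation between the warped source spectrum and the target spectrum, can be illustrated with a small toy sketch. This is not the authors' algorithm: the one-parameter warp `f ** alpha`, the grid search, the synthetic spectra, and all function names (`warp_spectrum`, `pearson`, `best_warp`) are assumptions made purely for illustration. The record does not describe the paper's actual warping family or optimization procedure.

```python
import numpy as np


def warp_spectrum(spectrum, alpha):
    """Resample a spectrum along a warped frequency axis.

    Hypothetical one-parameter warp (an assumption, not from the paper):
    the output bin at normalized frequency f reads the input at f ** alpha,
    so alpha = 1 is the identity.
    """
    freqs = np.linspace(0.0, 1.0, len(spectrum))
    return np.interp(freqs ** alpha, freqs, spectrum)


def pearson(a, b):
    """Pearson correlation of two equal-length 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def best_warp(source, target, alphas):
    """Grid-search the warp parameter that maximizes correlation."""
    scores = [pearson(warp_spectrum(source, a), target) for a in alphas]
    i = int(np.argmax(scores))
    return float(alphas[i]), scores[i]


# Synthetic two-peak "spectrum" standing in for a vocal tract envelope.
freqs = np.linspace(0.0, 1.0, 257)
source = (np.exp(-((freqs - 0.3) / 0.05) ** 2)
          + 0.5 * np.exp(-((freqs - 0.7) / 0.08) ** 2))

# Target: the source warped with alpha = 1.2, then given a different gain
# and offset to mimic a magnitude mismatch between speakers.
target = 3.0 * warp_spectrum(source, 1.2) + 2.0

alphas = np.linspace(0.7, 1.4, 15)  # candidate warps in steps of 0.05
alpha_hat, score = best_warp(source, target, alphas)
print(alpha_hat, score)
```

Because Pearson correlation is unchanged by a positive gain and a constant offset, the search in this toy setup recovers the warp even though the target magnitudes differ from the source. That is one way to read the abstract's remark that the correlation-based optimization is not determined by the magnitude of the spectra. Amplitude scaling would then be a separate step applied after warping.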
Files in This Item:
File | Description | Size | Format
---|---|---|---
2014_CFW_Xiaohai.pdf | | 193.96 kB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.