Title: An exemplar-based approach to frequency warping for voice conversion
Authors: Tian, Xiaohai
Lee, Siu Wa
Wu, Zhizheng
Chng, Eng Siong
Li, Haizhou
Keywords: Exemplar
DRNTU::Engineering::Computer science and engineering
Voice Conversion
Issue Date: 2017
Source: Tian, X., Lee, S. W., Wu, Z., Chng, E. S., & Li, H. (2017). An exemplar-based approach to frequency warping for voice conversion. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(10), 1863-1876. doi:10.1109/TASLP.2017.2723721
Series/Report no.: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Abstract: The task of voice conversion is to modify a source speaker’s voice so that it sounds like that of a target speaker. A conversion method is considered successful when the produced speech sounds natural and similar to the target speaker. This paper presents a new voice conversion framework that combines frequency warping with an exemplar-based method. Our method maintains high-resolution details during conversion by applying frequency warping directly to the high-resolution spectrum to represent the target. The warping function is generated by sparse interpolation from a dictionary of exemplar warping functions. As the generated warping function depends only on a very small set of exemplars, we do away with the statistical averaging effects inherited from Gaussian mixture models (GMMs). To compensate for the conversion error, we also incorporate residual exemplars into the conversion process. Both objective and subjective evaluations on the VOICES database validated the effectiveness of the proposed voice conversion framework. We observed a significant improvement in speech quality over the state-of-the-art parametric methods.
ISSN: 2329-9290
Rights: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [].
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

Files in This Item:
File                 Description    Size       Format
FINAL VERSION.pdf                   2.61 MB    Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.