Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/89601
Title: System fusion for high-performance voice conversion
Authors: Tian, Xiaohai
Wu, Zhizheng
Lee, Siu Wa
Nguyen, Quy Hy
Dong, Minghui
Chng, Eng Siong
Keywords: Voice Conversion
System Fusion
Engineering::Computer science and engineering
Issue Date: 2015
Source: Tian, X., Wu, Z., Lee, S. W., Nguyen, Q. H., Dong, M., & Chng, E. S. (2015). System fusion for high-performance voice conversion. Proc. International Conference on Spoken Language Processing (ICSLP), Interspeech 2015 Proceedings, 2759-2763.
Conference: Proc. International Conference on Spoken Language Processing (ICSLP), Interspeech 2015 Proceedings
Abstract: Recently, a number of voice conversion methods have been developed. These methods attempt to improve conversion performance by using diverse mapping techniques in various acoustic domains, e.g. high-resolution spectra and low-resolution Mel-cepstral coefficients. Each individual method has its own pros and cons. In this paper, we introduce a system fusion framework, which leverages and synergizes the merits of these state-of-the-art and even potential future conversion methods. For instance, methods delivering high speech quality are fused with methods capturing speaker characteristics, bringing another level of performance gain. To examine the feasibility of the proposed framework, we select two state-of-the-art methods, Gaussian mixture model and frequency warping based systems, as a case study. Experimental results reveal that the fusion system outperforms each individual method in both objective and subjective evaluation, and demonstrate the effectiveness of the proposed fusion framework.
URI: https://hdl.handle.net/10356/89601
http://hdl.handle.net/10220/49144
Schools: School of Computer Science and Engineering 
Organisations: Joint NTU-UBC Research Center of Excellence in Active Living for the Elderly
Rights: © 2015 International Speech Communication Association (ISCA). All rights reserved. This paper was published in Proc. International Conference on Spoken Language Processing (ICSLP), Interspeech 2015 Proceedings and is made available with permission of International Speech Communication Association (ISCA).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: System Fusion for High-Performance Voice Conversion.pdf
Size: 145.9 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.