Title: Analysis and application of speech adaptation on avatar system
Authors: Luo, Fei
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2019
Abstract: Schizophrenia is a chronic and severe mental disorder that affects an increasing number of people worldwide. The clinical diagnosis and assessment of mentally ill patients are subjective, which creates a significant need to train new psychiatrists in a more objective way. We therefore aim to create a virtual robot with schizophrenic symptoms that gives a more objective overview of schizophrenic patients and can be used to coach psychiatrists on having more productive interactions with patients with schizophrenia. The speech, movement, facial expression, posture and memory of the current virtual robot need to be improved. In this dissertation, I focused on analyzing speech adaptation features from recordings of clinical interviews and then built a pipeline to implement speech adaptation on the avatar platform. We have audio recordings of 75 interviews, of which 50 are between psychiatrists and schizophrenic patients and 25 are between psychiatrists and healthy individuals. Three low-level speech features, namely pitch, speech rate and loudness, are extracted from both the participant channel and the psychiatrist channel. I then used the Granger causality test (GCT) to test whether the participants' speech is influenced by the psychiatrists' voice, and applied a Gaussian Mixture Model (GMM) to estimate the distributions of pitch, speech rate and loudness for schizophrenic patients and for healthy individuals respectively. Next, I built a schizophrenic model and a healthy model that change the pitch, speech rate and loudness settings of the text-to-speech engine on the virtual human platform. After the implementation, the virtual human is able to dynamically adapt her speech in pitch, speech rate and loudness based on the previous conversation.
In addition, a multilayer perceptron (MLP) neural network is discussed in this dissertation as one way to solve this kind of input-output fitting problem with a neural network.
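As an illustration of the Granger causality test named in the abstract, the sketch below checks whether one feature series helps predict another. It is not the dissertation's implementation: the lag order of 1, the synthetic pitch-like series, and all function and variable names are assumptions made for this example, since the abstract does not state the lag order, the windowing, or the software used.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(x, y):
    """F statistic for the hypothesis 'x Granger-causes y' at lag 1 (assumed lag)."""
    targets = y[1:]
    # Restricted model: y[t] explained by an intercept and its own past only.
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: the past of x is added as a regressor.
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = rss(restricted, targets)
    rss_u = rss(unrestricted, targets)
    n = len(targets)
    # One restriction is tested; the unrestricted model has 3 parameters.
    return (rss_r - rss_u) / (rss_u / (n - 3))


# Synthetic data (hypothetical): the participant series follows the
# psychiatrist series with a one-step delay plus a small amount of noise.
rng = random.Random(0)
psychiatrist = [rng.gauss(0.0, 1.0) for _ in range(200)]
participant = [0.0]
for t in range(1, 200):
    participant.append(0.8 * psychiatrist[t - 1] + 0.2 * rng.gauss(0.0, 1.0))

f_forward = granger_f(psychiatrist, participant)  # psychiatrist -> participant
f_reverse = granger_f(participant, psychiatrist)  # participant -> psychiatrist
print(f_forward, f_reverse)
```

A large F statistic in one direction and a small one in the other suggests that the first series carries predictive information about the second. A full test would compare the statistic against the F distribution to obtain a p-value.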
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: LUO FEI_Dissertation_final.pdf (Restricted Access)
Description: Main article
Size: 2.32 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.