Title: Cognitive-inspired speaker affective state profiling
Authors: Norhaslinda Kamaruddin
Keywords: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
DRNTU::Engineering::Computer science and engineering::Computer applications::Social and behavioral sciences
Issue Date: 2012
Source: Kamaruddin, N. (2012). Cognitive-inspired speaker affective state profiling. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Human behavior is influenced by emotion, and humans express their affective state through numerous channels: non-verbal communication, namely facial expressions, gestures, eye gaze, and body posture, as well as verbal communication. Verbal communication itself carries a great deal of underlying information, transmitted through acoustic features and the semantic meaning of the words and sentences used. Despite the evident complexity of such interaction, a listener can still correctly perceive the emotion conveyed by the interlocutor. This is due to the human cognitive ability to dissect and infer the information with high accuracy and then react with appropriate behavioral responses and feedback. Hence, this research work introduces a novel technique for discriminating emotion to facilitate the understanding of speaker affective state, based on the hypothesis that emotion is propagated through speech and can be quantified. Speech emotion is a growing multi-disciplinary research field that is gaining momentum due to the increased need to improve the quality of human-computer interaction. Numerous researchers apply various feature extraction methods coupled with classifiers to produce acceptable accuracy. Nonetheless, the performance of such systems is bound to cultural influence, which results in unpromising outcomes once speech influenced by an unknown culture is introduced. Culture is always regarded as a trivial and inconsequential parameter that needs minimal consideration in speech emotion recognition. Hence, in this work, the intricate relationship of cultural influence, in terms of intra-cultural and inter-cultural effects, is studied in detail.
Two speech emotion datasets, NTU_American and NTU_Asian, representing the American and Asian cultural influence on speech emotion respectively, were collected and, together with the standard Berlin speech emotion dataset, were used to understand the speech emotion recognition system and the culture bias. The work is then extended to investigate speaker affective state profiling using the Valence-Arousal (VA) analysis approach, which enables a visualization tool to be used for intra-cultural and inter-cultural assessments. The strength of the VA approach is that it facilitates the observation of new findings and supports dynamic, data-driven generation of an affective space model that can empirically verify the affective space model agreed upon by psychologists. The proposed approach is developed to complement discrete-class classification systems, which are rigid and lack explainable components. The results show huge potential for future practical applications of such an analysis system, which enables researchers, engineers, scientists, psychologists, medical practitioners, and intelligent system developers to visualize emotions from a common viewpoint.
DOI: 10.32657/10356/51057
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: TsceG0602865F.pdf
Size: 11.16 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.