Title: Aiding therapy using speech emotion recognition
Authors: Koh, En Rong
Keywords: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Koh, E. R. (2021). Aiding therapy using speech emotion recognition. Final Year Project (FYP), Nanyang Technological University, Singapore.
Project: CZ4079
Abstract: Over the past 20 years, mental health has gained visibility within society. The stigma surrounding mental illness is declining thanks to increasing awareness and encouragement through social media and digital platforms. The number of psychologists and therapists has also grown in recent years. Not only has the mental health industry seen an increase in patients and counsellors, but the integration of technology into this field is also visible in present-day mental health care. It has had a significant impact in aiding individuals deprived during this period. Research on artificial intelligence has also improved the quality of therapy, bringing it closer to people who are struggling and allowing it to be delivered virtually. Nonetheless, these applications need to be carefully designed and balanced against their limitations, depending on the mental illness in question. While different kinds of AI, such as therapy chatbots and virtual therapists, have been assisting in the mental health field, AI systems commonly lack the ability to recognize human emotions, especially from speech. Speech Emotion Recognition (SER) has become a research topic across a wide range of applications and a challenge in speech processing. This project experiments with an AI SER system that uses Deep Learning techniques as an alternative to traditional methods such as Support Vector Machines or Hidden Markov Models. We explore the use of a Convolutional Neural Network (CNN), a type of Deep Learning method, to train on and predict human emotions. We also examine the different types of time-frequency features in audio signal processing and how they help in classifying human emotion. An SER system with a visual modality will also be developed to test real-time prediction.
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item: 2.09 MB, Adobe PDF (Restricted Access)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.