Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/153220
Title: Sound event detection with human and emergency sounds
Authors: Lee, Yan Zhen
Keywords: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Lee, Y. Z. (2021). Sound event detection with human and emergency sounds. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/153220
Project: SCSE20-1118
Abstract: Sound Event Detection (SED) is the task of recognizing the sound events and their respective onset and offset timestamps in an audio clip. This thesis explores a variety of models and techniques in order to develop an effective SED system. This includes investigating the impact of different audio feature types, data augmentation techniques, network architectures and automatic threshold optimisation on the performance of the system. Additionally, this thesis proposes frame-wise prediction pre-processing and post-processing methods, in order to address the issues with existing SED systems and develop a system that is able to analyse clips with long audio durations. Unlike previous works, which use standard datasets, such as those from the Detection and Classification of Acoustic Scenes and Events (DCASE) challenges, as the development dataset, a novel dataset consisting of human and emergency sounds extracted from AudioSet is used in this project. As the dataset is novel, there is no state-of-the-art baseline available for comparison. As such, the dataset of the DCASE 2017 Task 4 is used to compare the performance of our best-performing models, which are determined based on the project dataset, with the state-of-the-art performance. From our experiments, we successfully developed a well-performing SED system for our novel dataset, with the systems using our proposed prediction processing method consistently outperforming the ones that do not. Additionally, by using the knowledge we gained from our experiments with our novel project dataset, we developed a system which outperforms the previous state-of-the-art model for the DCASE 2017 Task 4 Challenge.
URI: https://hdl.handle.net/10356/153220
Schools: School of Computer Science and Engineering 
Organisations: DSO National Laboratories
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Lee_Yan_Zhen_FYP.pdf
Description: Restricted Access
Size: 2.76 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.