Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/163693
Title: Surveillance of sound environment by machine learning
Authors: Yu, Xiang
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Yu, X. (2022). Surveillance of sound environment by machine learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163693
Project: P3012-211
Abstract: Environmental sound recognition and classification is an important topic in the field of sound event study. Computers can be used to simulate the hearing function of the human ear, recognizing transient sound signals and assigning them corresponding category labels. Environmental sounds carry a great deal of key information, and acoustic scene classification and sound event detection are core technologies for computing and analyzing natural acoustic scenes; they are essential for modern applications such as smart robots, airport noise monitoring, autonomous driving, and intelligent public security surveillance. At present, ambient sound recognition poses many challenges. On the one hand, unlike speech and music, ambient sound has complex and changeable frequency-domain features and time-domain structures, especially in scenes with multiple sound events. In the frequency domain, a sound may show distinct peaks in the spectrum, such as an impact sound, or its energy may be distributed across the whole spectrum, like wind or noise. In the time domain, a sound can be transient, continuous, or intermittent. It is therefore both important and challenging to design a sound recognition system that accounts for the varied features of environmental sounds, and making a computer perceive and understand an acoustic scene as the human ear does is a research hotspot in audio signal processing. On the other hand, open-source datasets of environmental sound events are very limited, so making effective use of limited data to build an accurate and reliable model is also important. Using the spectrogram, a sound signal can be visualized and quantified through a time-frequency analysis of the magnitude spectrum in a 2D plane. This poses a challenge for sound event classification, as spectral amplitudes alone are not sufficient to distinguish sound classes.
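The time-frequency analysis mentioned above can be illustrated with a short sketch. This is not code from the report; it uses SciPy's standard `spectrogram` routine on a synthetic test tone (the 1 kHz signal, sampling rate, and window sizes are illustrative assumptions) to show how a 1D sound signal becomes a 2D magnitude spectrum over time and frequency.

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic example signal: a 1 kHz tone plus noise (illustration only)
fs = 16000                           # assumed sampling rate, Hz
t = np.arange(fs) / fs               # 1 second of samples
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(fs)

# Time-frequency analysis: magnitude spectrum on a 2D (frequency x time) plane
f, frames, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=256)

print(Sxx.shape)   # rows = frequency bins, columns = time frames
```

The strongest row of `Sxx` sits near the 1 kHz bin, which is exactly the kind of spectral peak the abstract describes for impact-like sounds, while broadband noise would spread energy across all rows.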
In this project, a process called the "regularized 2D complex-log-Fourier transform" was introduced to resolve this problem. The method, first proposed by Professor Jiang Xudong and Professor Ren Jianfeng, analyzes both the phase spectrum and the amplitude spectrum of a signal for sound event classification. Principal Component Analysis (PCA) was then applied to remove redundant sound features from the samples. Finally, the Mahalanobis distance (MD) was computed for sound class identification.
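The PCA-plus-Mahalanobis-distance stage can be sketched as follows. This is a minimal illustration, not the report's implementation: the feature vectors are random stand-ins for the transform features, the class means and dimensions are invented, and each class is modeled by its own mean and covariance in the reduced space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in feature vectors for two hypothetical sound classes
# (in the project these would come from the regularized 2D transform)
class_a = rng.normal(0.0, 1.0, size=(100, 20))
class_b = rng.normal(3.0, 1.0, size=(100, 20))
X = np.vstack([class_a, class_b])
mean = X.mean(axis=0)

# PCA via SVD: project onto the top-k principal components,
# discarding low-variance (redundant) feature directions
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 5
P = Vt[:k].T
Za = (class_a - mean) @ P
Zb = (class_b - mean) @ P

def mahalanobis(z, Z):
    """Mahalanobis distance from sample z to the class modeled by Z."""
    mu = Z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(Z, rowvar=False))
    d = z - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Identify a held-out sample (drawn from class A's distribution)
# by the nearest Mahalanobis distance
test = (rng.normal(0.0, 1.0, size=20) - mean) @ P
label = 'A' if mahalanobis(test, Za) < mahalanobis(test, Zb) else 'B'
print(label)
```

Unlike plain Euclidean distance, the Mahalanobis distance weights each principal direction by the class's own variance, which is why it pairs naturally with a PCA front end.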
URI: https://hdl.handle.net/10356/163693
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
4_FYP_Final_Report-YuXiang_v4.pdf (FYP Final Report, 1.4 MB, Adobe PDF, Restricted Access)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.