Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/177690
Title: Attention-based sound classification pipeline with sound spectrum
Authors: Tan, Ki In
Yean, Seanglidet
Lee, Bu-Sung
Keywords: Computer and Information Science
Issue Date: 2023
Source: Tan, K. I., Yean, S. & Lee, B. (2023). Attention-based sound classification pipeline with sound spectrum. 2023 IEEE Sensors Applications Symposium (SAS). https://dx.doi.org/10.1109/SAS58821.2023.10254193
Conference: 2023 IEEE Sensors Applications Symposium (SAS)
Abstract: Research on urban soundscapes and their impact is gaining prominence with regard to a livable environment. Machine learning models have been used extensively to classify sounds, where the input sound data, commonly in waveform, needs to be collected across its full frequency spectrum. However, in an application like NoiseCapture, the sound spectrum is divided into 23 frequency bands, and thus some information or features are lost. Given the recent success in training a deep learning model to classify sounds with a limited sound spectrum, we developed a pipeline for maximizing the performance of sound spectrum input with an attention-based model. Using data from ESC-50, we discover that the use of transformers improves accuracy over conventional neural networks by 22.5%; however, the limited frequency bands in the NoiseCapture sound spectrum impair the model accuracy, necessitating the use of data augmentation. The data pipeline is analyzed for our case study of Singapore, where selected sound labels, curated to fit the local context, are used to train the model, resulting in an improvement in base transformer accuracy by 12.7%.
URI: https://hdl.handle.net/10356/177690
ISBN: 9798350323078
DOI: 10.1109/SAS58821.2023.10254193
Schools: College of Computing and Data Science 
School of Computer Science and Engineering 
Rights: © 2023 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/SAS58821.2023.10254193.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Conference Papers

Files in This Item:
File: Attention-Based_Sound_Classification_Pipeline_with_Sound_Spectrum 1.pdf
Size: 260.9 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.