Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/177694
Title: Sound classification using sound spectrum features and convolutional neural networks
Authors: Tan, Ki In
Yean, Seanglidet
Lee, Bu-Sung
Keywords: Computer and Information Science
Issue Date: 2023
Source: Tan, K. I., Yean, S. & Lee, B. (2023). Sound classification using sound spectrum features and convolutional neural networks. 2022 3rd International Conference on Human-Centric Smart Environments for Health and Well-being (IHSH), 94-99. https://dx.doi.org/10.1109/IHSH57076.2022.10092143
Conference: 2022 3rd International Conference on Human-Centric Smart Environments for Health and Well-being (IHSH)
Abstract: This paper proposes an alternative approach to sound classification that uses sound spectrum features instead of Mel-Frequency Cepstral Coefficients (MFCC). In line with the crowdsourced data collection application NoiseCapture, the data are kept in the form of the post-processed sound spectrum rather than raw audio files, to protect the privacy of volunteers. Under such circumstances, MFCC, which requires audio processing, cannot be obtained directly from the sound spectrum data stored in the application, nor can it make full use of the features in that data. Because the sound spectrum does not undergo further feature transformation, it retains audio features from the audio file and should therefore be classifiable when passed into a trained sound spectrum model. Hence, in this study, we aim to evaluate whether the sound spectrum can be used as a replacement for MFCC, especially when the audio file is inaccessible. The UrbanSound8K dataset and a mix of deep learning and machine learning models were used for the comparison. Experimental results show the sound spectrum achieving comparable results with a Convolutional Neural Network (CNN), with better predictions than its MFCC counterpart. Further comparisons indicate that sound spectrum data needs more fine-tuning when used with non-CNN models for sound classification, owing to the shape of the input features.
URI: https://hdl.handle.net/10356/177694
ISBN: 9781665463218
DOI: 10.1109/IHSH57076.2022.10092143
Schools: College of Computing and Data Science 
School of Computer Science and Engineering 
Rights: © 2022 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/IHSH57076.2022.10092143.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Conference Papers
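
The abstract contrasts a sound spectrum that is fed to a CNN without further feature transformation against MFCC. This record does not describe the paper's actual feature extraction, so the following is only a minimal sketch of what a spectrum-style CNN input could look like when raw audio is available. The function names, frame length, hop size, and sample rate are all hypothetical choices made for illustration, and NoiseCapture's own post-processed spectrum format is not reproduced here.

```python
import numpy as np


def power_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frames, frame_len // 2 + 1) array of power values.

    Hypothetical helper: splits the signal into overlapping frames,
    applies a Hann window, and takes the squared magnitude of the
    real FFT of each frame.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def to_cnn_input(spec, eps=1e-10):
    """Convert power to decibels and add batch and channel axes.

    The result has shape (1, frames, bins, 1), a common layout for
    2-D convolutional layers (an assumption, not the paper's layout).
    """
    log_spec = 10.0 * np.log10(spec + eps)
    return log_spec[np.newaxis, :, :, np.newaxis]


# Assumed test signal: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = power_spectrogram(tone)
batch = to_cnn_input(spec)
print(spec.shape, batch.shape)
```

No cepstral step (mel filterbank, logarithm, and discrete cosine transform) is applied here; the time-frequency grid itself is the feature, which is why its two-dimensional shape suits convolutional layers more naturally than models that expect a flat feature vector.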

Files in This Item:
File: Sound_Classification_using_Sound_Spectrum_Features_and_Convolutional_Neural_Networks 1.pdf
Size: 3.53 MB
Format: Adobe PDF

Scopus citations: 4 (updated on Mar 12, 2025)
Page views: 91 (updated on Mar 18, 2025)
Downloads: 36 (updated on Mar 18, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.