Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/158398
|Title:||Small footprint model for noisy far-field keyword spotting|
|Authors:||Pang, Jin Hui|
|Keywords:||Engineering::Computer science and engineering|
|Issue Date:||2022|
|Publisher:||Nanyang Technological University|
|Source:||Pang, J. H. (2022). Small footprint model for noisy far-field keyword spotting. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158398|
|Abstract:||Building a keyword spotting model with a small memory footprint is important because such models typically run on mobile devices with limited computational resources. However, it is very challenging to develop a lightweight model while maintaining state-of-the-art results in noisy far-field environments. In real-life conditions, noise and reverberation degrade the performance of a keyword spotting model. We explored a variety of baseline models and data processing techniques to make effective keyword predictions. Additionally, we proposed a novel feature-interactive convolution model with a small number of parameters for single-channel and multi-channel utterances. The interactive unit is implemented as an attention mechanism that enhances the flow of information while using fewer computational resources. Moreover, we proposed a centroid-based awareness component to improve the multi-channel system by providing additional spatial geometry information in the latent feature projection space. The single-channel model was evaluated on the Google Speech Command V2-12 dataset, whereas the multi-channel model was evaluated on the MISP Challenge 2021 dataset. Our single-channel model achieves an accuracy of 98.2% on the original Google Speech Command dataset and outperforms most previous small models. In addition, our multi-channel model achieves a substantial improvement over the official competition baseline, with a 55% gain in the competition score (0.152) on 6-channel audio input and a 63% gain (0.126) using traditional front-end speech enhancement.|
|URI:||https://hdl.handle.net/10356/158398|
|Fulltext Permission:||restricted|
|Fulltext Availability:||With Fulltext|
|Appears in Collections:||SCSE Student Reports (FYP/IA/PA/PI)|
Files in This Item:
|2.12 MB||Adobe PDF|
Updated on Dec 9, 2022
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.