Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/183980
Title: Hardware constrained deep learning: an empirical analysis of dynamic quantisation across computer vision and natural language processing domains

Authors: Sai, Shein Htet

Keywords: Computer and Information Science

Issue Date: 2025

Publisher: Nanyang Technological University

Source: Sai, S. H. (2025). Hardware constrained deep learning: an empirical analysis of dynamic quantisation across computer vision and natural language processing domains. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/183980

Abstract: This study provides a comprehensive analysis of deep learning model optimisation techniques for resource-constrained edge devices, with a focus on the Raspberry Pi platform. The research evaluates three implementation approaches (PyTorch, ONNX conversion, and dynamic post-training quantisation on ONNX) across diverse deep learning architectures in both computer vision (CV) and natural language processing (NLP) domains. Through systematic benchmarking of performance metrics including model size, accuracy, inference speed, memory utilisation, and thermal characteristics, the study reveals that optimisation effectiveness varies dramatically across architectural paradigms. In the CV domain, ResNet50 demonstrated remarkable resilience to quantisation, maintaining accuracy while achieving a 75% size reduction, whereas efficiency-focused architectures like EfficientNet experienced significant accuracy collapse. Similarly, in the NLP domain, DistilBERT exhibited strong quantisation resilience with only a 12% relative accuracy drop, while SqueezeBERT suffered a significant 36% decline to near-random performance. The research also identifies a memory utilisation paradox across both domains, where some quantised models consumed more runtime memory despite reduced model sizes. ONNX conversion emerged as a universally beneficial strategy for both CV and NLP models, improving inference speed by 30-48% without compromising accuracy. These findings highlight the critical importance of architecture-specific optimisation approaches rather than one-size-fits-all strategies for edge deployment, providing practical guidelines for balancing the competing demands of model capability and deployment feasibility on resource-constrained devices. (A sketch of the evaluated pipeline follows below.)

URI: https://hdl.handle.net/10356/183980

Schools: College of Computing and Data Science

Fulltext Permission: restricted

Fulltext Availability: With Fulltext
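The pipeline the abstract describes (PyTorch baseline, ONNX conversion, then dynamic post-training quantisation on the ONNX model) maps onto standard tooling. The following is a minimal sketch, not the report's actual scripts: the model choice (ResNet50, one of the CV architectures studied), file names, input shape, and the latency-only benchmark are illustrative assumptions, and it assumes `torch`, `torchvision`, and `onnxruntime` are installed.

```python
# Hedged sketch of the PyTorch -> ONNX -> dynamic-quantisation pipeline the
# abstract describes. File names and run counts are illustrative assumptions.
import time

import torch
import torchvision.models as models
import onnxruntime as ort
from onnxruntime.quantization import QuantType, quantize_dynamic

# 1. Baseline: a pretrained CV model (ResNet50 is one of the architectures studied).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
dummy = torch.randn(1, 3, 224, 224)

# 2. ONNX conversion: the step the study found universally beneficial.
torch.onnx.export(model, dummy, "resnet50.onnx",
                  input_names=["input"], output_names=["output"])

# 3. Dynamic post-training quantisation: weights stored as int8, activation
#    scales computed at runtime, so no calibration dataset is needed.
quantize_dynamic("resnet50.onnx", "resnet50_int8.onnx",
                 weight_type=QuantType.QInt8)

# 4. One of the benchmarked metrics: mean CPU inference latency in milliseconds.
def mean_latency_ms(path: str, runs: int = 50) -> float:
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    x = dummy.numpy()
    sess.run(None, {"input": x})  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {"input": x})
    return (time.perf_counter() - start) / runs * 1e3

print(f"FP32 ONNX: {mean_latency_ms('resnet50.onnx'):.1f} ms")
print(f"INT8 ONNX: {mean_latency_ms('resnet50_int8.onnx'):.1f} ms")
```

Note that `onnxruntime`'s dynamic quantisation chiefly rewrites MatMul/Gemm weights, which may be one reason quantisation behaves so differently across convolution-heavy CV models and transformer-based NLP models; reproducing the report's full benchmark would additionally measure model size, accuracy, memory utilisation, and thermal behaviour on the target device.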
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| FYP_FINAL_Sai_Shein Htet.pdf (Restricted Access) | | 1.01 MB | Adobe PDF |