Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/182482
Title: Regularization of deep neural network using a multisample memory model
Authors: Tanveer, Muhammad
Siyal, Mohammad Yakoob
Rashid, Sheikh Faisal
Keywords: Engineering
Issue Date: 2024
Source: Tanveer, M., Siyal, M. Y. & Rashid, S. F. (2024). Regularization of deep neural network using a multisample memory model. Neural Computing and Applications, 36(36), 23295-23307. https://dx.doi.org/10.1007/s00521-024-10474-x
Journal: Neural Computing and Applications
Abstract: Deep convolutional neural networks (CNNs) are widely used in computer vision and have achieved strong performance on image classification tasks. Overfitting is a general problem in deep learning models that inhibits their generalization capability, arising from the presence of noise, the limited size of the training data, the complexity of the classifier, and the large number of hyperparameters involved during training. Several techniques have been developed to mitigate overfitting, but in this research we focus only on regularization techniques. We propose a memory-based regularization technique that inhibits overfitting and improves the generalization of deep neural networks. Our backbone architectures receive input samples in bags rather than directly in batches to generate deep features. The proposed model receives input samples as queries and feeds them to the memory access module (MAM), which searches for the relevant items in memory and computes a memory loss using Euclidean similarity measures. Our memory loss function incorporates intra-class compactness and inter-class separability at the feature level. Notably, the convergence rate of the proposed model is very fast, requiring only a few epochs to train both shallow and deeper models. In this study, we evaluate the performance of the memory model across several state-of-the-art (SOTA) deep learning architectures, including ResNet18, ResNet50, ResNet101, VGG-16, AlexNet, and MobileNet, using the CIFAR-10 and CIFAR-100 datasets. The results show that the efficient memory model we developed outperforms almost all existing SOTA benchmarks by a considerable margin.
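As a rough illustration of the loss described in the abstract, the PyTorch-style sketch below combines an intra-class compactness term with an inter-class separability term computed from Euclidean distances between query features and stored memory items. The function name, the hinge margin, and the equal weighting of the two terms are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def memory_loss(features, labels, memory, memory_labels, margin=1.0):
    """Hypothetical memory-based regularization loss (not the paper's exact form).

    features:      (B, D) deep features for the current query samples
    labels:        (B,)   class labels of the query samples
    memory:        (M, D) stored feature vectors (memory items)
    memory_labels: (M,)   class labels of the memory items
    """
    # Pairwise Euclidean distances between queries and memory items: (B, M)
    dists = torch.cdist(features, memory)

    # Boolean mask: True where query and memory item share the same class
    same = labels.unsqueeze(1) == memory_labels.unsqueeze(0)

    # Intra-class compactness: mean distance to same-class memory items
    intra = (dists * same).sum() / same.sum().clamp(min=1)

    # Inter-class separability: hinge penalty on other-class items closer than the margin
    inter = (F.relu(margin - dists) * (~same)).sum() / (~same).sum().clamp(min=1)

    return intra + inter
```

Under this sketch, minimizing the loss pulls each query feature toward memory items of its own class while pushing it at least `margin` away from items of other classes, which is one plausible reading of "intra-class compactness and inter-class separability at the feature level".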
URI: https://hdl.handle.net/10356/182482
ISSN: 0941-0643
DOI: 10.1007/s00521-024-10474-x
Schools: School of Electrical and Electronic Engineering 
Rights: © 2024 The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections:EEE Journal Articles