Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/174476
Title: Fully Decoupled Neural Network Learning Using Delayed Gradients
Authors: Zhuang, Huiping; Wang, Yi; Liu, Qinglai; Lin, Zhiping
Keywords: Computer and Information Science
Issue Date: 2021
Source: Zhuang, H., Wang, Y., Liu, Q. & Lin, Z. (2021). Fully decoupled neural network learning using delayed gradients. IEEE Transactions on Neural Networks and Learning Systems, 33(10), 6013-6020. https://dx.doi.org/10.1109/TNNLS.2021.3069883
Project: NRP-1922500054
Journal: IEEE Transactions on Neural Networks and Learning Systems
Abstract: Training neural networks with back-propagation (BP) requires a sequential passing of activations and gradients. This gives rise to the lockings (i.e., the forward, backward, and update lockings) among modules (each module containing a stack of layers) inherent in BP. In this paper, we propose a fully decoupled training scheme using delayed gradients (FDG) to break all these lockings. The FDG splits a neural network into multiple modules and trains them independently and asynchronously using different workers (e.g., GPUs). We also introduce a gradient shrinking process to reduce the stale-gradient effect caused by the delayed gradients. Our theoretical proofs show that the FDG can converge to critical points under certain conditions. Experiments are conducted by training deep convolutional neural networks to perform classification tasks on several benchmark datasets. These experiments show comparable or better results of our approach compared with the state-of-the-art methods in terms of generalization and acceleration. We also show that the FDG is able to train various networks, including extremely deep ones (e.g., ResNet-1202), in a decoupled fashion.
URI: https://hdl.handle.net/10356/174476
ISSN: 2162-237X
DOI: 10.1109/TNNLS.2021.3069883
Schools: School of Electrical and Electronic Engineering
Research Centres: Temasek Laboratories @ NTU
Rights: © 2021 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/TNNLS.2021.3069883.
Fulltext Permission: open
Fulltext Availability: With Fulltext
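The abstract's core idea (splitting a network into modules, updating each with a gradient that arrives several steps late, and shrinking that stale gradient) can be illustrated with a toy sketch. This is a hedged illustration of the general delayed-gradient idea, not the paper's actual FDG algorithm: the two-module "network" (two scalar linear maps), the learning rate, and the shrinking factor below are all invented for demonstration.

```python
import numpy as np

# Toy illustration of decoupled training with a delayed, shrunk gradient.
# Two scalar "modules" w1 and w2 jointly model y = w2 * (w1 * x), trained
# toward the target function y = 2x. Module 2 updates with a fresh gradient;
# module 1 updates with the gradient from the PREVIOUS step (one-step delay),
# scaled by a shrinking factor to damp the staleness effect.
rng = np.random.default_rng(0)
w1, w2 = 0.5, 0.5          # module parameters (illustrative init)
lr, shrink = 0.05, 0.9     # learning rate and gradient-shrinking factor
delayed_g1 = None          # buffer holding module 1's stale gradient

for step in range(200):
    x = rng.uniform(-1.0, 1.0)
    target = 2.0 * x
    h = w1 * x             # module 1 forward
    y = w2 * h             # module 2 forward
    err = y - target       # d(loss)/dy for squared-error loss

    g2 = err * h           # fresh gradient for module 2
    g1 = err * w2 * x      # gradient for module 1 (delivered next step)

    if delayed_g1 is not None:
        w1 -= lr * shrink * delayed_g1   # stale gradient, shrunk
    delayed_g1 = g1
    w2 -= lr * g2

print(w1 * w2)  # the product should approach 2.0
```

The shrinking factor trades convergence speed for stability: with `shrink = 1.0` the stale gradient is applied at full strength, which in deeper pipelines (longer delays) can destabilize training.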
Appears in Collections: | EEE Journal Articles |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Fully Decoupled Neural Network Learning Using Delayed Gradients.pdf | | 601.87 kB | Adobe PDF
SCOPUS™ Citations: 20 (updated on Mar 13, 2025)
Page view(s): 87 (updated on Mar 22, 2025)
Download(s): 67 (updated on Mar 22, 2025)
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.