Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/162483
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhu, Shien | en_US
dc.contributor.author | Duong, Luan H. K. | en_US
dc.contributor.author | Chen, Hui | en_US
dc.contributor.author | Liu, Di | en_US
dc.contributor.author | Liu, Weichen | en_US
dc.date.accessioned | 2022-10-26T04:08:04Z | -
dc.date.available | 2022-10-26T04:08:04Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Zhu, S., Duong, L. H. K., Chen, H., Liu, D. & Liu, W. (2022). FAT: an in-memory accelerator with fast addition for ternary weight neural networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. https://dx.doi.org/10.1109/TCAD.2022.3184276 | en_US
dc.identifier.issn | 0278-0070 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/162483 | -
dc.description.abstract | Convolutional Neural Networks (CNNs) demonstrate excellent performance in various applications but have high computational complexity. Quantization is applied to reduce the latency and storage cost of CNNs. Among the quantization methods, Binary and Ternary Weight Networks (BWNs and TWNs) have a unique advantage over 8-bit and 4-bit quantization: they replace the multiplication operations in CNNs with additions, which are favoured on In-Memory-Computing (IMC) devices. IMC acceleration for BWNs has been widely studied. However, although TWNs have higher accuracy and better sparsity than BWNs, IMC acceleration for TWNs has received little study. TWNs on existing IMC devices are inefficient because the sparsity is not well utilized and the addition operation is not efficient. In this paper, we propose FAT, a novel IMC accelerator for TWNs. First, we propose a Sparse Addition Control Unit, which utilizes the sparsity of TWNs to skip the null operations on zero weights. Second, we propose a fast addition scheme based on the memory Sense Amplifier to avoid the time overhead of both carry propagation and writing back the carry to memory cells. Third, we further propose a Combined-Stationary data mapping to reduce the data movement of activations and weights and increase the parallelism across memory columns. Simulation results show that for addition operations at the Sense Amplifier level, FAT achieves 2.00× speedup, 1.22× power efficiency and 1.22× area efficiency compared with a State-Of-The-Art IMC accelerator ParaPIM. FAT achieves 10.02× speedup and 12.19× energy efficiency compared with ParaPIM on networks with 80% average sparsity. | en_US
dc.description.sponsorship | Ministry of Education (MOE) | en_US
dc.description.sponsorship | Nanyang Technological University | en_US
dc.language.iso | en | en_US
dc.relation | MOE2019-T2-1-071 | en_US
dc.relation | MOE2019-T1-001-072 | en_US
dc.relation | M4082282 | en_US
dc.relation | M4082087 | en_US
dc.relation.ispartof | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | en_US
dc.relation.uri | 10.21979/N9/DYKUPV | en_US
dc.rights | © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TCAD.2022.3184276. | en_US
dc.subject | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence | en_US
dc.subject | Engineering::Computer science and engineering::Hardware::Arithmetic and logic structures | en_US
dc.subject | Engineering::Computer science and engineering::Computer systems organization::Special-purpose and application-based systems | en_US
dc.title | FAT: an in-memory accelerator with fast addition for ternary weight neural networks | en_US
dc.type | Journal Article | en
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.contributor.research | Parallel and Distributed Computing Centre | en_US
dc.contributor.research | HP-NTU Digital Manufacturing Corporate Lab | en_US
dc.identifier.doi | 10.1109/TCAD.2022.3184276 | -
dc.description.version | Submitted/Accepted version | en_US
dc.identifier.scopus | 2-s2.0-85132696672 | -
dc.subject.keywords | Ternary Weight Neural Network | en_US
dc.subject.keywords | In-Memory Computing | en_US
dc.subject.keywords | Convolutional Neural Network | en_US
dc.subject.keywords | Spin-Transfer Torque Magnetic Random-Access Memory | en_US
dc.description.acknowledgement | This work is partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MOE2019-T2-1-071) and Tier 1 (MOE2019-T1-001-072), and partially supported by Nanyang Technological University, Singapore, under its NAP (M4082282) and SUG (M4082087). | en_US
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
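The abstract's central computational idea, replacing multiplications with additions and skipping zero weights, can be illustrated with a minimal software sketch. This is only an analogue of the idea, not the paper's FAT hardware design or its Sparse Addition Control Unit; the function name and data below are invented for illustration.

```python
def ternary_dot(activations, weights):
    """Dot product with ternary weights in {-1, 0, +1}, using no multiplications."""
    acc = 0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a   # +1 weight: product reduces to an addition
        elif w == -1:
            acc -= a   # -1 weight: product reduces to a subtraction
        # w == 0: skipped entirely, mirroring how zero-weight (null)
        # operations can be bypassed to exploit TWN sparsity
    return acc

acts = [3, 5, 2, 7, 1]
wts = [1, 0, -1, 0, 1]   # 40% non-zero; the two zeros cost nothing
print(ternary_dot(acts, wts))  # 3 - 2 + 1 = 2
```

With 80% average sparsity, as in the paper's evaluation, four of every five weight positions would be skipped in this manner, which is why sparsity-aware control yields large speedups.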
Appears in Collections: SCSE Journal Articles
Files in This Item:
File | Description | Size | Format
TCAD_2021_FAT_Final_Submitted_Latex 2022-6-14.pdf | Main PDF, Accepted Version | 4.07 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.