Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/145503
Title: XOR-Net: an efficient computation pipeline for binary neural network inference on edge devices
Authors: Zhu, Shien
Duong, Luan H. K.
Liu, Weichen
Keywords: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Issue Date: 2020
Source: Zhu, S., Duong, L. H. K., & Liu, W. (2020). XOR-Net: An efficient computation pipeline for binary neural network inference on edge devices. Proceedings of the International Conference on Parallel and Distributed Systems (ICPADS).
Project: Academic Research Fund Tier 1 (MOE2019-T1-001-072), Ministry of Education, Singapore
Academic Research Fund Tier 2 (MOE2019-T2-1-071), Ministry of Education, Singapore
NAP (M4082282), Nanyang Technological University, Singapore
SUG (M4082087), Nanyang Technological University, Singapore
Abstract: Accelerating the inference of Convolutional Neural Networks (CNNs) on edge devices is essential because of the small memory size and limited computation capability of these devices. Network quantization methods such as XNOR-Net, Bi-Real-Net, and XNOR-Net++ reduce the memory usage of CNNs by binarizing them. They also simplify the multiplication operations to bit-wise operations and obtain good speedup on edge devices. However, there are hidden redundancies in the computation pipeline of these methods, constraining the speedup of those binarized CNNs. In this paper, we propose XOR-Net, an optimized computation pipeline for binary networks both without and with scaling factors. As XNOR is realized by two instructions, XOR and NOT, on CPU/GPU platforms, XOR-Net avoids the NOT operations by using XOR instead of XNOR, thus reducing the bit-wise operations in both of the aforementioned kinds of binary convolution layers. For binary convolution with scaling factors, XOR-Net further rearranges the sequence in which the scaling factors are calculated and multiplied, reducing the full-precision operations. Theoretical analysis shows that XOR-Net reduces the bit-wise operations by one third compared with traditional binary convolution, and the full-precision operations by up to 40% compared with XNOR-Net. Experimental results show that our XOR-Net binary convolution without scaling factors achieves up to 135× speedup and consumes no more than 0.8% of the energy of parallel full-precision convolution. For binary convolution with scaling factors, XOR-Net is up to 17% faster and 19% more energy-efficient than XNOR-Net.
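To make the XOR-versus-XNOR point concrete, below is a minimal C sketch of the identity the abstract relies on. It is not taken from the paper's code release: the function names, the toy inputs, and the use of the GCC/Clang builtin __builtin_popcountll are illustrative assumptions. For bit-packed {-1, +1} vectors of length n, the dot product equals n - 2*popcount(a XOR b), so the per-word NOT that an XNOR-based pipeline executes can be folded into a single affine correction at the end.

#include <stdint.h>
#include <stdio.h>

/* Binary dot product in the XNOR style: XNOR compiles to XOR plus NOT,
 * so each 64-bit word costs one extra instruction. Assumes all words are
 * fully used (n_bits == 64 * words); otherwise the NOT would also flip
 * padding bits. */
static int dot_xnor(const uint64_t *a, const uint64_t *b, int words, int n_bits) {
    int ones = 0;
    for (int i = 0; i < words; ++i)
        ones += __builtin_popcountll(~(a[i] ^ b[i]));  /* extra NOT per word */
    return 2 * ones - n_bits;   /* matches = ones, dot = 2*matches - n */
}

/* XOR-only variant: drop the NOT and fold it into the final affine step. */
static int dot_xor(const uint64_t *a, const uint64_t *b, int words, int n_bits) {
    int diff = 0;
    for (int i = 0; i < words; ++i)
        diff += __builtin_popcountll(a[i] ^ b[i]);     /* one op fewer per word */
    return n_bits - 2 * diff;   /* same result as dot_xnor */
}

int main(void) {
    /* Toy 128-bit packed weights/activations, chosen arbitrarily. */
    uint64_t a[2] = {0x0123456789ABCDEFull, 0xFEDCBA9876543210ull};
    uint64_t b[2] = {0xF0F0F0F0F0F0F0F0ull, 0x0F0F0F0F0F0F0F0Full};
    printf("xnor-style: %d, xor-style: %d\n",
           dot_xnor(a, b, 2, 128), dot_xor(a, b, 2, 128));
    return 0;
}

Both functions return the same value; the XOR variant simply skips one bit-wise instruction per word. Counting ops per word, this is also where the one-third reduction claimed above comes from: XOR + NOT + popcount (3 ops) versus XOR + popcount (2 ops).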
URI: https://hdl.handle.net/10356/145503
DOI (Related Dataset): https://doi.org/10.21979/N9/XEH3D1
Rights: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: ICPADS_2020_Optimized_XNOR_Net The accepted version.pdf (1.03 MB, Adobe PDF)
