Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/83649
Title: Zedwulf: Power-Performance Tradeoffs of a 32-Node Zynq SoC Cluster
Authors: Moorthy, Pradeep
Kapre, Nachiket
Keywords: Computer Science and Engineering
Issue Date: 2015
Source: Moorthy, P., & Kapre, N. (2015). Zedwulf: Power-Performance Tradeoffs of a 32-Node Zynq SoC Cluster. 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines, 68-75.
Conference: 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Abstract: Commodity SoCs with hybrid architectures that combine CPUs with programmable FPGA fabric, such as the Xilinx Zynq SoC, have become a competitive energy-efficient platform for addressing irregular parallelism in graph problems. In this paper, we prototype a 32-node cluster composed of these Zynq SoC chips to accelerate communication-bound sparse graph-oriented applications such as neural network simulations. We develop specialized MPI routines for irregular accelerator-to-accelerator communication of small message traffic. We use the ARM processor for handling the MPI stack while offloading compute-intensive calculations to the FPGA. For graphs with 32M nodes and 32M edges, Zedwulf delivers the highest throughput in our study, 94 MTEPS (Million Traversed Edges Per Second), outperforming the x86 multi-threaded platforms by 1.2–1.4×. For this experiment, Zedwulf operates at an efficiency of 0.49 MTEPS/W when using ARM+FPGA, which is 1.2× better than using ARMv7 CPUs alone, and within 8% of the Intel Core i7-4770k platform.
URI: https://hdl.handle.net/10356/83649
http://hdl.handle.net/10220/39205
DOI: 10.1109/FCCM.2015.37
Schools: School of Computer Engineering 
Rights: © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FCCM.2015.37].
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Conference Papers

Files in This Item:
File: Zedwulf_Power-Performance Tradeoffs of a 32-node Zynq SoC cluster.pdf
Size: 847.4 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.