Custom FPGA-based soft-processors for sparse graph acceleration
Date of Issue2015
2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
School of Computer Engineering
FPGA-based soft processors customized for operations on sparse graphs can deliver significant performance improvements over conventional organizations (ARMv7 CPUs) for bulk synchronous sparse graph algorithms. We develop a stripped-down soft processor ISA to implement specific repetitive operations on graph nodes and edges that are commonly observed in sparse graph computations. In the processing core, we provide hardware support for rapidly fetching and processing state of local graph nodes and edges through spatial address generators and zero-overhead loop iterators. We interconnect a 2D array of these lightweight processors with a packet-switched network-on-chip to enable fine-grained operand routing along the graph edges and provide custom send/receive instructions in the soft processor. We develop the processor RTL using Vivado High-Level Synthesis and also provide an assembler and compilation flow to configure the processor instruction and data memories. We outperform a Microblaze (100MHz on Zedboard) and an NIOS-II/f (100MHz on DE2-115) by 6× (single processor design) as well as the ARMv7 dual-core CPU on the Zynq SoCs by as much as 10× on the Xilinx ZC706 board (100 processor design) across a range of matrix datasets.
Computer Science and Engineering
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/ASAP.2015.7245698].