SPICE2: Spatial Processors Interconnected for Concurrent Execution for Accelerating the SPICE Circuit Simulator Using an FPGA
Date of Issue2012
School of Computer Engineering
Spatial processing of sparse, irregular, double-precision floating-point computation using a single field-programmable gate array (FPGA) enables up to an order of magnitude speedup (mean 2.8× speedup) over a conventional microprocessor for the SPICE circuit simulator. We develop a parallel, FPGA-based, heterogeneous architecture customized for accelerating the SPICE simulator to deliver this speedup. To properly parallelize the complete simulator, we decompose SPICE into its three constituent phases-model evaluation, sparse matrix-solve, and iteration control-and customize a spatial architecture for each phase independently. Our heterogeneous FPGA organization mixes very large instruction word, dataflow and streaming architectures into a cohesive, unified design to match the parallel patterns exposed by our programming framework. This FPGA architecture is able to outperform conventional processors due to a combination of factors, including high utilization of statically-scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and streaming, overlapped processing of the control algorithms. We demonstrate that we can independently accelerate model evaluation by a mean factor of 6.5 × (1.4-23×) across a range of nonlinear device models and matrix solve by 2.4×(0.6-13×) across various benchmark matrices while delivering a mean combined speedup of 2.8×(0.2-11×) for the composite design when comparing a Xilinx Virtex-6 LX760 (40 nm) with an Intel Core i7 965 (45 nm). We also estimate mean energy savings of 8.9× (up to 40.9×) when comparing a Xilinx Virtex-6 LX760 with an Intel Core i7 965.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/TCAD.2011.2173199].