Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/81207
Title: Fanout decomposition dataflow optimizations for FPGA-based Sparse LU factorization
Authors: Siddhartha; Kapre, Nachiket
Keywords: Computer Science and Engineering
Issue Date: 2014
Source: Siddhartha, & Kapre, N. (2014). Fanout decomposition dataflow optimizations for FPGA-based Sparse LU factorization. 2014 International Conference on Field-Programmable Technology (FPT), 252-255.
Conference: 2014 International Conference on Field-Programmable Technology (FPT)
Abstract: Performance of FPGA-based token dataflow architectures is often limited by the long tail distribution of parallelism in the compute paths of the dataflow graphs. This is known to limit speedup of dataflow processing of Sparse LU factorization to only 3-10x over CPUs. One reason behind the limitations is the serialization penalty of processing high-fanout nodes in the dataflow graph on traditional dataflow processing architectures. In this paper, we show how to perform one-time static fanout decomposition and selective node replication transformations to input dataflow graphs. These transformations are one-time static compute costs that are typically amortized over millions of iterations. For dataflow graphs extracted for sparse LU factorization, we demonstrate up to 2.3x speedup (1.2x geomean average) with this technique across a range of benchmark problems.
URI: https://hdl.handle.net/10356/81207
http://hdl.handle.net/10220/39179
DOI: 10.1109/FPT.2014.7082787
Schools: School of Computer Engineering 
Rights: © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPT.2014.7082787].
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Conference Papers

Files in This Item:
File: Fanout decomposition dataflow optimizations for FPGA-based Sparse LU factorization.pdf
Size: 395.07 kB
Format: Adobe PDF

Scopus citations: 1 (updated on Sep 29, 2023)
Page views: 366 (updated on Oct 1, 2023)
Downloads: 144 (updated on Oct 1, 2023)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.