Abstract: mapping streaming applications onto GPU systems.
Huynh, Huynh Phung.
Goh, Rick Siow Mong.
Date of Issue: 2012
SC Companion: High Performance Computing, Networking, Storage and Analysis (2012 : Salt Lake City, Utah, United States)
School of Computer Engineering
Parallel and Distributed Computing Centre
We describe an efficient and scalable code generation framework that automatically maps general purpose streaming applications onto GPU systems. This architecture-driven framework takes into account the idiosyncrasies of the GPU pipeline and the unique memory hierarchy. The framework has been implemented as a back-end to the StreamIt programming language compiler. Several key features in this framework ensure maximized performance and scalability. First, the generated code increases the effectiveness of the on-chip memory hierarchy by employing a heterogeneous mix of compute and memory access threads. Our scheme goes against the conventional wisdom of GPU programming which is to use a large number of homogeneous threads. Second, we utilise an efficient stream graph partitioning algorithm to handle larger applications and achieve the best performance under the given on-chip memory constraints. Lastly, the framework maps complex applications onto multiple GPUs using a highly effective pipeline execution scheme. Our comprehensive experiments show its scalability and significant speedup compared to a state-of-the-art solution.
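As an illustration of the partitioning step mentioned in the abstract, the following is a minimal sketch of splitting a stream graph under an on-chip memory budget. It is not the framework's algorithm, which the abstract does not specify. It assumes a simple linear chain of filters and uses a greedy first-fit grouping. The function name `partition_chain`, the filter names, and the buffer sizes are all hypothetical.

```python
from typing import List, Tuple


def partition_chain(filters: List[Tuple[str, int]], budget: int) -> List[List[str]]:
    """Greedily group consecutive filters so each group's buffers fit in `budget` bytes.

    `filters` is a linear chain of (name, buffer_bytes) pairs (an assumed,
    simplified stand-in for a stream graph). `budget` models the on-chip
    memory available to one partition.
    """
    partitions: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, nbytes in filters:
        if nbytes > budget:
            # A single filter that cannot fit makes the chain unmappable here.
            raise ValueError(f"filter {name!r} needs {nbytes} bytes, budget is {budget}")
        if current and used + nbytes > budget:
            # Close the current partition and start a new one.
            partitions.append(current)
            current, used = [], 0
        current.append(name)
        used += nbytes
    if current:
        partitions.append(current)
    return partitions


# Hypothetical filter chain with made-up buffer sizes in bytes.
chain = [("source", 4096), ("fir", 12288), ("fft", 16384), ("sink", 2048)]
parts = partition_chain(chain, budget=16384)
print(parts)
```

Each resulting partition could then be treated as one unit of work whose buffers stay in on-chip memory. A real partitioner would also have to weigh execution time and inter-partition traffic, which this sketch ignores.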
DRNTU::Engineering::Computer science and engineering