GPGPU '16: Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit

SESSION: Algorithm

Runtime aware architectures

GPU centric extensions for parallel strongly connected components computation

General-purpose join algorithms for large graph triangle listing on heterogeneous systems

SESSION: Heterogeneous languages, extensions and runtimes

Performance portable GPU code generation for matrix multiplication

Multi-stage programming for GPUs in C++ using PACXX

Simplifying programming and load balancing of data parallel applications on heterogeneous systems

SESSION: Tasking and scheduling

Working together to build the heterogeneous processing ecosystem

Implementing directed acyclic graphs with the heterogeneous system architecture

GPUpIO: the case for I/O-driven preemption on GPUs

SESSION: Stencil optimization

A systems perspective on GPU computing: a tribute to Karsten Schwan

Designing high performance communication runtime for GPU managed memory: early experiences

Effective resource management for enhancing performance of 2D and 3D stencils on GPUs