TY - JOUR
T1 - MAPS: Optimizing massively parallel applications using device-level memory abstraction
AU - Rubin, Eri
AU - Levy, Ely
AU - Barak, Amnon
AU - Ben-Nun, Tal
PY - 2014
Y1 - 2014/12/1
N2 - GPUs play an increasingly important role in high-performance computing. While developing naive code is straightforward, optimizing massively parallel applications requires deep understanding of the underlying architecture. The developer must struggle with complex index calculations and manual memory transfers. This article classifies memory access patterns used in most parallel algorithms, based on Berkeley's Parallel "Dwarfs." It then proposes the MAPS framework, a device-level memory abstraction that facilitates memory access on GPUs, alleviating complex indexing using on-device containers and iterators. This article presents an implementation of MAPS and shows that its performance is comparable to carefully optimized implementations of real-world applications.
KW - GPGPU
KW - Heterogeneous computing architectures
KW - Memory abstraction
KW - Memory access patterns
UR - http://www.scopus.com/inward/record.url?scp=84917736767&partnerID=8YFLogxK
U2 - 10.1145/2680544
DO - 10.1145/2680544
M3 - Article
AN - SCOPUS:84917736767
SN - 1544-3566
VL - 11
JO - ACM Transactions on Architecture and Code Optimization
JF - ACM Transactions on Architecture and Code Optimization
IS - 4
M1 - 44
ER -