Abstract
Network topologies can have a significant effect on the execution costs of parallel algorithms due to inter-processor communication. For particular combinations of computations and network topologies, costly network contention may inevitably become a bottleneck, even if algorithms are optimally designed so that each processor communicates as little as possible. We obtain novel contention lower bounds that are functions of the network and the computation graph parameters. For several combinations of fundamental computations and common network topologies, our new analysis improves upon previous per-processor lower bounds, which specify only the number of words communicated by the busiest individual processor. We consider torus and mesh topologies, universal fat-trees, and hypercubes; algorithms covered include classical matrix multiplication and direct numerical linear algebra, fast matrix multiplication algorithms, programs that reference arrays, N-body computations, and the FFT. For example, we show that fast matrix multiplication algorithms (e.g., Strassen's) running on a 3D torus will suffer from contention bottlenecks. On the other hand, this network is likely sufficient for a classical matrix multiplication algorithm. Our new lower bounds are matched by existing algorithms in only a few cases, leaving many open problems for network and algorithmic design.
Original language | English |
---|---|
Title of host publication | Proceedings of COM-HPC 2016 |
Subtitle of host publication | 1st Workshop on Optimization of Communication in HPC Runtime Systems - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 39-52 |
Number of pages | 14 |
ISBN (Electronic) | 9781509038299 |
State | Published - 23 Jan 2017 |
Event | 1st Workshop on Optimization of Communication in HPC Runtime Systems, COM-HPC 2016 - Salt Lake City, United States; Duration: 18 Nov 2016 → … |
Publication series
Name | Proceedings of COM-HPC 2016: 1st Workshop on Optimization of Communication in HPC Runtime Systems - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 1st Workshop on Optimization of Communication in HPC Runtime Systems, COM-HPC 2016 |
---|---|
Country/Territory | United States |
City | Salt Lake City |
Period | 18/11/16 → … |
Bibliographical note
Publisher Copyright: © 2016 IEEE.
Keywords
- Communication costs
- Communication-avoiding algorithms
- FFT
- Matrix Multiplication
- Network topology
- Numerical Linear Algebra
- Strong scaling