TY - CONF
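T1 - Brief announcement: Strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds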
T2 - 24th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'12
AU - Ballard, Grey
AU - Demmel, James
AU - Holtz, Olga
AU - Lipshitz, Benjamin
AU - Schwartz, Oded
PY - 2012
Y1 - 2012
AB - A parallel algorithm has perfect strong scaling if its running time on P processors is linear in 1/P, including all communication costs. Distributed-memory parallel algorithms for matrix multiplication with perfect strong scaling have only recently been found. One is based on classical matrix multiplication (Solomonik and Demmel, 2011), and one is based on Strassen's fast matrix multiplication (Ballard, Demmel, Holtz, Lipshitz, and Schwartz, 2012). Both algorithms scale perfectly, but only up to some number of processors where the inter-processor communication no longer scales. We obtain a memory-independent communication cost lower bound on classical and Strassen-based distributed-memory matrix multiplication algorithms. These bounds imply that no classical or Strassen-based parallel matrix multiplication algorithm can strongly scale perfectly beyond the ranges already attained by the two parallel algorithms mentioned above. The memory-independent bounds and the strong scaling bounds generalize to other algorithms.
KW - Communication-avoiding algorithms
KW - Fast matrix multiplication
KW - Strong scaling
UR - http://www.scopus.com/inward/record.url?scp=84864146488&partnerID=8YFLogxK
U2 - 10.1145/2312005.2312021
DO - 10.1145/2312005.2312021
M3 - Conference contribution
AN - SCOPUS:84864146488
SN - 9781450312134
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 77
EP - 79
BT - SPAA'12 - Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures
Y2 - 25 June 2012 through 27 June 2012
ER -