Communication-optimal parallel algorithm for Strassen's matrix multiplication

Grey Ballard*, James Demmel, Olga Holtz, Benjamin Lipshitz, Oded Schwartz

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

90 Scopus citations

Abstract

Parallel matrix multiplication is one of the most studied fundamental problems in distributed and high performance computing. We obtain a new parallel algorithm that is based on Strassen's fast matrix multiplication and minimizes communication. The algorithm outperforms all known parallel matrix multiplication algorithms, classical and Strassen-based, both asymptotically and in practice. A critical bottleneck in parallelizing Strassen's algorithm is the communication between the processors. Ballard, Demmel, Holtz, and Schwartz (SPAA '11) prove lower bounds on these communication costs, using expansion properties of the underlying computation graph. Our algorithm matches these lower bounds, and so is communication-optimal. It exhibits perfect strong scaling within the maximum possible range. Benchmarking our implementation on a Cray XT4, we obtain speedups over classical and Strassen-based algorithms ranging from 24% to 184% for a fixed matrix dimension n = 94080, where the number of processors ranges from 49 to 7203. Our parallelization approach generalizes to other fast matrix multiplication algorithms.
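For orientation, the recursion that the abstract builds on can be sketched as follows. This is a minimal sequential sketch of Strassen's classic seven-multiplication scheme, not the parallel algorithm of the paper, which concerns how such a recursion is distributed across processors. The helper names (`mat_add`, `mat_sub`, `split`, `strassen`) are hypothetical, and the code assumes square inputs whose dimension is a power of two.

```python
def mat_add(X, Y):
    """Entrywise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split a matrix of even dimension into its four quadrants."""
    h = len(X) // 2
    return (
        [row[:h] for row in X[:h]],  # top-left
        [row[h:] for row in X[:h]],  # top-right
        [row[:h] for row in X[h:]],  # bottom-left
        [row[h:] for row in X[h:]],  # bottom-right
    )


def strassen(A, B):
    """Multiply square matrices A and B (assumed power-of-two dimension)
    using seven recursive products instead of the classical eight."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    # The seven half-size products of Strassen's scheme.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22))
    M2 = strassen(mat_add(A21, A22), B11)
    M3 = strassen(A11, mat_sub(B12, B22))
    M4 = strassen(A22, mat_sub(B21, B11))
    M5 = strassen(mat_add(A11, A12), B22)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12))
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22))

    # Recombine the products into the quadrants of C = A * B.
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

For example, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. Each level of the recursion replaces one multiplication of size n with seven of size n/2, which gives the well-known O(n^(log2 7)) arithmetic cost.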

Original language: American English
Title of host publication: SPAA'12 - Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures
Pages: 193-204
Number of pages: 12
State: Published - 2012
Externally published: Yes
Event: 24th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'12 - Pittsburgh, PA, United States
Duration: 25 Jun 2012 – 27 Jun 2012

Publication series

Name: Annual ACM Symposium on Parallelism in Algorithms and Architectures

Conference

Conference: 24th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA'12
Country/Territory: United States
City: Pittsburgh, PA
Period: 25/06/12 – 27/06/12

Keywords

  • Communication-avoiding algorithms
  • Fast matrix multiplication
  • Parallel algorithms
