Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication

Ariful Azad, Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams

Research output: Contribution to journal › Article › peer-review

67 Scopus citations


Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first implementation of the 3D SpGEMM formulation that exploits multiple (intranode and internode) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrencies. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
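As a rough illustration of the primitive the abstract refers to, the sketch below multiplies two sparse matrices row by row and then repeats the product with the inner dimension split into layers whose partial results are summed, which is the general idea behind a 3D decomposition. It is a serial toy under stated assumptions, not the authors' implementation: the dict-of-dicts storage, the function names (`spgemm`, `split_inner`, `merge`, `spgemm_layered`), and the `k mod layers` partitioning are all hypothetical choices made for this example, and no MPI or multithreading is involved.

```python
from collections import defaultdict


def spgemm(A, B):
    """Row-wise sparse product C = A * B.

    Matrices are stored as {row: {col: value}} dicts (an assumption of
    this sketch, chosen for brevity rather than performance).
    """
    C = {}
    for i, row in A.items():
        acc = defaultdict(float)  # sparse accumulator for row i of C
        for k, a_ik in row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C


def split_inner(A, B, layers):
    """Partition the inner dimension k into `layers` pieces (k mod layers).

    Layer l receives the columns of A and the rows of B whose index k
    falls in that layer.
    """
    parts = []
    for l in range(layers):
        A_l = {i: {k: v for k, v in row.items() if k % layers == l}
               for i, row in A.items()}
        B_l = {k: row for k, row in B.items() if k % layers == l}
        parts.append((A_l, B_l))
    return parts


def merge(partials):
    """Sum partial products entry by entry (the reduction across layers)."""
    C = {}
    for P in partials:
        for i, row in P.items():
            out = C.setdefault(i, {})
            for j, v in row.items():
                out[j] = out.get(j, 0.0) + v
    return C


def spgemm_layered(A, B, layers=2):
    """Compute A * B as a sum of per-layer partial products."""
    return merge(spgemm(A_l, B_l) for A_l, B_l in split_inner(A, B, layers))


if __name__ == "__main__":
    A = {0: {0: 1.0, 1: 2.0}, 1: {2: 3.0}}
    B = {0: {1: 4.0}, 1: {0: 5.0, 1: 6.0}, 2: {2: 7.0}}
    print(spgemm(A, B))
    print(spgemm_layered(A, B, layers=2))
```

In a distributed setting each layer would run on its own group of processes and the merge step would become communication between them; this sketch only shows that the layered product agrees with the direct one.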

Original language: American English
Pages (from-to): C624-C651
Journal: SIAM Journal on Scientific Computing
Issue number: 6
State: Published - 2016

Bibliographical note

Publisher Copyright:
© 2016 Society for Industrial and Applied Mathematics.


Keywords

  • 2.5D algorithms
  • 2D decomposition
  • 3D algorithms
  • Graph algorithms
  • Multithreading
  • Numerical linear algebra
  • Parallel computing
  • SpGEMM
  • Sparse matrix-matrix multiplication

