Abstract
Matrix multiplication is one of the most extensively used kernels in scientific computing. Although subcubic algorithms exist, most high-performance implementations are based on the classical $\Theta(n^3)$ matrix multiplication. Designing an algorithm that obtains even modest improvements in performance over existing implementations requires carefully addressing challenges such as reducing computation costs, communication costs, and memory footprint. We provide the first high-performance general matrix-matrix multiplication that utilizes the alternative basis method on Strassen's algorithm. We reduce the basis transformation overheads and decrease the memory footprint of the bilinear phase by using the pebbling game optimization scheme, consequently improving both arithmetic and communication costs. Our algorithm outperforms DGEMM on feasible matrix dimensions starting at $n = 96$. Its speedup grows with the matrix dimension, reaching nearly $2\times$ for larger dimensions when running sequentially, and even larger speedups for certain matrix dimensions when running in parallel.
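For readers unfamiliar with the bilinear recursion the abstract builds on, the following is a minimal sketch of the textbook Strassen algorithm, which replaces the eight block products of the classical method with seven. It is not the alternative basis variant described above and includes neither the basis transformations nor the pebbling game optimization. The function names, the list-of-lists representation, and the `cutoff` parameter are assumptions made for illustration only.

```python
def classical_matmul(A, B):
    """Classical Theta(n^3) product of two square matrices (lists of lists)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Entrywise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split an even-dimension square matrix into its four quadrants."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Textbook Strassen recursion (hypothetical helper, illustrative only).

    Falls back to the classical product when the dimension is at most
    `cutoff` or is odd, so no padding is needed.
    """
    n = len(A)
    if n <= cutoff or n % 2:
        return classical_matmul(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive block products instead of eight.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    # Recombine the products into the four quadrants of C = A * B.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

In practice the recursion is stopped well above dimension 2 and the base case is handed to a tuned classical kernel; the small default `cutoff` here is chosen only to exercise the recursion on small inputs.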
Original language | English |
---|---|
Pages (from-to) | 277-303 |
Number of pages | 27 |
Journal | SIAM Journal on Scientific Computing |
Volume | 45 |
Issue number | 6 |
State | Published - 2023 |
Bibliographical note
Publisher Copyright: © 2023 Oded Schwartz and Noa Vaknin.
Keywords
- alternative basis method
- bilinear algorithm
- fast matrix multiplication
- pebbling game