Fast matrix multiplication algorithms are of practical use only if the leading coefficient of their arithmetic complexity is sufficiently small. Many algorithms with low asymptotic cost have large leading coefficients, and are thus impractical. Karstadt and Schwartz have recently demonstrated a technique that reduces the leading coefficient by introducing fast $O(n^2 \log n)$ basis transformations, applied to the input and output matrices. We generalize their technique by allowing larger bases for the transformations while maintaining low overhead. Thus we accelerate several matrix multiplication algorithms, beyond what is known to be possible using the previous technique. Of particular interest are a few new sub-cubic algorithms with leading coefficient 2, matching that of classical matrix multiplication. For example, we obtain an algorithm with arithmetic complexity of $2n^{\log_3 23} + o(n^{\log_3 23})$, compared to $2n^3 - n^2$ for the classical algorithm. Such new algorithms can outperform previous ones (classical included) even on relatively small matrices. We obtain lower bounds matching the coefficient of several of our algorithms, proving them to be optimal.
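To make the comparison in the abstract concrete, the following minimal sketch tabulates the classical operation count $2n^3 - n^2$ against the leading term $2n^{\log_3 23}$ of the new algorithm for matrix sizes $n = 3^k$, where $n^{\log_3 23} = 23^k$ exactly. The function names are hypothetical and chosen for illustration. The sketch deliberately ignores the $o(n^{\log_3 23})$ lower-order term, which the abstract does not specify, so the crossover it reports is only an optimistic estimate and not a result from the paper.

```python
def classical_ops(n: int) -> int:
    """Arithmetic operations of classical n x n matrix multiplication: 2n^3 - n^2."""
    return 2 * n**3 - n**2


def fast_leading_term(k: int) -> int:
    """Leading term 2 * n^(log_3 23) for n = 3^k, computed exactly as 2 * 23^k.

    Assumption: the o(n^(log_3 23)) lower-order term is dropped, because the
    abstract does not give it. The true cost is therefore somewhat higher.
    """
    return 2 * 23**k


def first_k_where_leading_term_wins(max_k: int = 20):
    """Smallest k >= 1 with 2 * 23^k < 2n^3 - n^2 for n = 3^k, or None if not found."""
    for k in range(1, max_k + 1):
        if fast_leading_term(k) < classical_ops(3**k):
            return k
    return None


if __name__ == "__main__":
    print(f"{'n':>6} {'classical':>14} {'leading term':>14}")
    for k in range(1, 7):
        n = 3**k
        print(f"{n:>6} {classical_ops(n):>14} {fast_leading_term(k):>14}")
    print("first k where the leading term is smaller:",
          first_k_where_leading_term_wins())
```

Under this simplification the leading term is already below the classical count at $n = 9$ (1058 versus 1377 operations), which is consistent with the abstract's claim about relatively small matrices; the actual crossover depends on the omitted lower-order term.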
|Original language||American English|
|Title of host publication||SPAA 2019 - Proceedings of the 31st ACM Symposium on Parallelism in Algorithms and Architectures|
|Publisher||Association for Computing Machinery|
|Number of pages||12|
|State||Published - 17 Jun 2019|
|Event||31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019 - Phoenix, United States|
|Duration||22 Jun 2019 → 24 Jun 2019|
|Name||Annual ACM Symposium on Parallelism in Algorithms and Architectures|
|Conference||31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019|
|Period||22/06/19 → 24/06/19|
Bibliographical note
Publisher Copyright:
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Keywords
- Bilinear Algorithms
- Fast Matrix Multiplication