RDMA-Based Library for Collective Operations in MPI

Alexander Margolin, Amnon Barak

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In most MPI implementations, abstraction layers separate the collective operation algorithms from the communication primitives, thus hindering their optimization with network acceleration technologies such as RDMA. Open UCX is an RDMA-based point-to-point communication library that can reduce the latency between processes in MPI applications, particularly in large-scale systems. This paper presents a design and implementation of a library for MPI collective operations that extends Open UCX. Our approach is transparent to MPI applications and can reduce the latency of repeated calls to such operations by an average of 8% for relatively small message sizes and by as much as 90% for larger messages.

Original language: English
Title of host publication: Proceedings of ExaMPI 2019
Subtitle of host publication: Workshop on Exascale MPI - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 39-46
Number of pages: 8
ISBN (Electronic): 9781728160092
State: Published - Nov 2019
Event: 2019 IEEE/ACM Workshop on Exascale MPI, ExaMPI 2019 - Denver, United States
Duration: 17 Nov 2019 → …

Publication series

Name: Proceedings of ExaMPI 2019: Workshop on Exascale MPI - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2019 IEEE/ACM Workshop on Exascale MPI, ExaMPI 2019
Country/Territory: United States
City: Denver
Period: 17/11/19 → …

Bibliographical note

Publisher Copyright:
© 2019 IEEE.
