Using multicast to pre-load jobs on the ParPar cluster

Avi Kavas, David Er-El, Dror G. Feitelson

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

The ParPar system is a high-performance cluster environment supporting a multiuser parallel workload. Its design follows a master-nodes structure, where the master controls all aspects of system activity using a dedicated control network. As nearly all control messages are multicast to a set of nodes, we implemented a reliable multicast protocol for this network based on UDP. This was then used to pre-load executable files to the nodes, rather than using demand paging via NFS. Such pre-loading leads to significant reductions in job startup times in most cases. It is also more scalable than an asymmetrical hardware approach giving the master higher bandwidth, which can be used for small clusters.
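The scalability claim in the abstract can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the paper: it assumes an idealized network with no packet loss or protocol overhead, and the function names, file size, and link speeds are hypothetical values chosen only for illustration. It compares sending a separate copy of an executable to each node over a faster master link (the asymmetrical hardware approach) with multicasting a single copy over an ordinary link.

```python
def unicast_load_time(size_mb, nodes, master_bw_mbps):
    """Seconds for the master to send one separate copy to each node.

    Assumes the master's outgoing link is the only bottleneck.
    """
    return nodes * size_mb * 8 / master_bw_mbps


def multicast_load_time(size_mb, link_bw_mbps):
    """Seconds to multicast a single copy that all nodes receive at once.

    Assumes no loss, so no retransmissions are needed.
    """
    return size_mb * 8 / link_bw_mbps


if __name__ == "__main__":
    size_mb = 10          # hypothetical executable size, in megabytes
    node_link = 100       # hypothetical node link speed, in Mb/s
    master_link = 1000    # hypothetical upgraded master link, in Mb/s

    for nodes in (4, 16, 64):
        uni = unicast_load_time(size_mb, nodes, master_link)
        multi = multicast_load_time(size_mb, node_link)
        print(f"{nodes:3d} nodes: unicast {uni:.2f} s, multicast {multi:.2f} s")
```

Under these assumptions the unicast time grows linearly with the number of nodes while the multicast time stays constant, so a master link that is ten times faster only keeps pace up to about ten nodes. This is consistent with the abstract's remark that the hardware approach can be used for small clusters.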

Original language: English
Pages (from-to): 315-327
Number of pages: 13
Journal: Parallel Computing
Volume: 27
Issue number: 3
State: Published - Feb 2001

Bibliographical note

Funding Information:
This research was supported in part by The Ministry of Science Basic Infrastructure Fund, Project 9762, and by The Israel Science Foundation founded by the Israel Academy of Sciences & Humanities. The LANL job log data was graciously provided by Curt Canada, who also helped with its interpretation. The LLNL job log data was graciously provided by Moe Jette, who also helped with background information and interpretation.
