Improving and stabilizing parallel computer performance using adaptive backfilling

David Talby*, Dror G. Feitelson

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

26 Scopus citations

Abstract

The scheduler is a key component in determining the overall performance of a parallel computer, and as we show here, the schedulers in wide use today exhibit large unexplained gaps in performance during their operation. Also, different scheduling algorithms often vary in the gaps they show, suggesting that choosing the correct scheduler for each time frame can improve overall performance. We present two adaptive algorithms that achieve this: One chooses by recent past performance, and the other by the recent average degree of parallelism, which is shown to be correlated to algorithmic superiority. Simulation results for the algorithms on production workloads are analyzed, and illustrate unique features of the chaotic temporal structure of parallel workloads. We provide best parameter configurations for each algorithm, which both achieve average improvements of 10% in performance and 35% in stability for the tested workloads.

Original language: English
Title of host publication: Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
Pages: 84a
DOIs
State: Published - 2005
Event: 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005 - Denver, CO, United States
Duration: 4 Apr 2005 to 8 Apr 2005

Publication series

Name: Proceedings - 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
Volume: 2005

Conference

Conference: 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005
Country/Territory: United States
City: Denver, CO
Period: 4/04/05 to 8/04/05
