Instability in parallel job scheduling simulation: The role of workload flurries

Dan Tsafrir*, Dror G. Feitelson

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

33 Scopus citations

Abstract

The performance of computer systems depends, among other things, on the workload. This motivates the use of real workloads (as recorded in activity logs) to drive simulations of new designs. Unfortunately, real workloads may contain various anomalies that contaminate the data. A previously unrecognized type of anomaly is workload flurries: rare surges of activity with a repetitive nature, caused by a single user, that dominate the workload for a relatively short period. We find that long workloads often include at least one such event. We show that in the context of parallel job scheduling these events can have a significant effect on performance evaluation results, e.g. a very small perturbation of the simulation conditions might lead to a large and disproportional change in the outcome. This instability is due to jobs in the flurry being effected in unison, a consequence of the flurry's repetitive nature. We therefore advocate that flurries be filtered out before the workload is used, in order to achieve stable and more reliable evaluation results (analogously to the removal of outliers in statistical analysis). At the same time, we note that more research is needed on the possible effects of flurries.

Original languageAmerican English
Title of host publication20th International Parallel and Distributed Processing Symposium, IPDPS 2006
PublisherIEEE Computer Society
ISBN (Print)1424400546, 9781424400546
DOIs
StatePublished - 2006
Event20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006 - Rhodes Island, Greece
Duration: 25 Apr 200629 Apr 2006

Publication series

Name20th International Parallel and Distributed Processing Symposium, IPDPS 2006
Volume2006

Conference

Conference20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006
Country/TerritoryGreece
CityRhodes Island
Period25/04/0629/04/06

Fingerprint

Dive into the research topics of 'Instability in parallel job scheduling simulation: The role of workload flurries'. Together they form a unique fingerprint.

Cite this