On simulation and design of parallel-systems schedulers: Are we doing the right thing?

Edi Shmueli*, Dror G. Feitelson

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

It is customary to use open-system trace-driven simulations to evaluate the performance of parallel-system schedulers. As a consequence, all schedulers have evolved to optimize the packing of jobs in the schedule, as a means to improve a number of performance metrics that are conjectured to be correlated with user satisfaction, with the premise that this will result in higher productivity in reality. We argue that these simulations suffer from severe limitations that lead to suboptimal scheduler designs and even to the dismissal of potentially good design alternatives. We propose an alternative simulation methodology called site-level simulation, in which the workload for the evaluation is generated dynamically by user models that interact with the system. We present a novel scheduler called CREASY that exploits knowledge of user behavior to directly improve user satisfaction, and we compare its performance to that of the original packing-based EASY scheduler. We show that user productivity improves by up to 50 percent under the user-aware design, while according to the conventional metrics, performance may actually degrade.
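
The contrast between an open-system trace replay and a workload driven by user feedback can be illustrated with a small sketch. This is not the paper's simulator, its user model, or CREASY; it assumes a single machine that runs one job at a time, a single user, and a fixed think time, and every name in it (`simulate_open`, `simulate_feedback`, `speedup`, `think_time`) is hypothetical.

```python
def simulate_open(jobs, speedup=1.0):
    """Open model (assumed, simplified): submit times are replayed from a
    trace and do not react to how fast the system serves jobs.

    jobs: list of (submit_time, runtime) tuples, sorted by submit_time.
    Returns a list of (submit_time, end_time) tuples.
    """
    free_at = 0.0  # time at which the single machine becomes idle
    finished = []
    for submit, runtime in jobs:
        start = max(submit, free_at)      # wait in a FCFS queue if busy
        free_at = start + runtime / speedup
        finished.append((submit, free_at))
    return finished


def simulate_feedback(runtimes, think_time, speedup=1.0):
    """Feedback model (assumed, simplified): one user submits the next job
    only think_time after the previous job has completed.

    Returns a list of (submit_time, end_time) tuples.
    """
    now = 0.0
    finished = []
    for runtime in runtimes:
        submit = now
        end = submit + runtime / speedup
        finished.append((submit, end))
        now = end + think_time            # the user reacts to the completion
    return finished


if __name__ == "__main__":
    trace = [(0.0, 10.0), (5.0, 10.0), (30.0, 10.0)]
    for s in (1.0, 2.0):
        print("open    ", s, simulate_open(trace, speedup=s))
        print("feedback", s, simulate_feedback([10.0, 10.0, 10.0], 5.0, speedup=s))
```

In the open model the submit times stay the same no matter how fast the system is, so only per-job metrics change. In the feedback model a faster system pulls later submissions forward, so the number of jobs completed in a given time window changes too, which is the kind of effect a replayed trace cannot show.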

Original language: American English
Pages (from-to): 983-996
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 20
Issue number: 7
State: Published - 2009

Bibliographical note

Funding Information:
This research was supported in part by the Israel Science Foundation (Grant 167/03). Many thanks are due to the people and organizations that deposited their workload logs in the Parallel Workloads Archive and made this research possible.

Keywords

  • Feedback
  • Open-system model
  • Parallel job scheduling
  • Trace-driven simulations
  • User behavior
