TY - GEN
T1 - Workload sanitation for performance evaluation
AU - Feitelson, Dror G.
AU - Tsafrir, Dan
PY - 2006
Y1 - 2006
N2 - The performance of computer systems depends, among other things, on the workload. Performance evaluations are therefore often done using logs of workloads on current production systems, under the assumption that such real workloads are representative and reliable; likewise, workload modeling is typically based on real workloads. We show, however, that real workloads may also contain anomalies that make them non-representative and unreliable. This is a special case of multi-class workloads, where one class is the "real" workload which we wish to use in the evaluation, and the other class contaminates the log with "bogus" data. We provide several examples of this situation, including a previously unrecognized type of anomaly we call "workload flurries": surges of activity with a repetitive nature, caused by a single user, that dominate the workload for a relatively short period. Using a workload with such anomalies in effect emphasizes rare and unique events (e.g. occurring for a few days out of two years of logged data), and risks optimizing the design decision for the anomalous workload at the expense of the normal workload. Thus we claim that such anomalies should be removed from the workload before it is used in evaluations, and that ignoring them is actually an unjustifiable approach.
UR - http://www.scopus.com/inward/record.url?scp=33750843427&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33750843427
SN - 1424401860
SN - 9781424401864
T3 - ISPASS 2006: IEEE International Symposium on Performance Analysis of Systems and Software, 2006
SP - 221
EP - 230
BT - ISPASS 2006
T2 - ISPASS 2006: IEEE International Symposium on Performance Analysis of Systems and Software, 2006
Y2 - 19 March 2006 through 21 March 2006
ER -