TY - GEN
T1 - On identifying user session boundaries in parallel workload logs
AU - Zakay, Netanel
AU - Feitelson, Dror G.
PY - 2013
Y1 - 2013
AB - The stream of jobs submitted to a parallel supercomputer is actually the interleaving of many streams from different users, each of which is composed of sessions. Identifying and characterizing the sessions is important in the context of workload modeling, especially if a user-based workload model is considered. Traditionally, sessions have been delimited by long think times, that is, by intervals of more than, say, 20 minutes from the termination of one job to the submittal of the next job. We show that such a definition is problematic in this context, because jobs may be extremely long. As a result of including each job's execution in the session, we may get unrealistically long sessions, and indeed, users most probably do not always stay connected and wait for the termination of long jobs. We therefore suggest that sessions be identified based on proven user activity, namely the submittal of new jobs, regardless of how long they run.
UR - http://www.scopus.com/inward/record.url?scp=84872584577&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-35867-8_12
DO - 10.1007/978-3-642-35867-8_12
M3 - Conference contribution
AN - SCOPUS:84872584577
SN - 9783642358661
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 216
EP - 234
BT - Job Scheduling Strategies for Parallel Processing - 16th International Workshop, JSSPP 2012, Revised Selected Papers
PB - Springer Verlag
T2 - 16th Workshop on Job Scheduling Strategies for Parallel Processing, JSSPP 2012
Y2 - 25 May 2012
ER -