TY - CONF
T1 - On extracting session data from activity logs
AU - Mehrzadi, David
AU - Feitelson, Dror G.
PY - 2012
Y1 - 2012
AB - Activity logs from large-scale systems facilitate the study of user behavior, which can be used to improve and tune the user experience. However, the available data often lacks important elements such as the identification of user sessions. Previous work typically compensated for this by setting a threshold of around 30 minutes, and assuming that breaks in activity longer than the threshold reflect breaks between sessions. We show that using such a global threshold introduces artifacts that may affect the analysis, because there is a high probability that long sessions are not identified correctly. As an alternative, we suggest that a suitable individual threshold be found for each user, based on that user's activity pattern. Applying this approach to a large dataset from the AOL search engine leads to a distribution of session durations that is free of artifacts like those that appear when using a global threshold.
KW - activity log
KW - session
KW - user behavior
UR - http://www.scopus.com/inward/record.url?scp=84867553352&partnerID=8YFLogxK
U2 - 10.1145/2367589.2367592
DO - 10.1145/2367589.2367592
M3 - Conference contribution
AN - SCOPUS:84867553352
SN - 9781450314480
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 5th Annual International Systems and Storage Conference, SYSTOR'12
T2 - 5th Annual International Systems and Storage Conference, SYSTOR 2012
Y2 - 4 June 2012 through 6 June 2012
ER -