On extracting session data from activity logs

David Mehrzadi*, Dror G. Feitelson

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations


Activity logs from large-scale systems facilitate the study of user behavior, which can be used to improve and tune the user experience. However, the available data often lacks important elements such as the identification of user sessions. Previous work typically compensated for this by setting a threshold of around 30 minutes, and assuming that breaks in activity longer than the threshold reflect breaks between sessions. We show that using such a global threshold introduces artifacts that may affect the analysis, because there is a high probability that long sessions are not identified correctly. As an alternative, we suggest that a suitable individual threshold be found for each user, based on that user's activity pattern. Applying this approach to a large dataset from the AOL search engine leads to a distribution of session durations that is free of artifacts like those that appear when using a global threshold.
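The contrast between a global threshold and a per-user threshold can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the abstract does not say how the individual threshold is derived, so `user_threshold` below uses a stand-in heuristic (the largest multiplicative jump between consecutive sorted inter-event gaps). All function names, the `(user, timestamp)` log format, and the 1800-second default are assumptions made for this example.

```python
import math
from collections import defaultdict


def split_sessions(timestamps, threshold):
    """Split sorted timestamps (in seconds) wherever a gap exceeds threshold."""
    sessions = []
    current = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > threshold:
            sessions.append(current)
            current = [cur]
        else:
            current.append(cur)
    sessions.append(current)
    return sessions


def user_threshold(timestamps, default=1800.0):
    """Hypothetical per-user rule (an assumption, not the paper's algorithm).

    Sort the user's positive inter-event gaps and place the threshold at the
    geometric mean of the two adjacent gaps with the largest ratio. Fall back
    to the default when there are too few gaps to compare.
    """
    gaps = sorted(b - a for a, b in zip(timestamps, timestamps[1:]) if b > a)
    if len(gaps) < 2:
        return default
    best_ratio, best_threshold = 1.0, default
    for lo, hi in zip(gaps, gaps[1:]):
        ratio = hi / lo
        if ratio > best_ratio:
            best_ratio, best_threshold = ratio, math.sqrt(lo * hi)
    return best_threshold


def sessionize(log, per_user=True, global_threshold=1800.0):
    """Group (user, timestamp) records by user and split each into sessions."""
    by_user = defaultdict(list)
    for user, ts in log:
        by_user[user].append(ts)
    result = {}
    for user, stamps in by_user.items():
        stamps.sort()
        if per_user:
            threshold = user_threshold(stamps, global_threshold)
        else:
            threshold = global_threshold
        result[user] = split_sessions(stamps, threshold)
    return result
```

With a log in which one user pauses for 38 minutes in the middle of otherwise continuous activity, the 30-minute global threshold cuts that activity into two sessions, while the per-user rule in this sketch keeps it together. That is the kind of misidentified long session the abstract warns about.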

Original language: American English
Title of host publication: Proceedings of the 5th Annual International Systems and Storage Conference, SYSTOR'12
State: Published - 2012
Event: 5th Annual International Systems and Storage Conference, SYSTOR 2012 - Haifa, Israel
Duration: 4 Jun 2012 to 6 Jun 2012

Publication series

Name: ACM International Conference Proceeding Series


Conference: 5th Annual International Systems and Storage Conference, SYSTOR 2012


  • activity log
  • session
  • user behavior

