Turning down the noise in the blogosphere

Khalid El-Arini*, Gaurav Veda, Dafna Shahaf, Carlos Guestrin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

107 Scopus citations

Abstract

In recent years, the blogosphere has experienced a substantial increase in the number of posts published daily, forcing users to cope with information overload. The task of guiding users through this flood of information has thus become critical. To address this issue, we present a principled approach for picking a set of posts that best covers the important stories in the blogosphere. We define a simple and elegant notion of coverage and formalize it as a submodular optimization problem, for which we can efficiently compute a near-optimal solution. In addition, since people have varied interests, the ideal coverage algorithm should incorporate user preferences in order to tailor the selected posts to individual tastes. We define the problem of learning a personalized coverage function by providing an appropriate user-interaction model and formalizing an online learning framework for this task. We then provide a no-regret algorithm which can quickly learn a user's preferences from limited feedback. We evaluate our coverage and personalization algorithms extensively over real blog data. Results from a user study show that our simple coverage algorithm does as well as most popular blog aggregation sites, including Google Blog Search, Yahoo! Buzz, and Digg. Furthermore, we demonstrate empirically that our algorithm can successfully adapt to user preferences. We believe that our technique, especially with personalization, can dramatically reduce information overload.
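
The greedy selection idea behind the coverage formulation can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not taken from the paper: each post is a plain set of features, and coverage is the number of distinct features covered by the selected posts. The names `greedy_cover` and `posts`, and the toy data, are hypothetical. Set coverage of this kind is monotone submodular, and for such functions greedy selection under a cardinality constraint is known to reach at least a 1 - 1/e fraction of the optimal value. The paper's own coverage function and its personalization step are not reproduced here.

```python
from typing import Dict, List, Set


def greedy_cover(posts: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick up to k posts, each time adding the post that covers the most new features.

    Simplifying assumption (not from the paper): a post is a set of features
    and coverage is the size of the union of the selected sets.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    for _ in range(k):
        best, best_gain = None, 0
        for post_id in sorted(posts):  # sorted for deterministic tie-breaking
            if post_id in chosen:
                continue
            gain = len(posts[post_id] - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = post_id, gain
        if best is None:  # no remaining post adds new coverage
            break
        chosen.append(best)
        covered |= posts[best]
    return chosen


# Hypothetical toy data: post id -> features (e.g. topics or named entities).
posts = {
    "a": {"election", "economy"},
    "b": {"economy"},
    "c": {"sports", "olympics", "doping"},
    "d": {"election", "sports"},
}
selected = greedy_cover(posts, 2)
print(selected)
```

With this toy input, the first pick is the post with the largest feature set, and the second is whichever remaining post adds the most features not already covered, which is why a post overlapping with the first pick is passed over.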

Original language: English
Title of host publication: KDD '09
Subtitle of host publication: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Pages: 289-297
Number of pages: 9
State: Published - 2009
Externally published: Yes
Event: 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09 - Paris, France
Duration: 28 Jun 2009 to 1 Jul 2009

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09
Country/Territory: France
City: Paris
Period: 28/06/09 to 1/07/09

Keywords

  • Algorithms
  • Experimentation
