Abstract
Traditional query answering returns all answers T to a given query. When T is large, the user may be interested in viewing only a smaller subset S of T. Previous work has focused on finding subsets S that are diverse, i.e., such that any two items s, s' in S are very different from one another. This paper focuses on a complementary problem, namely finding subsets that are highly representative of the entire set of query results. Intuitively, a representative subset S is similar, in values and proportionality, to the entire set T. Finding such a representative set is challenging, both conceptually and in practice. This paper proposes a novel method of choosing a representative subset, called SimSTV, which draws inspiration from the field of voting theory. An efficient algorithm is presented, which overcomes and leverages the many differences between choosing answers in a database and voting in a real-life election. We also provide extensions to our algorithm, e.g., to accommodate affirmative action. Experimental results show the effectiveness of our algorithm.
Original language | English |
---|---|
Title of host publication | SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data |
Publisher | Association for Computing Machinery |
Pages | 1741-1754 |
Number of pages | 14 |
ISBN (Electronic) | 9781450392495 |
State | Published - 10 Jun 2022 |
Event | 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 - Virtual, Online, United States. Duration: 12 Jun 2022 → 17 Jun 2022 |
Publication series
Name | Proceedings of the ACM SIGMOD International Conference on Management of Data |
---|---|
ISSN (Print) | 0730-8078 |
Conference
Conference | 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 12/06/22 → 17/06/22 |
Bibliographical note
Publisher Copyright: © 2022 ACM.
Keywords
- diversity
- query answering
- representatives