Abstract
This article investigates the problem of geosocial similarity among users of online social networks, based on the locations of their activities (e.g., posting messages or photographs). Finding pairs of geosocially similar users or detecting that two sets of locations (of activities) belong to the same user has important applications in privacy protection, recommendation systems, urban planning, and public health, among others. It is explained and shown empirically that common distance measures between sets of locations are inadequate for determining geosocial similarity. Two novel distance measures between sets of locations are introduced. One is the mutually nearest distance that is based on computing a matching between two sets. The second measure uses a quad-tree index. It is highly scalable but incurs the overhead of creating and maintaining the index. Algorithms with optimization techniques are developed for computing the two distance measures and also for finding the k-most-similar users of a given one. Extensive experiments, using geotagged messages from Twitter, show that the new distance measures are both more accurate and more efficient than existing ones.
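As a point of reference for the "common distance measures" mentioned above, the following minimal sketch computes the Hausdorff distance (listed among the keywords) between two sets of locations. It is not the article's method and does not implement either of the two new measures. The planar coordinates, the sample points, and the function names `directed_hausdorff` and `hausdorff` are illustrative assumptions; real geotagged data would need a geodesic distance instead of `math.dist`. The example shows one well-known property of this measure, offered here as an illustration rather than as the article's own argument: a single far-away location dominates the result.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical activity locations in planar units (an assumption for illustration).
home = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# The same locations plus one distant post, e.g. from a single trip.
same_plus_trip = home + [(100.0, 0.0)]

d_same = hausdorff(home, home)            # identical sets
d_trip = hausdorff(home, same_plus_trip)  # one outlier added
print(d_same, d_trip)
```

In this toy data the two sets share three of four locations, yet the distance is set entirely by the one outlying point.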
Original language | English |
---|---|
Article number | 17 |
Journal | ACM Transactions on the Web |
Volume | 11 |
Issue number | 3 |
State | Published - Jul 2017 |
Bibliographical note
Publisher Copyright: © 2017 ACM.
Keywords
- Earth mover's distance
- Geosocial networks
- Geosocial similarity
- Geospatial similarity
- Geotagged posts
- Hausdorff distance
- Set distance
- Social media
- Sociospatial analysis