Abstract
Assigning semantically relevant, real-world locations to documents opens new possibilities for geographic information retrieval. We propose a novel approach to automatically determine the latitude-longitude coordinates of appropriate Wikipedia articles with high accuracy, leveraging both text and metadata in the corpus. By examining articles whose ground-truth coordinates are known, we show that our method substantially improves on state-of-the-art methods. We subsequently demonstrate how our approach could yield two benefits: (1) detecting significant geolocation errors in Wikipedia; and (2) proposing approximate coordinates for hundreds of thousands of articles that are not traditionally considered to be locations (such as events, ideas or people), opening new possibilities for conceptual geographic retrieval over Wikipedia.
Original language | English |
---|---|
Title of host publication | ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 |
Publisher | Association for Computing Machinery, Inc |
Pages | 3331-3341 |
Number of pages | 11 |
ISBN (Electronic) | 9781450394161 |
State | Published - 30 Apr 2023 |
Event | 2023 World Wide Web Conference, WWW 2023 - Austin, United States. Duration: 30 Apr 2023 → 4 May 2023 |
Publication series
Name | ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 |
---|
Conference
Conference | 2023 World Wide Web Conference, WWW 2023 |
---|---|
Country/Territory | United States |
City | Austin |
Period | 30/04/23 → 4/05/23 |
Bibliographical note
Publisher Copyright: © 2023 Owner/Author.
Keywords
- Geographic Information Retrieval
- Geolocation
- Wikipedia