Abstract
Numerous important applications rely on detailed trajectory data. Yet trajectory datasets are typically sparse, with large spatial and temporal gaps between consecutive points, which is a major hurdle for the accuracy of these applications. This paper presents KAmel, a scalable trajectory imputation system that inserts additional realistic trajectory points, boosting the accuracy of trajectory applications. KAmel maps the trajectory imputation problem to the missing-word problem, a classical problem in the natural language processing (NLP) community. This allows employing the widely used BERT model for trajectory imputation. However, BERT, as is, does not lend itself to the special characteristics of trajectories. Hence, KAmel starts from BERT, but then adds spatial awareness to its operations, adjusts trajectory data to be closer to the nature of language data, and adds a multipoint imputation ability, all encapsulated in one system. Experimental results based on real datasets show that KAmel significantly outperforms its competitors and is applicable to city-scale trajectories, large gaps, and tight accuracy thresholds.
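To make the mapping from trajectories to the missing-word problem concrete, the following minimal sketch turns a sparse trajectory into a token "sentence" with a mask placed at each large temporal gap. It is an illustration only, not the KAmel implementation: the square grid, the `cell_token` and `to_masked_sentence` helpers, the `cell_deg` and `max_gap_s` parameters, and the use of a single `[MASK]` per gap are all assumptions made here for the example.

```python
import math


def cell_token(lat, lon, cell_deg=0.001):
    """Map a GPS point to a discrete grid-cell token.

    Assumption: a plain square latitude/longitude grid with cells of
    `cell_deg` degrees; the abstract does not say how space is discretized.
    """
    row = math.floor(lat / cell_deg)
    col = math.floor(lon / cell_deg)
    return f"c{row}_{col}"


def to_masked_sentence(points, cell_deg=0.001, max_gap_s=60, mask="[MASK]"):
    """Convert a time-sorted list of (timestamp_s, lat, lon) into tokens.

    A mask token is inserted between two consecutive points whose time
    difference exceeds `max_gap_s` (a hypothetical threshold). A masked
    language model could then be asked to predict the token at each mask.
    """
    tokens = []
    prev_t = None
    for t, lat, lon in points:
        if prev_t is not None and t - prev_t > max_gap_s:
            tokens.append(mask)  # one placeholder per gap in this sketch
        tokens.append(cell_token(lat, lon, cell_deg))
        prev_t = t
    return tokens


if __name__ == "__main__":
    # Three readings; the last one arrives after a long silence.
    trajectory = [(0, 10.5, 20.5), (30, 10.6, 20.7), (200, 12.2, 23.9)]
    print(to_masked_sentence(trajectory, cell_deg=1.0))
```

In this sketch each gap receives exactly one mask, so it only illustrates single-point imputation; the multipoint imputation and spatial awareness that the abstract attributes to KAmel are not modeled here.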
Original language | English (US) |
---|---|
Pages (from-to) | 523-538 |
Number of pages | 16 |
Journal | Proceedings of the VLDB Endowment |
Volume | 17 |
Issue number | 3 |
DOIs | |
State | Published - 2023 |
Event | 50th International Conference on Very Large Data Bases, VLDB 2024 - Guangzhou, China; Duration: Aug 24 2024 → Aug 29 2024 |
Bibliographical note
Publisher Copyright: © 2023, VLDB Endowment. All rights reserved.