Abstract
Data mining techniques are used across many research domains to understand data flows, discover hidden knowledge, and improve quality of life. Sequential pattern mining, one such technique, has emerged in the transportation domain as a way to recognize the dynamic behavior of vehicles, trains, people, and so on. Its goal is to find the frequent subsequences in a database of sequences. Although many existing techniques can mine sequential patterns, they use timestamps only to order the events and ignore the time that elapses between them. This information is important in real applications such as traffic recommendation systems and transportation safety. Knowing that measurement Y occurs after measurement X is valuable, but it is more valuable to know the estimated time before measurement Y appears, for example to schedule maintenance at the right time and prevent railway damage. In this paper, we propose an algorithm called Minits (MINIng Timed Sequential patterns) that finds frequent sequential patterns and includes the transition time between events in these patterns. Approaches that depend on a serial architecture are no longer effective, however, because of the massive volumes of data frequently generated from different sources. We therefore exploit parallelism on multicore CPUs to improve the performance of Minits on big data. Extensive experiments on real and synthetic datasets are reported and show the significance and advantages of this approach. In addition, multicore execution outperforms single-core execution in running time on big data.
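To make the idea of a timed sequential pattern concrete, the following is a minimal illustrative sketch and not the Minits algorithm itself, whose details the abstract does not give. It assumes each sequence is a list of `(timestamp, event)` tuples, restricts patterns to ordered pairs of distinct events, and summarizes the transition time as a simple mean; the function name `timed_pairs` and the `min_support` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def timed_pairs(database, min_support):
    """Return {(a, b): (support, mean_gap)} for frequent ordered event pairs.

    Assumption: each sequence is a list of (timestamp, event) tuples.
    Support counts each sequence at most once per pair.
    """
    gaps = defaultdict(list)  # (a, b) -> one transition time per supporting sequence
    for sequence in database:
        ordered = sorted(sequence)  # order events by timestamp
        first_gap = {}
        # combinations() preserves order, so the first (a, b) found uses the
        # earliest occurrence of a and the earliest b that follows it.
        for (t1, a), (t2, b) in combinations(ordered, 2):
            if a != b and t1 < t2 and (a, b) not in first_gap:
                first_gap[(a, b)] = t2 - t1
        for pair, gap in first_gap.items():
            gaps[pair].append(gap)
    return {
        pair: (len(g), sum(g) / len(g))
        for pair, g in gaps.items()
        if len(g) >= min_support
    }


# Toy database: X is followed by Y in all three sequences.
database = [
    [(0, "X"), (5, "Y"), (9, "Z")],
    [(2, "X"), (9, "Y")],
    [(1, "Y"), (4, "X"), (10, "Y")],
]
print(timed_pairs(database, min_support=2))
```

In this toy database only the pair (X, Y) reaches the support threshold, and the result reports how long Y typically takes to appear after X in addition to the ordering. Because each sequence is processed independently, the outer loop is also the natural place to split work across cores, although how Minits actually parallelizes is not stated here.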
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 |
Editors | Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 3573-3582 |
Number of pages | 10 |
ISBN (Electronic) | 9781728108582 |
State | Published - Dec 2019 |
Event | 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States; Duration: Dec 9 2019 → Dec 12 2019 |
Publication series
Name | Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 |
---|---|
Conference
Conference | 2019 IEEE International Conference on Big Data, Big Data 2019 |
---|---|
Country/Territory | United States |
City | Los Angeles |
Period | 12/9/19 → 12/12/19 |
Bibliographical note
Funding Information: This work was partially supported by NSF grant #1302439 and BHGE grant #19-0069.
Publisher Copyright:
© 2019 IEEE.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
Keywords
- Sequential pattern mining
- Timed sequential patterns
- Single and multicore