Multidimensional visualization of transit smartcard data using space–time plots and data cubes

Ying Song, Yingling Fan, Xin Li, Yanjie Ji

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Given the wide application of automatic fare collection systems in transit systems across the globe, smartcard data with on- and/or off-boarding information has become a new source of data for understanding passenger flow patterns. This paper uses Nanjing, China as a case study and examines whether the data cube technique from data mining can reveal the space–time travel patterns of Nanjing rail transit users. One month of smartcard data, covering October 2013, was obtained from the Nanjing rail transit system, with a total of over 22 million transaction records. We define the original data cube for the smartcard data on four dimensions (Space, Date, Time, and User), design a hierarchy for each dimension, and use the total number of transactions as the quantitative measure. We develop modules in the Python programming language and share them as open source on GitHub to enable peer production and advancement in the field. Visualizations of two-dimensional slices of the data cube show patterns such as different travel behaviors across user groups (e.g. students vs. elders), and irregular peak hours during the National Holiday (October 1st–7th) compared with the regular morning and afternoon peak hours of ordinary working weeks. Spatially, multidimensional visualizations show concentrations of various activity opportunities near metro rail stations and the corresponding changes in station popularity through time. These findings support the feasibility and efficiency of the data cube technique as a means of visual exploratory analysis for massive smartcard data, and can contribute to the evaluation and planning of public transit systems.
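The data cube described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' published module: the sample records, the function names (`build_cube`, `slice_2d`, `roll_up_time`), and the three-hour time bins are all assumptions made for illustration. The sketch counts transactions per (Space, Date, Time, User) cell, projects the cube onto two dimensions, and climbs one level of a hypothetical time hierarchy.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (station, timestamp, user type).
# Values are illustrative only and are not taken from the Nanjing data set.
RECORDS = [
    ("Xinjiekou", "2013-10-01 08:15", "student"),
    ("Xinjiekou", "2013-10-01 08:40", "elder"),
    ("Xinjiekou", "2013-10-08 08:05", "student"),
    ("Gulou", "2013-10-08 17:30", "student"),
    ("Gulou", "2013-10-08 17:45", "adult"),
]

# Order of the four dimensions inside each cube key.
DIMENSIONS = ("space", "date", "time", "user")


def build_cube(records):
    """Count transactions in each (space, date, time, user) cell."""
    cube = Counter()
    for station, stamp, user in records:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        cube[(station, moment.date().isoformat(), moment.hour, user)] += 1
    return cube


def slice_2d(cube, row_dim, col_dim):
    """Project the cube onto two dimensions, summing over the other two."""
    r, c = DIMENSIONS.index(row_dim), DIMENSIONS.index(col_dim)
    table = Counter()
    for cell, count in cube.items():
        table[(cell[r], cell[c])] += count
    return table


def roll_up_time(cube, bin_hours=3):
    """Climb one level of an assumed time hierarchy: hour -> multi-hour bin."""
    coarse = Counter()
    for (space, date, hour, user), count in cube.items():
        coarse[(space, date, hour // bin_hours * bin_hours, user)] += count
    return coarse


cube = build_cube(RECORDS)
by_user_and_hour = slice_2d(cube, "user", "time")
by_station_and_date = slice_2d(cube, "space", "date")
```

Each two-dimensional table maps a (row, column) pair to a transaction count, which is the shape a heat map or space–time plot needs as input. A dictionary-backed cube keeps the sketch dependency-free; a data set of the size described above would more plausibly call for an array or database representation.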

Original language: English (US)
Pages (from-to): 311-333
Number of pages: 23
Journal: Transportation
Volume: 45
Issue number: 2
DOI: 10.1007/s11116-017-9790-2
State: Published - Mar 1 2018

Fingerprint

Visualization
Rails
Smart cards
Flow patterns
Computer programming languages
Data mining
Transaction
Students
Date (time)
Planning
Working week
Visual analysis
Travel behavior
Holiday
Popularity

Keywords

  • Data cube
  • Exploratory data mining
  • Smartcard data
  • Space–time plot
  • Transit
  • Travel behavior

Cite this

Multidimensional visualization of transit smartcard data using space–time plots and data cubes. / Song, Ying; Fan, Yingling; Li, Xin; Ji, Yanjie.

In: Transportation, Vol. 45, No. 2, 01.03.2018, p. 311-333.

@article{f5300e7d5a33437f97725e8dcca68644,
title = "Multidimensional visualization of transit smartcard data using space–time plots and data cubes",
keywords = "Data cube, Exploratory data mining, Smartcard data, Space–time plot, Transit, Travel behavior",
author = "Ying Song and Yingling Fan and Xin Li and Yanjie Ji",
year = "2018",
month = "3",
day = "1",
doi = "10.1007/s11116-017-9790-2",
language = "English (US)",
volume = "45",
pages = "311--333",
journal = "Transportation",
issn = "0049-4488",
publisher = "Springer Netherlands",
number = "2",

}

TY - JOUR

T1 - Multidimensional visualization of transit smartcard data using space–time plots and data cubes

AU - Song, Ying

AU - Fan, Yingling

AU - Li, Xin

AU - Ji, Yanjie

PY - 2018/3/1

Y1 - 2018/3/1

KW - Data cube

KW - Exploratory data mining

KW - Smartcard data

KW - Space–time plot

KW - Transit

KW - Travel behavior

UR - http://www.scopus.com/inward/record.url?scp=85020521022&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85020521022&partnerID=8YFLogxK

U2 - 10.1007/s11116-017-9790-2

DO - 10.1007/s11116-017-9790-2

M3 - Article

VL - 45

SP - 311

EP - 333

JO - Transportation

JF - Transportation

SN - 0049-4488

IS - 2

ER -