It is well known that online services use various cookies to track users through their online service identifiers (IDs) - in other words, when users access online services, they leave various 'fingerprints' behind in cyberspace. As they roam the physical world while accessing online services via mobile devices, users also leave a series of 'footprints' - i.e., hints about their physical locations. This poses a potent new threat to user privacy: one can potentially correlate the 'fingerprints' users leave in cyberspace with the 'footprints' they leave in the physical world to infer and reveal private physical-world information, such as frequently visited locations or mobility trajectories - we refer to this problem as user physical world privacy leakage via user cyberspace privacy leakage. In this paper we address the following fundamental question: what kind - and how much - of user physical world privacy might be leaked if one could get hold of such diverse network datasets, even without any physical location information? To investigate this question in depth, we utilize network data collected over one month via a deep packet inspection (DPI) system at the routers of one of the largest Internet operators in Shanghai, China. We decompose the fundamental question into three problems: i) linkage of various online user IDs belonging to the same person via mobility pattern mining; ii) physical location classification via aggregate user mobility patterns over time; and iii) tracking user physical mobility. By developing novel and effective methods for solving each of these problems, we demonstrate that user physical world privacy leakage via user cyberspace privacy leakage is not hypothetical, but poses a real and potent threat to user privacy.
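To make the first problem (linking online IDs that belong to the same person) more concrete, here is a minimal toy sketch in Python. It is not the method developed in the paper: it assumes each record is a hypothetical `(online_id, hour_slot, access_point)` triple, and it swaps in plain Jaccard similarity over co-occurrences as a stand-in for mobility pattern mining. The ID strings, the `access_point` labels, and the `threshold` value are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_footprints(records):
    """Map each online ID to the set of (hour_slot, access_point) pairs where it was seen."""
    footprints = defaultdict(set)
    for online_id, hour_slot, access_point in records:
        footprints[online_id].add((hour_slot, access_point))
    return footprints


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_ids(records, threshold=0.5):
    """Return (id_a, id_b, score) for every ID pair whose footprints overlap enough.

    The threshold is an arbitrary illustrative value, not one taken from the paper.
    """
    footprints = build_footprints(records)
    linked = []
    for id_a, id_b in combinations(sorted(footprints), 2):
        score = jaccard(footprints[id_a], footprints[id_b])
        if score >= threshold:
            linked.append((id_a, id_b, score))
    return linked


# Hypothetical records: two IDs share every (time, access point) pair, a third barely overlaps.
records = [
    ("chat:alice", 8, "ap1"), ("chat:alice", 12, "ap2"), ("chat:alice", 20, "ap3"),
    ("shop:a_1990", 8, "ap1"), ("shop:a_1990", 12, "ap2"), ("shop:a_1990", 20, "ap3"),
    ("chat:bob", 8, "ap4"), ("chat:bob", 12, "ap2"), ("chat:bob", 20, "ap5"),
]

print(link_ids(records))
```

In this toy data only the pair `chat:alice` / `shop:a_1990` clears the threshold, because the two IDs appear at the same access points in the same time slots. Note that the records carry no coordinates, only network attachment points, which mirrors the setting of data without explicit physical location information.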
|Original language||English (US)|
|Number of pages||17|
|Journal||IEEE Transactions on Network and Service Management|
|State||Published - Dec 2020|
Bibliographical note
Funding Information:
Manuscript received March 29, 2019; revised September 4, 2019 and February 26, 2020; accepted April 25, 2020. Date of publication July 31, 2020; date of current version December 9, 2020. This work was supported in part by the National Key Research and Development Program of China under grant 2018YFB1800804, the National Natural Science Foundation of China under U1936217, 61971267, 61972223, 61941117, 61861136003, the Beijing Natural Science Foundation under L182038, the Beijing National Research Center for Information Science and Technology under 20031887521, and the research fund of the Tsinghua University - Tencent Joint Laboratory for Internet Innovation Technology. The associate editor coordinating the review of this article and approving it for publication was M. Conti. (Corresponding author: Yong Li.) Huandong Wang, Chen Gao, Yong Li, and Depeng Jin are with the Beijing National Research Center for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: firstname.lastname@example.org).
© 2004-2012 IEEE.
- identity linkage
- location classification
- spatio-temporal trajectories