TY - GEN
T1 - Scalable information flow mining in networks
AU - Subbian, Karthik
AU - Sridhar, Chidananda
AU - Aggarwal, Charu C.
AU - Srivastava, Jaideep
PY - 2014/1/1
Y1 - 2014/1/1
AB - Understanding user activities and their patterns of communication is extremely important in social and collaboration networks. This can be achieved by tracking the dominant content flow trends and their interactions between users in the network. Our approach tracks all possible paths of information flow using the network structure, the content propagated, and the time of propagation. We also show that this problem is #P-complete. Because most social networks have many activities and interactions, it is inevitable that the proposed method will be computationally intensive. Therefore, we propose an efficient method for mining information flow patterns, especially in large networks, using distributed vertex-centric computational models. We use the Gather-Apply-Scatter (GAS) paradigm to implement our approach. We experimentally show that our approach achieves an advantage of over three orders of magnitude over the state of the art, and that this advantage increases with the number of cores. We also study the effectiveness of the discovered content flow patterns by using them in an influence analysis application.
KW - Influence Analysis
KW - Network-centric approach
KW - Information Flow Mining
KW - Scalable Influence Analysis
KW - Vertex-centric models
UR - http://www.scopus.com/inward/record.url?scp=84907012505&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84907012505&partnerID=8YFLogxK
U2 - 10.1007/978-3-662-44845-8_9
DO - 10.1007/978-3-662-44845-8_9
M3 - Conference contribution
AN - SCOPUS:84907012505
SN - 9783662448441
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 130
EP - 146
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2014, Proceedings
PB - Springer-Verlag
T2 - European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2014
Y2 - 15 September 2014 through 19 September 2014
ER -