TY - JOUR
T1 - Root and community inference on the latent growth process of a network
AU - Li, Tianxi
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/9/1
Y1 - 2024/9/1
N2 - Many statistical models for networks overlook the fact that most real-world networks are formed through a growth process. To address this, we introduce the Preferential Attachment Plus Erdős–Rényi model, where we let a random network G be the union of a preferential attachment (PA) tree T and additional Erdős–Rényi (ER) random edges. The PA tree captures the underlying growth process of a network where vertices/edges are added sequentially, while the ER component can be regarded as noise. Given only one snapshot of the final network G, we study the problem of constructing confidence sets for the root node of the unobserved growth process; the root node can be patient zero in an infection network or the source of fake news in a social network. We propose inference algorithms based on Gibbs sampling that scale to networks with millions of nodes and provide a theoretical analysis showing that the size of the confidence set is small if the noise level of the ER edges is not too large. We also propose variations of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities; we use these models to provide a new approach to community detection.
UR - https://www.scopus.com/pages/publications/85204789401
U2 - 10.1093/jrsssb/qkae046
DO - 10.1093/jrsssb/qkae046
M3 - Article
AN - SCOPUS:85204789401
SN - 1369-7412
VL - 86
SP - 880
EP - 881
JO - Journal of the Royal Statistical Society. Series B: Statistical Methodology
JF - Journal of the Royal Statistical Society. Series B: Statistical Methodology
IS - 4
ER -