TY - GEN
T1 - Measuring bias in the mixing time of social graphs due to graph sampling
AU - Mohaisen, Abedelaziz
AU - Luo, Pengkui
AU - Li, Yanhua
AU - Kim, Yongdae
AU - Zhang, Zhi Li
PY - 2012/12/1
Y1 - 2012/12/1
N2 - Sampling of large social graphs is used for addressing infeasibility of measurements in large social graphs, or for crawling graphs from online social network services where accessing an entire social graph at once is often impossible. Sampling algorithms aim at maintaining certain properties of the original graphs in the sampled (or crawled) ones. Several sampling algorithms, such as breadth-first search, standard random walk, and Metropolis-Hastings random walk, among others, are widely used in the literature for sampling graphs. Some of these sampling algorithms are known for their bias, mainly towards high degree nodes, while bias for other metrics is not well-studied. In this paper we consider the bias of sampling algorithms on the mixing time. We quantitatively show that some existing sampling algorithms, even those which are unbiased to the degree distribution, always produce biased estimation of the mixing time of social graphs. We argue that bias in sampling algorithms accepted in the literature is rather metric-dependent, and a given sampling algorithm, while may work nicely and unbiased to one property, may produce considerable amount of bias in other properties.
AB - Sampling of large social graphs is used for addressing infeasibility of measurements in large social graphs, or for crawling graphs from online social network services where accessing an entire social graph at once is often impossible. Sampling algorithms aim at maintaining certain properties of the original graphs in the sampled (or crawled) ones. Several sampling algorithms, such as breadth-first search, standard random walk, and Metropolis-Hastings random walk, among others, are widely used in the literature for sampling graphs. Some of these sampling algorithms are known for their bias, mainly towards high degree nodes, while bias for other metrics is not well-studied. In this paper we consider the bias of sampling algorithms on the mixing time. We quantitatively show that some existing sampling algorithms, even those which are unbiased to the degree distribution, always produce biased estimation of the mixing time of social graphs. We argue that bias in sampling algorithms accepted in the literature is rather metric-dependent, and a given sampling algorithm, while may work nicely and unbiased to one property, may produce considerable amount of bias in other properties.
KW - Biased estimation
KW - Mixing Time
KW - Sampling
KW - Social graphs
UR - http://www.scopus.com/inward/record.url?scp=84874294472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874294472&partnerID=8YFLogxK
U2 - 10.1109/MILCOM.2012.6415714
DO - 10.1109/MILCOM.2012.6415714
M3 - Conference contribution
AN - SCOPUS:84874294472
SN - 9781467317290
T3 - Proceedings - IEEE Military Communications Conference MILCOM
BT - MILCOM 2012 - 2012 IEEE Military Communications Conference
T2 - 2012 IEEE Military Communications Conference, MILCOM 2012
Y2 - 1 November 2012 through 1 November 2012
ER -