TY - JOUR
T1 - Node Embedding with Adaptive Similarities for Scalable Learning over Graphs
AU - Berberidis, Dimitris
AU - Giannakis, Georgios B.
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - Node embedding is the task of extracting informative and descriptive features over the nodes of a graph. The importance of node embedding for graph analytics as well as learning tasks, such as node classification, link prediction, and community detection, has led to a growing interest and a number of recent advances. Nonetheless, node embedding faces several major challenges. Practical embedding methods have to deal with real-world graphs that arise from different domains, with inherently diverse underlying processes as well as similarity structures and metrics. On the other hand, similar to principal component analysis in feature vector spaces, node embedding is an inherently unsupervised task. Lacking metadata for validation, practical schemes motivate standardization and limited use of tunable hyperparameters. Finally, node embedding methods must be scalable in order to cope with large-scale real-world graphs of networks with ever-increasing size. The present work puts forth an adaptive node embedding framework that adjusts the embedding process to a given underlying graph, in a fully unsupervised manner. This is achieved by leveraging the notion of a tunable node similarity matrix that assigns weights on multihop paths. The design of multihop similarities ensures that the resultant embeddings also inherit interpretable spectral properties. The proposed model is thoroughly investigated, interpreted, and numerically evaluated using stochastic block models. Moreover, an unsupervised algorithm is developed for training the model parameters effieciently. Extensive node classification, link prediction, and clustering experiments are carried out on many real-world graphs from various domains, along with comparisons with state-of-the-art scalable and unsupervised node embedding alternatives. The proposed method enjoys superior performance in many cases, while also yielding interpretable information on the underlying graph structure.
AB - Node embedding is the task of extracting informative and descriptive features over the nodes of a graph. The importance of node embedding for graph analytics as well as learning tasks, such as node classification, link prediction, and community detection, has led to a growing interest and a number of recent advances. Nonetheless, node embedding faces several major challenges. Practical embedding methods have to deal with real-world graphs that arise from different domains, with inherently diverse underlying processes as well as similarity structures and metrics. On the other hand, similar to principal component analysis in feature vector spaces, node embedding is an inherently unsupervised task. Lacking metadata for validation, practical schemes motivate standardization and limited use of tunable hyperparameters. Finally, node embedding methods must be scalable in order to cope with large-scale real-world graphs of networks with ever-increasing size. The present work puts forth an adaptive node embedding framework that adjusts the embedding process to a given underlying graph, in a fully unsupervised manner. This is achieved by leveraging the notion of a tunable node similarity matrix that assigns weights on multihop paths. The design of multihop similarities ensures that the resultant embeddings also inherit interpretable spectral properties. The proposed model is thoroughly investigated, interpreted, and numerically evaluated using stochastic block models. Moreover, an unsupervised algorithm is developed for training the model parameters effieciently. Extensive node classification, link prediction, and clustering experiments are carried out on many real-world graphs from various domains, along with comparisons with state-of-the-art scalable and unsupervised node embedding alternatives. The proposed method enjoys superior performance in many cases, while also yielding interpretable information on the underlying graph structure.
KW - SVD
KW - SVM
KW - multiscale
KW - random walks
KW - spectral
KW - unsupervised
UR - http://www.scopus.com/inward/record.url?scp=85099453284&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099453284&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2019.2931542
DO - 10.1109/TKDE.2019.2931542
M3 - Article
AN - SCOPUS:85099453284
SN - 1041-4347
VL - 33
SP - 637
EP - 650
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 2
M1 - 8778744
ER -