Position-based Hash Embeddings for Scaling Graph Neural Networks

Maria Kalantzi, George Karypis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

3 Scopus citations


Graph Neural Networks (GNNs) bring the power of deep representation learning to graph and relational data and achieve state-of-the-art performance in many applications. GNNs compute node representations by taking into account the topology of the node's ego-network and the features of the ego-network's nodes. When the nodes do not have high-quality features, GNNs learn an embedding layer to compute node embeddings and use them as input features. However, the size of the embedding layer is linear to the product of the number of nodes in the graph and the dimensionality of the embedding and does not scale to big data and graphs with hundreds of millions of nodes. To reduce the memory associated with this embedding layer, hashing-based approaches, commonly used in applications like NLP and recommender systems, can potentially be used. However, a direct application of these ideas fails to exploit the fact that in many real-world graphs, nodes that are topologically close will tend to be related to each other (homophily) and as such their representations will be similar.In this work, we present approaches that take advantage of the nodes' position in the graph to dramatically reduce the memory required, with minimal if any degradation in the quality of the resulting GNN model. Our approaches decompose a node's embedding into two components: a position-specific component and a node-specific component. The position-specific component models homophily and the node-specific component models the node-to-node variation. Extensive experiments using different datasets and GNN models show that our methods are able to reduce the memory requirements by 88% to 97% while achieving, in nearly all cases, better classification accuracy than other competing approaches, including the full embeddings.

Original languageEnglish (US)
Title of host publicationProceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
EditorsYixin Chen, Heiko Ludwig, Yicheng Tu, Usama Fayyad, Xingquan Zhu, Xiaohua Tony Hu, Suren Byna, Xiong Liu, Jianping Zhang, Shirui Pan, Vagelis Papalexakis, Jianwu Wang, Alfredo Cuzzocrea, Carlos Ordonez
PublisherInstitute of Electrical and Electronics Engineers Inc.
Number of pages11
ISBN (Electronic)9781665439022
StatePublished - 2021
Event2021 IEEE International Conference on Big Data, Big Data 2021 - Virtual, Online, United States
Duration: Dec 15 2021Dec 18 2021

Publication series

NameProceedings - 2021 IEEE International Conference on Big Data, Big Data 2021


Conference2021 IEEE International Conference on Big Data, Big Data 2021
Country/TerritoryUnited States
CityVirtual, Online

Bibliographical note

Funding Information:
This work was supported in part by NSF (1447788, 1704074, 1757916, 1834251), Army Research Office (W911NF1810344), and the Digital Technology Center at the University of Minnesota. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute.

Publisher Copyright:
© 2021 IEEE.


  • big data
  • dimension reduction
  • embedding layer
  • graph neural networks (GNNs)
  • hashing
  • hierarchy
  • model compression
  • scalability


Dive into the research topics of 'Position-based Hash Embeddings for Scaling Graph Neural Networks'. Together they form a unique fingerprint.

Cite this