Graph neural network (GNN) inference involves weighting vertex feature vectors and then aggregating the weighted vectors over each vertex's neighborhood. High and variable sparsity in the input vertex feature vectors, together with high sparsity and power-law degree distributions in the adjacency matrix, can lead to (a) unbalanced loads and (b) inefficient random memory accesses. GNNIE achieves load balancing by splitting features into blocks, using a flexible MAC architecture, and employing load (re)distribution. Its novel caching scheme bypasses the high cost of random DRAM accesses. GNNIE shows high speedups over CPUs/GPUs; it is faster than existing accelerators and runs a broader range of GNNs.
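The two inference steps named in the abstract, weighting and then aggregation, can be sketched in a few lines. This is a minimal illustration of a generic GNN layer under stated assumptions, not GNNIE's hardware dataflow: the function names `weight` and `aggregate`, the sum aggregator with a self term, and the toy graph are all hypothetical choices made for this example.

```python
# Illustrative sketch only: a generic GNN layer, not GNNIE's accelerator design.
# All names and data below are hypothetical.

def weight(features, W):
    """Weighting: multiply each vertex feature vector by the weight matrix W."""
    return [
        [sum(x[k] * W[k][j] for k in range(len(W))) for j in range(len(W[0]))]
        for x in features
    ]

def aggregate(weighted, neighbors):
    """Aggregation: sum weighted vectors over each vertex's neighborhood.

    Assumes a plain sum that also includes the vertex's own vector.
    """
    out = []
    for v, nbrs in enumerate(neighbors):
        acc = list(weighted[v])
        for u in nbrs:
            acc = [a + b for a, b in zip(acc, weighted[u])]
        out.append(acc)
    return out

# Toy graph: vertex 0 is a hub with three neighbors; the others have one each.
features = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0], [3.0, 0.0]]  # sparse inputs
W = [[1.0, 2.0], [0.5, 0.0]]
neighbors = [[1, 2, 3], [0], [0], [0]]

weighted = weight(features, W)
result = aggregate(weighted, neighbors)
print(result)
```

Even in this toy graph the hub vertex needs three accumulations while every other vertex needs one, and the zero entries in `features` contribute no useful multiplications. These are small-scale versions of the load imbalance and sparsity effects the abstract describes.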
|Original language||English (US)|
|Title of host publication||Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||6|
|State||Published - Jul 10 2022|
|Event||59th ACM/IEEE Design Automation Conference, DAC 2022 - San Francisco, United States (Duration: Jul 10 2022 → Jul 14 2022)|
|Name||Proceedings of the 59th ACM/IEEE Design Automation Conference|
|Conference||59th ACM/IEEE Design Automation Conference, DAC 2022|
|Period||7/10/22 → 7/14/22|
Bibliographical note
Funding Information:
This work was supported in part by the Semiconductor Research Corporation (SRC).
© 2022 ACM.
- graph-specific caching
- hardware accelerator
- load balancing