Abstract
Graph neural network (GNN) inference involves weighting vertex feature vectors, followed by aggregating the weighted vectors over each vertex's neighborhood. High and variable sparsity in the input vertex feature vectors, and high sparsity and power-law degree distributions in the adjacency matrix, can lead to (a) unbalanced loads and (b) inefficient random memory accesses. GNNIE ensures load balancing by splitting features into blocks, using a flexible MAC architecture, and employing load (re)distribution. GNNIE's novel caching scheme avoids the high cost of random DRAM accesses. GNNIE achieves high speedups over CPUs/GPUs; it is faster, and runs a broader range of GNNs, than existing accelerators.
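As background for the two-phase computation the abstract describes, the following is a minimal NumPy/SciPy sketch of generic GNN inference: a weighting (combination) step followed by neighborhood aggregation. The sizes, the toy edge list, and the names (`num_vertices`, `W`, `X`, etc.) are illustrative assumptions, not GNNIE's implementation; the skewed-degree toy graph only mimics the power-law distributions the abstract mentions.

```python
import numpy as np
import scipy.sparse as sp

# Illustrative sizes only; not taken from the paper.
num_vertices, in_feats, out_feats = 6, 8, 4
rng = np.random.default_rng(0)

# Sparse input feature matrix X: rows have high, variable sparsity.
X = sp.random(num_vertices, in_feats, density=0.25,
              random_state=rng, format="csr")

# Dense weight matrix W for the per-vertex weighting (combination) step.
W = rng.standard_normal((in_feats, out_feats))

# Sparse adjacency A with a skewed degree distribution: vertex 0 is a
# high-degree "hub", the rest have few neighbors (power-law-like).
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (3, 4)]
rows, cols = zip(*edges)
A = sp.csr_matrix((np.ones(len(edges)), (rows, cols)),
                  shape=(num_vertices, num_vertices))
A = A + A.T  # make the graph undirected

# Phase 1 (weighting): multiply each vertex's feature vector by W.
XW = X @ W   # shape: (num_vertices, out_feats)

# Phase 2 (aggregation): sum weighted vectors over each neighborhood.
H = A @ XW   # row i = sum over neighbors j of XW[j]

print(H.shape)  # (6, 4)
```

In this formulation, the per-row sparsity of `X` makes the work in phase 1 uneven across vertices, and the hub row of `A` makes phase 2's accesses to `XW` irregular, which is the load-imbalance and random-access problem the accelerator targets.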
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 565-570 |
| Number of pages | 6 |
| ISBN (Electronic) | 9781450391429 |
| DOIs | |
| State | Published - Jul 10 2022 |
| Event | 59th ACM/IEEE Design Automation Conference, DAC 2022 - San Francisco, United States; Duration: Jul 10 2022 → Jul 14 2022 |
Publication series
| Name | Proceedings of the 59th ACM/IEEE Design Automation Conference |
|---|---|
Conference
| Conference | 59th ACM/IEEE Design Automation Conference, DAC 2022 |
|---|---|
| Country/Territory | United States |
| City | San Francisco |
| Period | 7/10/22 → 7/14/22 |
Bibliographical note
Funding Information: This work was supported in part by the Semiconductor Research Corporation (SRC).
Publisher Copyright: © 2022 ACM.
Keywords
- GNN
- graph-specific caching
- hardware accelerator
- load balancing