FPGA Acceleration of GCN in Light of the Symmetry of Graph Adjacency Matrix

Gopikrishnan Raveendran Nair, Han Sok Suh, Mahantesh Halappanavar, Frank Liu, Jae Sun Seo, Yu Cao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

Graph Convolutional Neural Networks (GCNs) are widely used to process large-scale graph data. Unlike deep neural networks (DNNs), GCNs are sparse, irregular, and unstructured, posing unique challenges to hardware acceleration with regular processing elements (PEs). In particular, the adjacency matrix of a GCN is extremely sparse, leading to frequent but irregular memory access, low spatial/temporal data locality, and poor data reuse. Furthermore, a realistic graph usually consists of unstructured data (e.g., unbalanced distributions), creating significantly different processing times and an imbalanced workload across nodes in GCN acceleration. To overcome these challenges, we propose an end-to-end hardware-software co-design to accelerate GCNs on resource-constrained FPGAs with the following features: (1) A custom dataflow that leverages symmetry along the diagonal of the adjacency matrix to accelerate feature aggregation for undirected graphs. We use either the upper or the lower triangular part of the adjacency matrix to perform aggregation in the GCN, which improves data reuse. (2) Unified compute cores for both the aggregation and transform phases, with full support for the symmetry-based dataflow. These cores can be dynamically reconfigured into systolic mode for transformation or into individual accumulators for aggregation. (3) Preprocessing of the graph in software to rearrange the edges and features to match the custom dataflow. This step improves the regularity of memory access and data reuse in the aggregation phase. Moreover, we quantize the GCN precision from FP32 to INT8 to reduce the memory footprint without losing inference accuracy. We implement our accelerator design on an Intel Stratix 10 MX FPGA board with HBM2 and demonstrate a 1.3× to 110.5× improvement in end-to-end GCN latency compared to state-of-the-art FPGA implementations on the Cora, Pubmed, Citeseer, and Reddit graph datasets.
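The arithmetic idea behind feature (1) can be illustrated with a minimal software sketch. It is not the authors' hardware dataflow; it only shows why storing one triangle of a symmetric adjacency matrix is enough for aggregation. The function names and the toy graph are hypothetical, and normalization of the adjacency matrix is omitted.

```python
def aggregate_symmetric(upper_edges, features):
    """Aggregate neighbor features for an undirected graph.

    upper_edges: list of (i, j) pairs with i <= j, i.e. only the upper
                 triangle (including the diagonal) of the adjacency matrix.
    features:    list of per-node feature vectors (lists of numbers).
    Returns the unnormalized product A @ X as a list of lists.
    """
    dim = len(features[0])
    out = [[0] * dim for _ in features]
    for i, j in upper_edges:
        # Edge (i, j) contributes X[j] to node i ...
        for k in range(dim):
            out[i][k] += features[j][k]
        # ... and, by symmetry, X[i] to node j. Self-loops sit on the
        # diagonal and must be counted only once.
        if i != j:
            for k in range(dim):
                out[j][k] += features[i][k]
    return out


def aggregate_dense(adj, features):
    """Reference: plain dense A @ X, used only to check the sketch."""
    n, dim = len(adj), len(features[0])
    return [
        [sum(adj[r][c] * features[c][k] for c in range(n)) for k in range(dim)]
        for r in range(n)
    ]


# Toy undirected graph (assumed): edges 0-1, 0-2, 2-3 and a self-loop on node 1.
upper_edges = [(0, 1), (0, 2), (1, 1), (2, 3)]
features = [[1, 0], [0, 2], [3, 1], [1, 1]]

# Expand the upper triangle into the full symmetric matrix for the check.
adj = [[0] * 4 for _ in range(4)]
for i, j in upper_edges:
    adj[i][j] = adj[j][i] = 1

result = aggregate_symmetric(upper_edges, features)
```

Each stored edge is read once and updates two output rows, so the edge list that has to be fetched is roughly half the size of the full matrix's nonzero list. That is the data-reuse benefit the abstract attributes to the symmetry-based dataflow.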

Original language: English (US)
Title of host publication: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9783981926378
State: Published - 2023
Externally published: Yes
Event: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023 - Antwerp, Belgium
Duration: Apr 17 2023 to Apr 19 2023

Publication series

Name: 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE)

Conference

Conference: 2023 Design, Automation and Test in Europe Conference and Exhibition, DATE 2023
Country/Territory: Belgium
City: Antwerp
Period: 4/17/23 to 4/19/23

Bibliographical note

Funding Information:
This work is partially supported by C-BRIC, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA. It is also supported in part by the U.S. Department of Energy, through the Office of Advanced Scientific Computing Research’s “Data-Driven Decision Control for Complex Systems (DnC2S)” project.

Publisher Copyright:
© 2023 EDAA.
