Abstract
The growing prevalence of Machine Learning as a Service (MLaaS) enables a wide range of applications but simultaneously raises numerous security and privacy concerns. A key issue is the potential privacy exposure of the involved parties, such as the customer's input data and the vendor's model. Consequently, two-party computing (2PC) has emerged as a promising solution for safeguarding the privacy of the different parties during deep neural network (DNN) inference. However, state-of-the-art (SOTA) 2PC-DNN techniques are tailored to traditional instruction set architecture (ISA) systems such as CPUs and CPU+GPU platforms. This reliance on ISA systems significantly constrains their energy efficiency, as these architectures typically operate on 32- or 64-bit data types. Meanwhile, the potential of dynamic and adaptive quantization for building high-performance 2PC-DNNs remains largely unexplored, owing to the lack of compatible algorithms and hardware accelerators. To mitigate the bottlenecks of SOTA solutions and fill this research gap, this work investigates the construction of 2PC-DNNs on field-programmable gate arrays (FPGAs). We introduce AQ2PNN, an end-to-end framework that employs adaptive quantization schemes to build high-performance 2PC-DNNs on FPGAs. On the algorithmic side, AQ2PNN introduces a novel 2PC-ReLU method that replaces Yao's Garbled Circuits (GC). On the hardware side, AQ2PNN provides an extensive set of building blocks for linear and non-linear operators, together with a specialized Oblivious Transfer (OT) module for secure data exchange. These algorithm-hardware co-designed modules fully exploit the fine-grained reconfigurability of FPGAs to adapt the data bit-width of each DNN layer in the ciphertext domain, thereby reducing communication overhead between parties without compromising DNN accuracy.
We thoroughly evaluate AQ2PNN on widely adopted DNN architectures, including ResNet18, ResNet50, and VGG16, each trained on ImageNet and quantized. Experimental results demonstrate that AQ2PNN outperforms SOTA solutions, significantly reducing communication overhead, improving energy efficiency by 26.3×, and achieving comparable or even superior throughput and accuracy.
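To illustrate why adapting the per-layer bit-width matters for 2PC, the following minimal sketch (not code from the paper; all names are illustrative) shows additive secret sharing of a quantized tensor over the ring Z_{2^b}. Each party holds one share, so the bytes exchanged per element scale directly with the chosen bit-width b.

```python
import numpy as np

def share(x, bits, rng):
    """Additively secret-share integer tensor x over the ring Z_{2^bits}.

    Returns two shares (s0, s1) such that (s0 + s1) mod 2^bits == x.
    Individually, each share reveals nothing about x.
    """
    mod = 1 << bits
    s0 = rng.integers(0, mod, size=x.shape, dtype=np.uint64)  # random mask
    s1 = (x - s0) % mod  # uint64 wrap-around is harmless: 2^bits divides 2^64
    return s0, s1

def reconstruct(s0, s1, bits):
    """Recombine the two shares to recover the plaintext tensor."""
    return (s0 + s1) % (1 << bits)

rng = np.random.default_rng(0)
bits = 8  # a smaller per-layer bit-width shrinks every share sent on the wire
x = rng.integers(0, 1 << bits, size=(4,), dtype=np.uint64)

s0, s1 = share(x, bits, rng)
assert np.array_equal(reconstruct(s0, s1, bits), x)

# Per-element communication is bits/8 bytes per share, so an 8-bit layer
# costs a quarter of the traffic of a 32-bit layer for the same tensor shape.
```

The actual AQ2PNN protocol additionally covers secure linear/non-linear operators and OT-based exchange; this sketch only demonstrates the bit-width/communication trade-off that adaptive quantization exploits.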
| Field | Value |
|---|---|
| Original language | English (US) |
| Title of host publication | Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 628-640 |
| Number of pages | 13 |
| ISBN (Electronic) | 9798400703294 |
| DOIs | |
| State | Published - Oct 28 2023 |
| Externally published | Yes |
| Event | 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 - Toronto, Canada. Duration: Oct 28 2023 → Nov 1 2023 |
Publication series

| Name | Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 |
|---|---|
Conference

| Conference | 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023 |
|---|---|
| Country/Territory | Canada |
| City | Toronto |
| Period | 10/28/23 → 11/1/23 |
Bibliographical note
Publisher Copyright: © 2023 ACM.
Keywords
- Deep learning
- FPGA
- Privacy-preserving machine learning
- Quantization
- Two-party computing