Abstract
Real-time analytics over data streams is often performed on edge devices, which offer privacy guarantees and lower-latency responses compared to centralized processing in the cloud. Data streams originating from sensors, mobile phones, or IoT devices are diverse and span multiple modalities, including RGB videos from cameras, time series data from wearable sensors, and audio signals. Previous research has focused on optimizing the individual analytical tasks associated with each stream, with a special emphasis on deep learning, which is computationally intensive and is used, among other things, to analyze video streams. While advances in deep learning have significantly improved inference accuracy (e.g., for computer vision tasks), state-of-the-art models are not well-suited for edge computing environments. Novel approaches are required to substantially reduce the computational burden, since edge systems are heterogeneous and typically have fewer GPU resources available for inference with deep learning models. We show that leveraging data from multiple modalities can complement or sometimes even replace resource-intensive inference, while maintaining or enhancing accuracy. We present DAISY, a Data-Aware Inference Serving sYstem that leverages multi-modal data to increase inference accuracy by dynamically selecting an appropriate model for each request. We thoroughly evaluate the proposed approach using state-of-the-art models and real-world data. The evaluation shows an increase in SLO attainment of up to 60%, with a corresponding increase in inference accuracy of 5%.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 408-418 |
Number of pages | 11 |
ISBN (Electronic) | 9798350395662 |
State | Published - 2024 |
Event | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 - Philadelphia, United States. Duration: May 6, 2024 → May 9, 2024 |
Publication series
Name | Proceedings - 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
---|---|
Conference
Conference | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
---|---|
Country/Territory | United States |
City | Philadelphia |
Period | 5/6/24 → 5/9/24 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Keywords
- deep learning
- edge computing
- inference serving
- multi-modal data
- video analytics