Robust Saliency-Driven Quality Adaptation for Mobile 360-Degree Video Streaming

Shibo Wang, Shusen Yang, Hairong Su, Cong Zhao, Chenren Xu, Feng Qian, Nanbin Wang, Zongben Xu

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

Mobile 360-degree video streaming has grown significantly in popularity, but the quality of experience (QoE) suffers from insufficient and variable wireless network bandwidth. Recent saliency-driven 360-degree streaming overcomes the buffer size limitation of head movement trajectory (HMT)-driven solutions and thus strikes a better balance between video quality and rebuffering. However, inaccurate network estimation and intrinsic saliency bias still challenge saliency-based streaming approaches, limiting further QoE improvement. To address these challenges, we design RoSal360, a robust saliency-driven quality adaptation algorithm for 360-degree video streaming. Specifically, we present a practical, tile-size-aware deep neural network (DNN) model with a decoupled self-attention architecture to accurately and efficiently predict the transmission time of video tiles. Moreover, we design a reinforcement learning (RL)-driven online correction algorithm to robustly compensate for improper quality allocations caused by saliency bias. In extensive prototype evaluations over real-world wireless networks, including commodity WiFi, 4G/LTE, and 5G links, RoSal360 significantly enhances video quality and reduces the rebuffering ratio compared to state-of-the-art algorithms, thereby improving viewer QoE.

Original language: English (US)
Pages (from-to): 1312-1329
Number of pages: 18
Journal: IEEE Transactions on Mobile Computing
Volume: 23
Issue number: 2
State: Published - Feb 1 2024

Bibliographical note

Publisher Copyright: IEEE

Keywords

  • 360-degree video streaming
  • Quality adaptation
  • network estimation
  • saliency
