Bandit Convex Optimization for Scalable and Dynamic IoT Management

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

This paper deals with online convex optimization involving both time-varying loss functions and time-varying constraints. The loss functions are not fully accessible to the learner; instead, only their values at queried points (also known as bandit feedback) are revealed. The constraints are revealed after decisions are made and can be violated instantaneously, yet they must be satisfied in the long term. This setting nicely fits emerging online network tasks such as fog computing in the Internet of Things (IoT), where online decisions must flexibly adapt to changing user preferences (loss functions) and the temporally unpredictable availability of resources (constraints). Tailored for such human-in-the-loop systems, where the loss functions are hard to model, a family of online bandit saddle-point (BanSaP) schemes is developed that adaptively adjusts the online operations based on (possibly multiple) bandit evaluations of the loss functions and on the changing environment. Performance is assessed by: 1) dynamic regret, which generalizes the widely used static regret; and 2) fit, which captures the accumulated amount of constraint violations. Specifically, BanSaP is proved to simultaneously yield sublinear dynamic regret and fit, provided that the best dynamic solutions vary slowly over time. Numerical tests on fog computation offloading tasks corroborate that the proposed BanSaP approach offers performance competitive with existing approaches based on gradient feedback.

Original language: English (US)
Article number: 8362620
Pages (from-to): 1276-1286
Number of pages: 11
Journal: IEEE Internet of Things Journal
Volume: 6
Issue number: 1
DOIs: 10.1109/JIOT.2018.2839563
State: Published - Feb 2019

Fingerprint

Convex optimization
Fog
Feedback
Internet of things
Decision making
Availability

Keywords

  • Bandit convex optimization (BCO)
  • Internet of Things (IoT)
  • mobile edge computing
  • online learning
  • saddle-point method

Cite this

Bandit Convex Optimization for Scalable and Dynamic IoT Management. / Chen, Tianyi; Giannakis, Georgios B.

In: IEEE Internet of Things Journal, Vol. 6, No. 1, 8362620, 02.2019, p. 1276-1286.

Research output: Contribution to journal › Article

@article{db132c47dbf9452fade2979c728e4de7,
title = "Bandit Convex Optimization for Scalable and Dynamic IoT Management",
abstract = "This paper deals with online convex optimization involving both time-varying loss functions and time-varying constraints. The loss functions are not fully accessible to the learner; instead, only their values at queried points (also known as bandit feedback) are revealed. The constraints are revealed after decisions are made and can be violated instantaneously, yet they must be satisfied in the long term. This setting nicely fits emerging online network tasks such as fog computing in the Internet of Things (IoT), where online decisions must flexibly adapt to changing user preferences (loss functions) and the temporally unpredictable availability of resources (constraints). Tailored for such human-in-the-loop systems, where the loss functions are hard to model, a family of online bandit saddle-point (BanSaP) schemes is developed that adaptively adjusts the online operations based on (possibly multiple) bandit evaluations of the loss functions and on the changing environment. Performance is assessed by: 1) dynamic regret, which generalizes the widely used static regret; and 2) fit, which captures the accumulated amount of constraint violations. Specifically, BanSaP is proved to simultaneously yield sublinear dynamic regret and fit, provided that the best dynamic solutions vary slowly over time. Numerical tests on fog computation offloading tasks corroborate that the proposed BanSaP approach offers performance competitive with existing approaches based on gradient feedback.",
keywords = "Bandit convex optimization (BCO), Internet of Things (IoT), mobile edge computing, online learning, saddle-point method",
author = "Tianyi Chen and Giannakis, {Georgios B.}",
year = "2019",
month = feb,
doi = "10.1109/JIOT.2018.2839563",
language = "English (US)",
volume = "6",
pages = "1276--1286",
journal = "IEEE Internet of Things Journal",
issn = "2327-4662",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "1",

}

TY - JOUR

T1 - Bandit Convex Optimization for Scalable and Dynamic IoT Management

AU - Chen, Tianyi

AU - Giannakis, Georgios B.

PY - 2019/2

Y1 - 2019/2

AB - This paper deals with online convex optimization involving both time-varying loss functions and time-varying constraints. The loss functions are not fully accessible to the learner; instead, only their values at queried points (also known as bandit feedback) are revealed. The constraints are revealed after decisions are made and can be violated instantaneously, yet they must be satisfied in the long term. This setting nicely fits emerging online network tasks such as fog computing in the Internet of Things (IoT), where online decisions must flexibly adapt to changing user preferences (loss functions) and the temporally unpredictable availability of resources (constraints). Tailored for such human-in-the-loop systems, where the loss functions are hard to model, a family of online bandit saddle-point (BanSaP) schemes is developed that adaptively adjusts the online operations based on (possibly multiple) bandit evaluations of the loss functions and on the changing environment. Performance is assessed by: 1) dynamic regret, which generalizes the widely used static regret; and 2) fit, which captures the accumulated amount of constraint violations. Specifically, BanSaP is proved to simultaneously yield sublinear dynamic regret and fit, provided that the best dynamic solutions vary slowly over time. Numerical tests on fog computation offloading tasks corroborate that the proposed BanSaP approach offers performance competitive with existing approaches based on gradient feedback.

KW - Bandit convex optimization (BCO)

KW - Internet of Things (IoT)

KW - mobile edge computing

KW - online learning

KW - saddle-point method

UR - http://www.scopus.com/inward/record.url?scp=85047603244&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85047603244&partnerID=8YFLogxK

U2 - 10.1109/JIOT.2018.2839563

DO - 10.1109/JIOT.2018.2839563

M3 - Article

AN - SCOPUS:85047603244

VL - 6

SP - 1276

EP - 1286

JO - IEEE Internet of Things Journal

JF - IEEE Internet of Things Journal

SN - 2327-4662

IS - 1

M1 - 8362620

ER -