Adversarial Linear Contextual Bandits with Graph-Structured Side Observations

Lingda Wang, Bingcong Li, Huozhi Zhou, Georgios B. Giannakis, Lav R. Varshney, Zhizhen Zhao

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

This paper studies the adversarial graphical contextual bandits, a variant of adversarial multi-armed bandits that leverage two categories of the most common side information: contexts and side observations. In this setting, a learning agent repeatedly chooses from a set of K actions after being presented with a d-dimensional context vector. The agent not only incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures, which are encoded as a series of feedback graphs. This setting models a variety of applications in social networks, where both contexts and graph-structured side observations are available. Two efficient algorithms are developed based on EXP3. Under mild conditions, our analysis shows that for undirected feedback graphs the first algorithm, EXP3-LGC-U, achieves a sub-linear regret with respect to the time horizon and the average independence number of the feedback graphs. A slightly weaker result is presented for the directed graph setting as well. The second algorithm, EXP3-LGC-IX, is developed for a special class of problems, for which the regret is the same for both directed as well as undirected feedback graphs. Numerical tests corroborate the efficiency of proposed algorithms.

Original languageEnglish (US)
Title of host publication35th AAAI Conference on Artificial Intelligence, AAAI 2021
PublisherAssociation for the Advancement of Artificial Intelligence
Pages10156-10164
Number of pages9
ISBN (Electronic)9781713835974
DOIs
StatePublished - 2021
Event35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online
Duration: Feb 2 2021Feb 9 2021

Publication series

Name35th AAAI Conference on Artificial Intelligence, AAAI 2021
Volume11B

Conference

Conference35th AAAI Conference on Artificial Intelligence, AAAI 2021
CityVirtual, Online
Period2/2/212/9/21

Bibliographical note

Funding Information:
Lingda Wang and Zhizhen Zhao are supported in part by Alfred P. Sloan Foundation. Bingcong Li and Georgios B. Giannakis gratefully acknowledge the support from NSF grants 1711471, and 1901134. Huozhi Zhou and Lav R. Varshney are funded in part by the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR), a research collaboration as part of the IBM AI Horizons Network.

Publisher Copyright:
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Fingerprint

Dive into the research topics of 'Adversarial Linear Contextual Bandits with Graph-Structured Side Observations'. Together they form a unique fingerprint.

Cite this