Online learning with non-convex losses and non-stationary regret

Xiang Gao, Xiaobo Li, Shuzhong Zhang

Research output: Contribution to conference › Paper › peer-review

32 Scopus citations

Abstract

In this paper, we consider online learning with non-convex loss functions. Similar to Besbes et al. [2015], we adopt non-stationary regret as the performance metric. In particular, we study regret bounds under different assumptions on the information available about the loss functions. When the gradient of the loss function at the decision point is available, we propose an online normalized gradient descent algorithm (ONGD) to solve the online learning problem. When only the value of the loss function is available, we propose a bandit online normalized gradient descent algorithm (BONGD). Under a condition we call weak pseudo-convexity (WPC), we show that both algorithms achieve a cumulative regret bound of O(√(T + V_T T)), where V_T is the total temporal variation of the loss functions, thus establishing a sublinear regret bound for online learning with non-convex losses under a non-stationary regret measure.
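For context, the non-stationary regret and variation budget follow Besbes et al. [2015]; as a sketch of the standard definitions from that line of work (not quoted from this paper's text),

\mathrm{Regret}(T) = \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} \min_{x \in X} f_t(x),
\qquad
V_T = \sum_{t=2}^{T} \sup_{x \in X} \, | f_t(x) - f_{t-1}(x) |,

so a bound of O(√(T + V_T T)) is sublinear in T whenever the total variation V_T grows sublinearly.

Below is a minimal Python sketch of the kind of normalized gradient descent update ONGD is built on, assuming a Euclidean-ball feasible set (chosen so the projection is a one-line rescaling) and a step size of order radius/√T; the names grad_fns, radius, and eta are illustrative, not taken from the paper.

import numpy as np

def ongd(grad_fns, x0, radius, T, eta=None):
    # Online normalized gradient descent over the Euclidean ball of the
    # given radius (an assumed feasible set, for simplicity of projection).
    if eta is None:
        eta = radius / np.sqrt(T)  # assumed step size of order 1/sqrt(T)
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for t in range(T):
        g = np.asarray(grad_fns[t](x), dtype=float)
        g_norm = np.linalg.norm(g)
        if g_norm > 0:
            x = x - eta * g / g_norm  # step along the normalized gradient
        x_norm = np.linalg.norm(x)
        if x_norm > radius:
            x = x * (radius / x_norm)  # project back onto the ball
        iterates.append(x.copy())
    return iterates

# Example: losses f_t(x) = |x - c_t| with a slowly drifting minimizer c_t,
# so the total temporal variation V_T grows slowly with T.
T = 100
targets = [np.array([np.sin(0.01 * t)]) for t in range(T)]
grads = [(lambda s: (lambda x: np.sign(x - targets[s])))(t) for t in range(T)]
xs = ongd(grads, x0=np.zeros(1), radius=2.0, T=T)

For the bandit setting (BONGD), the paper replaces the exact gradient with an estimate built from observed function values; the sketch above assumes direct gradient access.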

Original language: English (US)
Pages: 235-243
Number of pages: 9
State: Published - 2018
Event: 21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018 - Playa Blanca, Lanzarote, Canary Islands, Spain
Duration: Apr 9, 2018 - Apr 11, 2018

Conference

Conference: 21st International Conference on Artificial Intelligence and Statistics, AISTATS 2018
Country/Territory: Spain
City: Playa Blanca, Lanzarote, Canary Islands
Period: 4/9/18 - 4/11/18

Bibliographical note

Publisher Copyright:
Copyright 2018 by the author(s).
