Abstract
We study the stochastic linear optimization problem with bandit feedback. The arms take values in an N-dimensional space and belong to a bounded polyhedron described by finitely many linear inequalities. We present an algorithm (SEE) that has O(N log^{1+ϵ}(T)) expected regret for any ϵ > 0 in T rounds. The algorithm alternates between exploration and exploitation phases: it plays a deterministic set of arms in the exploration phases and a greedily selected arm in the exploitation phases. The regret bound of SEE compares well with the lower bound of Ω(N log T) that can be derived by a direct adaptation of the Lai-Robbins lower bound proof [1]. Our key insight is that, for a polyhedron, the optimal arm is robust to small perturbations in the reward function. Consequently, a greedily selected arm is guaranteed to be optimal when the estimation error falls below a suitable threshold. Our solution resolves a question posed in [2], which left open the possibility of efficient algorithms with logarithmic regret bounds. The simplicity of our approach allows us to derive probability-one bounds on the regret, in contrast to the weak convergence results of other papers. This ensures that, with probability one, only finitely many errors occur in the exploitation phases. Numerical investigations show that, while the theoretical results are asymptotic, the performance of our algorithms compares favorably with state-of-the-art algorithms in finite time as well.
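The alternation between exploration and exploitation phases described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function name `explore_exploit`, the representation of the polyhedron by an explicit vertex list, the least-squares estimator, the Gaussian noise model, and the phase-length schedule `exp(cycle ** (1 / (1 + eps)))` are all assumptions chosen for illustration. The schedule is picked only so that roughly log^{1+ϵ}(T) exploration cycles of N arms fit in T rounds.

```python
import numpy as np


def explore_exploit(vertices, explore_arms, theta, horizon,
                    eps=0.5, noise=0.1, seed=0):
    """Hypothetical explore/exploit loop; returns (pseudo-regret, estimate)."""
    rng = np.random.default_rng(seed)
    vertices = np.asarray(vertices, dtype=float)
    explore_arms = np.asarray(explore_arms, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n = explore_arms.shape[1]

    # A linear reward over a polytope is maximized at a vertex.
    best_value = (vertices @ theta).max()

    gram = np.zeros((n, n))   # sum of x x^T over exploration rounds
    moment = np.zeros(n)      # sum of reward * x over exploration rounds
    theta_hat = np.zeros(n)
    regret, t, cycle = 0.0, 0, 0

    while t < horizon:
        cycle += 1

        # Exploration phase: play a fixed, deterministic set of arms.
        for x in explore_arms:
            if t >= horizon:
                break
            reward = x @ theta + noise * rng.standard_normal()
            gram += np.outer(x, x)
            moment += reward * x
            regret += best_value - x @ theta
            t += 1

        # Least-squares estimate of the reward vector (assumed estimator).
        theta_hat = np.linalg.lstsq(gram, moment, rcond=None)[0]

        # Exploitation phase: commit to the greedy vertex for a growing
        # number of rounds (assumed schedule). Rewards are not sampled
        # here; only the expected regret of the chosen arm is accumulated.
        greedy = vertices[np.argmax(vertices @ theta_hat)]
        length = int(np.ceil(np.exp(cycle ** (1.0 / (1.0 + eps)))))
        steps = min(length, horizon - t)
        regret += steps * (best_value - greedy @ theta)
        t += steps

    return regret, theta_hat


if __name__ == "__main__":
    # Unit square in R^2, explored along the two coordinate directions.
    square = [[0, 0], [1, 0], [0, 1], [1, 1]]
    total, estimate = explore_exploit(square, [[1, 0], [0, 1]],
                                      theta=[1.0, -0.5], horizon=5000)
    print(total, estimate)
```

In this sketch, once the estimate is close enough to the true reward vector that the greedy vertex is the optimal one, the exploitation phases contribute no further regret, so the remaining regret comes from the exploration rounds alone. This mirrors the robustness argument stated in the abstract.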
Original language | English |
---|---|
Title of host publication | 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 4796-4800 |
Number of pages | 5 |
ISBN (Electronic) | 9781479999880 |
State | Published - 18 May 2016 |
Externally published | Yes |
Event | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China; Duration: 20 Mar 2016 → 25 Mar 2016 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2016-May |
ISSN (Print) | 1520-6149 |
Conference
Conference | 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 |
---|---|
Country/Territory | China |
City | Shanghai |
Period | 20/03/16 → 25/03/16 |
Bibliographical note
Publisher Copyright: © 2016 IEEE.