Peter Auer
Researcher at University of Leoben
Publications - 144
Citations - 17903
Peter Auer is an academic researcher at the University of Leoben. The author has contributed to research on topics including regret and reinforcement learning. The author has an h-index of 40 and has co-authored 134 publications receiving 15406 citations. Previous affiliations of Peter Auer include Vienna University of Technology and University of California, Santa Cruz.
Papers
Journal Article
Finite-time Analysis of the Multiarmed Bandit Problem
TL;DR: This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
Journal Article
The Nonstochastic Multiarmed Bandit Problem
TL;DR: A solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs.
Journal Article
Using confidence bounds for exploitation-exploration trade-offs
TL;DR: It is shown how a standard tool from statistics, namely confidence bounds, can be used to deal elegantly with situations that exhibit an exploitation-exploration trade-off; the technique improves the regret from O(T^{3/4}) to O(T^{1/2}).
Journal Article
Near-optimal Regret Bounds for Reinforcement Learning
TL;DR: For undiscounted reinforcement learning in Markov decision processes (MDPs), this paper presents a reinforcement learning algorithm with total regret O(DS√(AT)) after T steps for any unknown MDP with S states, A actions per state, and diameter D.
Proceedings Article
Gambling in a rigged casino: The adversarial multi-armed bandit problem
TL;DR: A solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs is given.