Peter Auer

Researcher at University of Leoben

Publications -  144
Citations -  17903

Peter Auer is an academic researcher at the University of Leoben. The author has contributed to research on topics including regret and reinforcement learning. The author has an h-index of 40 and has co-authored 134 publications receiving 15406 citations. Previous affiliations of Peter Auer include Vienna University of Technology and University of California, Santa Cruz.

Papers
Journal Article

Finite-time Analysis of the Multiarmed Bandit Problem

TL;DR: This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
Journal Article

The Nonstochastic Multiarmed Bandit Problem

TL;DR: A solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs.
Journal Article

Using confidence bounds for exploitation-exploration trade-offs

TL;DR: It is shown how a standard tool from statistics, namely confidence bounds, can be used to elegantly deal with situations that exhibit an exploitation-exploration trade-off, and the technique improves the regret from O(T^(3/4)) to O(T^(1/2)).
Journal Article

Near-optimal Regret Bounds for Reinforcement Learning

TL;DR: For undiscounted reinforcement learning in Markov decision processes (MDPs), this paper presents a reinforcement learning algorithm with total regret O(DS√(AT)) after T steps for any unknown MDP with S states, A actions per state, and diameter D.
Proceedings Article

Gambling in a rigged casino: The adversarial multi-armed bandit problem

TL;DR: A solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs is given.