Learning from delayed rewards

doi:10.1016/0921-8890(95)00026-C

Journal Article

Ben Kröse

- 01 Oct 1995 -

Robotics and Autonomous Systems

- Vol. 15, Iss: 4, pp 233-235

About:

This article is published in Robotics and Autonomous Systems. The article was published on 1995-10-01. It has received 2861 citations to date. The article focuses on the topics: Autonomous system (mathematics) & Robotics.

Citations

Journal Article

Deep learning in neural networks

Jürgen Schmidhuber

- 01 Jan 2015 -

Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, and reviews deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

Book

Genetic Programming: On the Programming of Computers by Means of Natural Selection

John R. Koza

TL;DR: This book discusses the evolution of architecture, primitive functions, terminals, sufficiency, and closure, and the role of representation and the lens effect in genetic programming.

Book

Dynamic Programming and Optimal Control

Dimitri P. Bertsekas

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.

Journal Article

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Ronald J. Williams

- 01 May 1992 -

Machine Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.

Journal Article

Ant colony system: a cooperative learning approach to the traveling salesman problem

Marco Dorigo, +1 more

- 01 Apr 1997 -

IEEE Transactions on Evolutionary Comput...

TL;DR: The results show that the ACS outperforms other nature-inspired algorithms such as simulated annealing and evolutionary computation. The article concludes by comparing ACS-3-opt, a version of the ACS augmented with a local search procedure, to some of the best performing algorithms for symmetric and asymmetric TSPs.
