scispace - formally typeset
Journal ArticleDOI

Learning from delayed rewards

Ben Kröse
- 01 Oct 1995 - 
- Vol. 15, Iss: 4, pp 233-235
TLDR
The invention relates to a circuit for use in a receiver which can receive two-tone/stereo signals which is intended to make a choice between mono or stereo reproduction of signal A or of signal B and vice versa.
About
This article is published in Robotics and Autonomous Systems.The article was published on 1995-10-01. It has received 2861 citations till now. The article focuses on the topics: Autonomous system (mathematics) & Robotics.

read more

Citations
More filters
Journal ArticleDOI

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Book

Genetic Programming: On the Programming of Computers by Means of Natural Selection

TL;DR: This book discusses the evolution of architecture, primitive functions, terminals, sufficiency, and closure, and the role of representation and the lens effect in genetic programming.
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methododogy of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
Journal ArticleDOI

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reInforcement tasks, and they do this without explicitly computing gradient estimates.
Journal ArticleDOI

Ant colony system: a cooperative learning approach to the traveling salesman problem

TL;DR: The results show that the ACS outperforms other nature-inspired algorithms such as simulated annealing and evolutionary computation, and it is concluded comparing ACS-3-opt, a version of the ACS augmented with a local search procedure, to some of the best performing algorithms for symmetric and asymmetric TSPs.
Related Papers (5)