Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems

Authors: Andrew G. Barto, Richard S. Sutton, Charles W. Anderson

Published: 1983 (Journal Paper)

Source: IEEE Transactions on Systems, Man, and Cybernetics

Algorithm: Actor-Critic (ASE/ACE)

DOI: 10.1109/TSMC.1983.6313077

Summary

Introduced the actor-critic architecture using two neuronlike elements (ASE and ACE) to solve the cart-pole balancing task, one of the earliest demonstrations of reinforcement learning for continuous control. The ASE/ACE split directly foreshadows modern actor-critic algorithms, and the cart-pole problem became a standard RL benchmark.

Abstract

It is shown how a system consisting of two neuronlike adaptive elements can solve a difficult learning control problem. The task is to balance a pole that is hinged to a movable cart by applying forces to the cart's base. The learning system consists of a single associative search element (ASE) and a single adaptive critic element (ACE). In the course of learning to balance the pole, the ASE constructs associations between input and output by searching under the influence of reinforcement feedback, and the ACE constructs a more informative evaluation function than reinforcement feedback alone can provide.
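The ASE/ACE interaction described above can be sketched in a few lines: the ACE (critic) forms a temporal-difference-style internal reinforcement signal from its learned evaluations, and that signal drives weight updates in both elements, with eligibility traces marking recently active states. This is a minimal illustrative sketch, not the paper's exact implementation; the state count follows the paper's 162-box discretization of the cart-pole state, but the learning rates, trace decays, and noise scale here are assumed placeholder values.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 162              # the paper's "boxes" discretization of cart-pole state
alpha, beta = 1000.0, 0.5   # ASE / ACE learning rates (illustrative values)
gamma = 0.95                # discount factor used by the ACE
lam_w, lam_v = 0.9, 0.8     # eligibility-trace decay rates (assumed values)

w = np.zeros(N_STATES)      # ASE (actor) weights
v = np.zeros(N_STATES)      # ACE (critic) evaluation weights
e = np.zeros(N_STATES)      # ASE eligibility trace (state-action)
x_bar = np.zeros(N_STATES)  # ACE state trace

def ase_action(s):
    """ASE output: noisy thresholded weighted sum -> push left (-1) or right (+1).
    The noise provides the 'associative search' over actions."""
    noise = rng.normal(0.0, 0.01)
    return 1 if w[s] + noise >= 0.0 else -1

def learn(s, s_next, r, a, failed):
    """One update after taking action a in state s and landing in s_next.
    The ACE turns sparse failure reinforcement r into a denser internal
    reinforcement r_hat, which trains both elements."""
    p_next = 0.0 if failed else v[s_next]
    r_hat = r + gamma * p_next - v[s]     # TD-style internal reinforcement
    v[:] += beta * r_hat * x_bar          # ACE: improve evaluations
    w[:] += alpha * r_hat * e             # ASE: reinforce recent actions
    e[:] *= lam_w                         # decay traces, then mark the
    e[s] += (1 - lam_w) * a               # visited state-action pair
    x_bar[:] *= lam_v
    x_bar[s] += (1 - lam_v)
```

A training loop would call `ase_action` to pick a force, step a cart-pole simulator, and call `learn` with reinforcement r = -1 on failure (pole falls or cart leaves the track) and 0 otherwise.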

Tags

  • Reinforcement learning

  • Actor-critic

  • Cart-pole

  • Temporal difference learning

  • Adaptive control

  • Neural networks