Deterministic Policy Gradient Algorithms

Authors: David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller

Published: 2014 (Conference Paper)

Source: International Conference on Machine Learning

Algorithm: DPG

Summary

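Derives the deterministic policy gradient (DPG) theorem for model-free reinforcement learning with continuous action spaces. The theorem shows that the gradient of a deterministic policy's performance equals the expected gradient of the action-value function (taken with respect to the action and propagated to the policy parameters by the chain rule), so it can be estimated by averaging over states alone, without integrating over actions. The paper also introduces compatible function approximation for deterministic policies and off-policy actor-critic algorithms, and it forms the theoretical foundation for DDPG and related methods.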
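To make the form of the gradient in the summary concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a toy one-step problem in which the state distribution does not depend on the policy and the action-value function is known in closed form; the paper's actor-critic methods instead learn a critic from data. All names (mu, q, dq_da, dpg_estimate, the target slope w) are hypothetical and chosen for illustration. The sketch averages d(mu)/d(theta) * dQ/da over sampled states and checks the result against a finite-difference gradient of the objective.

```python
import random

def mu(theta, s):
    """Deterministic linear policy (illustrative): a = theta * s."""
    return theta * s

def q(s, a, w=2.0):
    """Toy, assumed-known action-value: largest when a = w * s."""
    return -(a - w * s) ** 2

def dq_da(s, a, w=2.0):
    """Analytic gradient of q with respect to the action."""
    return -2.0 * (a - w * s)

def dpg_estimate(theta, states):
    """Sample average of d(mu)/d(theta) * dQ/da, evaluated at a = mu(s).

    Only states are averaged over; no integral over actions is needed.
    """
    total = 0.0
    for s in states:
        dmu_dtheta = s  # derivative of theta * s with respect to theta
        total += dmu_dtheta * dq_da(s, mu(theta, s))
    return total / len(states)

def objective(theta, states):
    """Average value of following the policy: mean of Q(s, mu(s))."""
    return sum(q(s, mu(theta, s)) for s in states) / len(states)

def finite_difference(theta, states, eps=1e-5):
    """Central-difference gradient of the objective, used as a check."""
    return (objective(theta + eps, states)
            - objective(theta - eps, states)) / (2 * eps)

rng = random.Random(0)
states = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

theta0 = 0.5
g_dpg = dpg_estimate(theta0, states)
g_fd = finite_difference(theta0, states)

# Plain gradient ascent on theta using the deterministic policy gradient.
theta = theta0
for _ in range(200):
    theta += 0.5 * dpg_estimate(theta, states)

print("DPG estimate:     ", g_dpg)
print("finite difference:", g_fd)
print("theta after ascent:", theta)
```

In this toy setting the two gradient estimates agree up to floating-point error, and ascent moves theta toward the slope w at which the action maximizes q for every state. In a full sequential problem the state distribution also depends on the policy, which is the case the theorem itself addresses.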

Abstract

Tags

  • Reinforcement learning
  • Policy gradient
  • Deterministic policy
  • Actor-critic
  • Continuous action spaces
  • Model-free control