Deterministic Policy Gradient Algorithms

Authors: David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller

Published: 2014 (Conference Paper)

Source: International Conference on Machine Learning

Algorithm: DPG

Summary

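Derives the deterministic policy gradient (DPG) theorem for model-free reinforcement learning with continuous action spaces. The theorem shows that the gradient of the performance objective with respect to the policy parameters is the expected gradient of the action-value function, so it can be estimated by integrating over the state space only, without integrating over actions. The paper also introduces compatible function approximation for the deterministic case and off-policy actor-critic algorithms that learn a deterministic target policy from an exploratory behaviour policy. These results form the theoretical foundation for DDPG and related methods.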
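As a rough illustration of the theorem, the gradient can be written as the average over states of grad_theta mu_theta(s) times grad_a Q(s, a) evaluated at a = mu_theta(s). The sketch below is not the paper's algorithm: it assumes a hypothetical toy problem with a linear policy mu_theta(s) = theta . s, a known quadratic critic Q(s, a) = -(a - w . s)^2, and a fixed batch of states that does not depend on the policy. The names `dpg_gradient`, `w`, and `states` are illustrative only.

```python
import numpy as np


def dpg_gradient(theta, states, w):
    """Sample-based deterministic policy gradient for a toy problem.

    Assumptions (not from the paper):
      policy  mu_theta(s) = theta . s        (scalar action)
      critic  Q(s, a)     = -(a - w . s)^2   (known in closed form)
    """
    actions = states @ theta                # a = mu_theta(s), shape (N,)
    dq_da = -2.0 * (actions - states @ w)   # grad_a Q(s, a) at a = mu_theta(s)
    dmu_dtheta = states                     # grad_theta mu_theta(s) = s, shape (N, d)
    # Average grad_theta mu * grad_a Q over the sampled states;
    # no integral over actions is needed.
    return (dmu_dtheta * dq_da[:, None]).mean(axis=0)


rng = np.random.default_rng(0)
w = np.array([1.0, -2.0])             # hypothetical optimal policy weights
theta = np.zeros(2)                   # initial policy parameters
states = rng.normal(size=(256, 2))    # fixed batch of sampled states

# Plain gradient ascent on the policy parameters.
for _ in range(200):
    theta = theta + 0.1 * dpg_gradient(theta, states, w)
```

In this toy setting the critic is maximised when the action equals w . s, so gradient ascent should drive `theta` toward `w`. In the actor-critic algorithms the paper describes, the critic would instead be learned, and states would come from the environment under a behaviour policy.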

Abstract

Links

Primary

Paper: proceedings.mlr.press

Tags

Reinforcement learning
Policy gradient
Deterministic policy
Actor-critic
Continuous action spaces
Model-free control