Deterministic Policy Gradient Algorithms
Authors: David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller
Published: 2014 (Conference Paper)
Source: International Conference on Machine Learning
Algorithm: DPG
Summary
Derives the deterministic policy gradient (DPG) theorem for model-free reinforcement learning with continuous action spaces. The theorem shows that the gradient of a deterministic policy's performance equals the expected gradient of the action-value function, so it can be estimated by averaging over states alone, without integrating over actions. The paper also introduces compatible function approximation for deterministic policies and off-policy actor-critic algorithms, forming the theoretical foundation for DDPG and related methods.
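As a rough illustration of the update the theorem implies, the sketch below moves the policy parameter along grad_theta mu_theta(s) * grad_a Q(s, a), evaluated at a = mu_theta(s) and averaged over sampled states. Everything in it is an assumption made for illustration and none of it comes from the paper: the one-step toy problem, the hand-written critic Q(s, a) = -(a - s)^2, the linear policy mu_theta(s) = theta * s, the uniform state distribution, and the name dpg_toy. In the paper's actor-critic algorithms the critic is learned rather than given.

```python
import numpy as np


def dpg_toy(num_steps=2000, lr=0.05, seed=0):
    """Toy deterministic policy gradient ascent with a known critic.

    Hypothetical setup (not from the paper):
      policy  mu_theta(s) = theta * s
      critic  Q(s, a)     = -(a - s)**2, maximised at a = s
    so the best policy parameter is theta = 1.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(num_steps):
        s = rng.uniform(-1.0, 1.0)   # sample a state (assumed uniform)
        a = theta * s                # deterministic action, no sampling over actions
        dq_da = -2.0 * (a - s)       # grad_a Q(s, a) evaluated at a = mu_theta(s)
        dmu_dtheta = s               # grad_theta mu_theta(s)
        # Chain-rule update: only states are sampled, never actions.
        theta += lr * dmu_dtheta * dq_da
    return theta


theta_final = dpg_toy()
print(theta_final)
```

In this toy setting each update shrinks the gap 1 - theta by the factor 1 - 2 * lr * s^2, so theta approaches 1. Only a state is drawn at each step; the action is computed from the policy, which is the practical consequence of not having to integrate over actions.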
Abstract
Links
Primary
Tags
- Reinforcement learning
- Policy gradient
- Deterministic policy
- Actor-critic
- Continuous action spaces
- Model-free control