Probabilistic Differential Dynamic Programming
Authors: Yunpeng Pan, Evangelos A. Theodorou
Published: 2014 (Conference Paper)
Source: Neural Information Processing Systems (NeurIPS)
Algorithm: PDDP
DOI: 10.5555/2969033.2969040
Summary
Extends DDP to handle probabilistic dynamics models, propagating uncertainty through the backward and forward passes to produce trajectories that are robust to model uncertainty and compatible with learned probabilistic dynamics.
Abstract
We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal trajectory in Gaussian belief spaces. Different from typical gradient-based policy search methods, PDDP does not require a policy parameterization and learns a locally optimal, time-varying control policy. We demonstrate the effectiveness and efficiency of the proposed algorithm using two nontrivial tasks. Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
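To make the "second-order local approximation of the value function" concrete, below is a minimal sketch of the generic DDP/iLQR backward pass that PDDP builds on. In PDDP the state `z` is a Gaussian belief (mean and covariance of the GP-predicted state); this sketch only assumes linearized dynamics `Fz`, `Fu` and quadratic cost derivatives along a nominal trajectory, and omits the GP moment-matching step entirely. All function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def ddp_backward_pass(Fz, Fu, lz, lu, lzz, luu, lzu, Vz_T, Vzz_T, reg=1e-6):
    """Generic DDP/iLQR backward pass along a nominal trajectory.

    Fz, Fu:  lists of dynamics Jacobians (z_{t+1} ~ Fz[t] z_t + Fu[t] u_t)
    l*:      lists of first/second cost derivatives along the trajectory
    Vz_T, Vzz_T: terminal value-function gradient and Hessian
    Returns time-varying feedforward terms k[t] and feedback gains K[t],
    i.e. a locally optimal, time-varying control law du = k + K dz.
    """
    T = len(Fz)
    Vz, Vzz = Vz_T, Vzz_T
    k, K = [None] * T, [None] * T
    for t in reversed(range(T)):
        # Quadratic expansion of the state-action value function Q.
        Qz = lz[t] + Fz[t].T @ Vz
        Qu = lu[t] + Fu[t].T @ Vz
        Qzz = lzz[t] + Fz[t].T @ Vzz @ Fz[t]
        Quu = luu[t] + Fu[t].T @ Vzz @ Fu[t] + reg * np.eye(Fu[t].shape[1])
        Quz = lzu[t] + Fu[t].T @ Vzz @ Fz[t]
        # Minimize Q over the control perturbation.
        Quu_inv = np.linalg.inv(Quu)
        k[t] = -Quu_inv @ Qu
        K[t] = -Quu_inv @ Quz
        # Propagate the value-function expansion one step backward.
        Vz = Qz + K[t].T @ Quu @ k[t] + K[t].T @ Qu + Quz.T @ k[t]
        Vzz = Qzz + K[t].T @ Quu @ K[t] + K[t].T @ Quz + Quz.T @ K[t]
    return k, K

# Sanity check on a scalar, time-invariant linear-quadratic problem:
T = 50
I = np.eye(1)
k, K = ddp_backward_pass(
    Fz=[I] * T, Fu=[I] * T,
    lz=[np.zeros(1)] * T, lu=[np.zeros(1)] * T,
    lzz=[I] * T, luu=[I] * T, lzu=[np.zeros((1, 1))] * T,
    Vz_T=np.zeros(1), Vzz_T=I,
)
```

For this scalar problem the gains converge toward the stationary discrete-time LQR gain (about -0.618 with A = B = Q = R = 1), which is a quick way to check the recursion. PDDP runs the same recursion, but over the belief-state expansion obtained from the GP dynamics model, so the resulting policy accounts for model uncertainty.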
Tags
- Trajectory optimization
- Differential dynamic programming
- DDP
- Probabilistic differential dynamic programming
- PDDP
- Probabilistic methods
- Learning
- Uncertainty