Probabilistic Differential Dynamic Programming

Authors: Yunpeng Pan, Evangelos A. Theodorou

Published: 2014 (Conference Paper)

Source: Neural Information Processing Systems (NeurIPS)

Algorithm: PDDP

DOI: 10.5555/2969033.2969040

Summary

Extends DDP to probabilistic dynamics models learned with Gaussian processes, propagating uncertainty through both the backward and forward passes. The resulting locally optimal trajectories are robust to model error and directly compatible with data-driven, learned dynamics.
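The "learned probabilistic dynamics" part rests on standard GP regression: rather than a point prediction of the next state, the model returns a posterior mean and variance, and it is this variance that PDDP propagates through its passes. A minimal sketch of that predictive step (my own helper names, squared-exponential kernel, fixed hyperparameters; not the paper's implementation) could look like:

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0, sf=1.0):
    # Squared-exponential kernel between row-vector inputs A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, Xs, noise=1e-2):
    # GP posterior mean and per-point variance at test inputs Xs,
    # given training pairs (X, y). Cholesky for numerical stability.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(Xs, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(rbf_kernel(Xs, Xs)) - (v**2).sum(axis=0)
    return mean, var
```

Near the training data the predictive variance is small; far from it, the variance grows back toward the prior — exactly the uncertainty signal a probabilistic trajectory optimizer can exploit.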

Abstract

We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal trajectory in Gaussian belief spaces. Different from typical gradient-based policy search methods, PDDP does not require a policy parameterization and learns a locally optimal, time-varying control policy. We demonstrate the effectiveness and efficiency of the proposed algorithm using two nontrivial tasks. Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
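The "second-order local approximation of the value function" in the abstract is the classical DDP/LQR backbone that PDDP extends into Gaussian belief space. As an illustrative sketch only (deterministic linear dynamics and quadratic cost — not the paper's GP-based belief-space formulation), the backward pass keeps the value function quadratic and yields a time-varying linear feedback policy:

```python
import numpy as np

def lqr_backward(A, B, Q, R, Qf, T):
    # Riccati-style backward pass over horizon T.
    # The value function V_t(x) = x' P_t x stays quadratic; minimizing the
    # quadratic Q-function over u gives a linear feedback law u = -K_t x.
    P = Qf
    gains = []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1], P  # gains ordered from t = 0 to t = T-1
```

PDDP performs an analogous recursion, but over the mean and covariance of a Gaussian belief rather than a deterministic state, with the GP model supplying the (uncertain) dynamics linearization — which is how it avoids any explicit policy parameterization while still returning a locally optimal, time-varying controller.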

Tags

  • Trajectory optimization

  • Differential dynamic programming

  • DDP

  • Probabilistic differential dynamic programming

  • PDDP

  • Probabilistic methods

  • Learning

  • Uncertainty