Sampled Differential Dynamic Programming¶
Authors: Joose Rajamäki, Kourosh Naderi, Ville Kyrki, Perttu Hämäläinen
Published: 2016 (Conference Paper)
Source: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Algorithm: SaDDP
DOI: 10.1109/IROS.2016.7759229
Summary¶
Combines DDP and path integral control by estimating the DDP Hessian via zero-order sampling rather than analytical differentiation, yielding a trajectory optimizer that blends the structure and efficiency of DDP with the robustness and simplicity of sampling-based methods. Think of it as Hessian-free optimization (using a zero-order oracle to estimate the Hessian; cf. "Deep Learning via Hessian-free Optimization" by James Martens, 2010) specialized to trajectory optimization problems.
Abstract¶
We present SaDDP, a sampled version of the widely used differential dynamic programming (DDP) control algorithm. We contribute through establishing a novel connection between two major branches of robotics control research, that is, gradient-based methods such as DDP, and Monte Carlo methods such as path integral control (PI) that utilize random simulated trajectory rollouts. One of our key observations is that the Taylor expansion, central to DDP, can be reformulated in terms of second-order statistics computed from the sampled trajectories. SaDDP makes few assumptions about the controlled system and works with black-box dynamics simulations with non-smooth contacts. Our simulation results show that the method outperforms PI and CMA-ES in both a simple linear-quadratic problem, and a multilink arm reaching task with obstacles.
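The core idea above, that a local quadratic (Taylor) model can be recovered purely from sampled rollouts of a black-box cost, can be illustrated with a small sketch. The paper derives the Hessian from second-order statistics of the samples; the sketch below instead uses a least-squares quadratic fit as an illustrative stand-in, on a hypothetical 2-D quadratic cost with a known minimum. All names and constants here are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box cost around a nominal control u0: an exact
# quadratic with known minimizer, standing in for one DDP backup step.
H_true = np.array([[3.0, 0.5], [0.5, 2.0]])
u_star = np.array([1.0, -0.5])

def cost(u):
    d = u - u_star
    return 0.5 * d @ H_true @ d

# Zero-order estimate of the local quadratic model from sampled rollouts:
# fit c(u0 + du) ~ a + g.du + 0.5 du'H du by least squares on perturbations.
u0 = np.zeros(2)
n, sigma = 200, 0.3
U = u0 + sigma * rng.standard_normal((n, 2))
y = np.array([cost(u) for u in U])

# Design matrix: [1, du1, du2, 0.5*du1^2, du1*du2, 0.5*du2^2]
D = U - u0
X = np.column_stack([np.ones(n), D[:, 0], D[:, 1],
                     0.5 * D[:, 0]**2, D[:, 0] * D[:, 1], 0.5 * D[:, 1]**2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
g = coef[1:3]                                   # sampled gradient estimate
H = np.array([[coef[3], coef[4]],               # sampled Hessian estimate
              [coef[4], coef[5]]])

# Newton/DDP-style update using the sampled gradient and Hessian.
u_new = u0 - np.linalg.solve(H, g)
print(np.round(H, 2))      # recovers H_true
print(np.round(u_new, 2))  # recovers u_star
```

Because the toy cost is exactly quadratic, the fit recovers the true Hessian and minimizer up to numerical precision; with noisy, non-smooth black-box dynamics the estimate degrades gracefully with sample count, which is the regime SaDDP targets.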
Tags¶
- Trajectory optimization
- Differential dynamic programming
- Sampled differential dynamic programming
- Sampling-based control
- Sampling-based planning
- Hessian-free optimization
- Path integral control
- Evolution strategies
- CMA-ES
- Taylor expansion
- Gradient-based