How are policy gradient methods affected by the limits of control?

Authors: Ingvar Ziemann, Anastasios Tsiamis, Henrik Sandberg, Nikolai Matni

Published: 2022

arXiv: 2206.06863

Abstract

We study stochastic policy gradient methods from the perspective of control-theoretic limitations. Our main result is that ill-conditioned linear systems in the sense of Doyle inevitably lead to noisy gradient estimates. We also give an example of a class of stable systems in which policy gradient methods suffer from the curse of dimensionality. Our results apply to both state feedback and partially observed systems.
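To make the abstract's central claim concrete, here is a minimal illustrative sketch (not the paper's construction) of why policy gradient estimates are noisy in a control setting: a scalar linear system with a linear state-feedback gain, quadratic cost, and a two-point zeroth-order gradient estimator. All names, parameter values, and the estimator choice are assumptions for illustration only.

```python
import random
import statistics

# Hypothetical toy setup: scalar system x_{t+1} = a*x_t + u_t + w_t with
# state-feedback policy u_t = -k*x_t and quadratic stage cost x^2 + u^2.
# Each rollout is stochastic, so each gradient estimate is a random variable.

def rollout_cost(k, a=0.9, horizon=50, noise=0.1):
    """Simulate the closed loop once and return the accumulated cost."""
    x, cost = 1.0, 0.0
    for _ in range(horizon):
        u = -k * x                        # linear state-feedback policy
        cost += x * x + u * u             # quadratic stage cost
        x = a * x + u + noise * random.gauss(0.0, 1.0)  # process noise
    return cost

def grad_estimate(k, delta=0.05):
    """Two-point zeroth-order estimate of dJ/dk from two noisy rollouts."""
    return (rollout_cost(k + delta) - rollout_cost(k - delta)) / (2 * delta)

# Repeated estimates at the same gain scatter widely: the estimator's
# variance is driven by the process noise entering each rollout.
estimates = [grad_estimate(0.5) for _ in range(200)]
print("mean gradient estimate:", statistics.mean(estimates))
print("std of estimates:      ", statistics.stdev(estimates))
```

The spread printed on the last line is the "noisy gradient estimates" phenomenon in miniature; the paper's contribution is to show that for ill-conditioned systems this noise is unavoidable, not an artifact of a particular estimator.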