Regret Bounds for the Adaptive Control of Linear Quadratic Systems

Authors: Yasin Abbasi-Yadkori, Csaba Szepesvari

Published: 2011 (Conference Paper)

Source: Conference on Learning Theory

Algorithm: OFU-LQ

Summary

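Derives a high-probability regret bound of order sqrt(T), up to logarithmic factors, for adaptive control of linear-quadratic (LQ) systems whose dynamics are unknown. The OFU-LQ algorithm needs no forced exploration: it maintains a confidence set around the estimated model parameters and acts optimally for the most favorable model in that set (optimism in the face of uncertainty). The authors present this as the first regret bound for the LQ control problem, and the analysis connects online learning to adaptive control theory.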
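The optimistic step described above can be illustrated with a minimal sketch for a scalar system x' = a*x + b*u + noise with stage cost q*x^2 + r*u^2. This is not the paper's implementation. The function names, the box-shaped confidence set, and the grid search are simplifying assumptions made here for illustration; the paper works with matrix-valued parameters and a confidence set built from regularized least-squares estimates.

```python
import itertools


def riccati_scalar(a, b, q=1.0, r=1.0, iters=500):
    """Iterate the scalar discrete-time Riccati recursion toward its fixed point.

    The fixed point p is proportional to the optimal average cost of the
    model (a, b), so a smaller p means a more favorable model.
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def gain_scalar(a, b, p, r=1.0):
    """Optimal feedback gain k for the model (a, b); the control is u = -k * x."""
    return a * b * p / (r + b * b * p)


def optimistic_model(a_hat, b_hat, radius, grid=21):
    """Return the (a, b) with the lowest optimal cost in a box around the estimate.

    Assumption: the confidence set is the box [a_hat +/- radius] x [b_hat +/- radius],
    searched on a uniform grid. Both choices are illustrative only.
    """
    offsets = [radius * (2 * i / (grid - 1) - 1) for i in range(grid)]
    candidates = [
        (a_hat + da, b_hat + db) for da, db in itertools.product(offsets, offsets)
    ]
    return min(candidates, key=lambda ab: riccati_scalar(*ab))


# Hypothetical estimate and confidence radius, chosen only for the example.
a_opt, b_opt = optimistic_model(0.9, 0.5, radius=0.1)
p_opt = riccati_scalar(a_opt, b_opt)
k_opt = gain_scalar(a_opt, b_opt, p_opt)
print(a_opt, b_opt, p_opt, k_opt)
```

In this scalar setting the optimistic choice lands on the model with the weakest open-loop dynamics and the strongest control authority inside the box. A full adaptive loop would apply u = -k_opt * x, re-estimate (a, b) from the observed transitions, shrink the confidence set, and repeat.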

Abstract

Links

Primary

Paper proceedings.mlr.press

Alternate

PDF proceedings.mlr.press

Tags

Reinforcement learning
Linear quadratic regulator
Adaptive control
Regret bounds
Exploration-exploitation
System identification