Regret Bounds for the Adaptive Control of Linear Quadratic Systems

Authors: Yasin Abbasi-Yadkori, Csaba Szepesvári

Published: 2011 (Conference Paper)

Source: Conference on Learning Theory

Algorithm: OFU-LQ

Summary

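Derives an O(sqrt(T)) regret bound, up to logarithmic factors and holding with high probability, for adaptive control of linear-quadratic systems with unknown model parameters. Instead of forced exploration, the algorithm builds a high-probability confidence set around the estimated model parameters and acts optimistically with respect to that set. The authors present this as the first regret bound for the LQ control problem, and the analysis connects online learning to adaptive control theory.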
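The optimistic-control idea can be illustrated with a minimal sketch on a scalar system x' = a*x + b*u + w with cost x^2 + u^2. This is not the authors' implementation: the fixed confidence radius beta, the grid search over the confidence ellipsoid, the determinant-doubling update trigger as written here, and every function name are assumptions chosen for illustration.

```python
import math
import random


def solve_riccati(a, b, q=1.0, r=1.0):
    """Positive solution p of the scalar Riccati equation
    p = q + a^2 p - (a b p)^2 / (r + b^2 p); inf if no finite solution."""
    bb = b * b
    c = r - q * bb - a * a * r
    disc = math.sqrt(c * c + 4.0 * bb * q * r)
    if c >= 0.0:
        denom = c + disc
        return 2.0 * q * r / denom if denom > 0.0 else math.inf
    return (disc - c) / (2.0 * bb) if bb > 0.0 else math.inf


def lqr_gain(a, b, p, r=1.0):
    """Feedback gain k for the control law u = -k x (0 if p is not finite)."""
    return a * b * p / (r + b * b * p) if math.isfinite(p) else 0.0


def optimistic_params(theta_hat, V, beta, n=21):
    """Grid-search the ellipsoid (theta - theta_hat)' V (theta - theta_hat) <= beta
    for the parameters with the smallest Riccati solution (lowest optimal cost)."""
    a_hat, b_hat = theta_hat
    v11, v12, v22 = V[0][0], V[0][1], V[1][1]
    det = v11 * v22 - v12 * v12
    half_a = math.sqrt(beta * v22 / det)  # bounding box of the ellipsoid
    half_b = math.sqrt(beta * v11 / det)
    best = (solve_riccati(a_hat, b_hat), a_hat, b_hat)
    for i in range(n):
        da = half_a * (2.0 * i / (n - 1) - 1.0)
        for j in range(n):
            db = half_b * (2.0 * j / (n - 1) - 1.0)
            if v11 * da * da + 2.0 * v12 * da * db + v22 * db * db > beta:
                continue  # candidate lies outside the confidence set
            p = solve_riccati(a_hat + da, b_hat + db)
            if p < best[0]:
                best = (p, a_hat + da, b_hat + db)
    return best


def run(T=200, a_true=0.7, b_true=0.5, lam=1.0, beta=1.0, seed=0):
    """Simulate the sketch; beta is a hypothetical fixed radius, not a derived one."""
    rng = random.Random(seed)
    V = [[lam, 0.0], [0.0, lam]]  # regularized Gram matrix of z = (x, u)
    s = [0.0, 0.0]                # running sum of z * x_next
    theta_hat = (0.0, 0.0)
    x, k, cost, det_at_update = 0.0, 0.0, 0.0, 0.0
    for _ in range(T):
        det = V[0][0] * V[1][1] - V[0][1] * V[0][1]
        if det > 2.0 * det_at_update:  # recompute the policy only when det(V) doubles
            theta_hat = ((V[1][1] * s[0] - V[0][1] * s[1]) / det,
                         (V[0][0] * s[1] - V[0][1] * s[0]) / det)
            p, a_opt, b_opt = optimistic_params(theta_hat, V, beta)
            k = lqr_gain(a_opt, b_opt, p)
            det_at_update = det
        u = -k * x
        cost += x * x + u * u
        x_next = a_true * x + b_true * u + rng.gauss(0.0, 1.0)
        V[0][0] += x * x
        V[0][1] += x * u
        V[1][0] += x * u
        V[1][1] += u * u
        s[0] += x * x_next
        s[1] += u * x_next
        x = x_next
    # theta_hat is the estimate from the most recent policy update
    return theta_hat, cost / T
```

Calling run() returns the least-squares estimate from the last policy update together with the average cost per step. No forced exploration noise is injected into the input; any excitation comes from the optimistic choice of parameters and from the policy changes themselves.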

Tags

  • Reinforcement learning

  • Linear quadratic regulator

  • Adaptive control

  • Regret bounds

  • Exploration-exploitation

  • System identification