
Efficient, interpretable robot learning via control-theoretic approaches

Resource type
Thesis
Thesis type
(Thesis) Ph.D.
Date created
2024-01-25
Authors/Contributors
Author: Lyu, Xubo
Abstract
The realization of robotic intelligence requires robots to autonomously sense, plan, and control themselves in unknown environments. With advances in machine learning and deep neural networks, learning-based approaches have demonstrated great potential in enabling robots to accomplish complex tasks by actively learning from data. Reinforcement Learning (RL), a representative technique, empowers robots to learn skills from interactive experience with the environment through high-dimensional states and observations. However, learning-based approaches can suffer from data inefficiency, making them costly and often impractical to apply to real-world physical systems. They also lack formal mathematical tools for analyzing and interpreting the learned results. In contrast, long-established classical control-theoretic approaches can model dynamical systems and derive control policies in a data-efficient way, and they come with well-developed theories for system and control analysis. However, they typically assume prior knowledge of the system dynamics and the environment and are limited to small-scale problems with low-dimensional state spaces. In this dissertation, we aim to combine control-theoretic principles with modern learning-based approaches to achieve an efficient and interpretable robot learning process while scaling up classical control techniques. In Chapter 3, we present methods that improve learning efficiency using control-based value functions defined on subspaces of the high-dimensional problems that learning-based approaches are tasked to solve. These value functions are computed efficiently via control approaches and can be seamlessly integrated into the learning process as novel reward and baseline functions. In Chapter 4, we study centralized multi-agent learning efficiency under a hierarchical framework and show that a conditional policy, coupled with a special form of trajectory, can achieve efficient asynchronous, hierarchical decision-making. In Chapter 5, we incorporate a linear controller and a linear latent model into gradient-driven, end-to-end optimization over a contrastive latent representation space, making the learned system and controller interpretable while extending classical control from low-dimensional settings to high-dimensional, complex nonlinear scenarios.
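
As a hypothetical illustration of the Chapter 3 idea (this is not code from the thesis), the sketch below computes a quadratic LQR cost-to-go on a low-dimensional subspace and uses its negation as a baseline for a reward-maximizing policy-gradient learner; the double-integrator matrices, function names, and numeric values are assumptions made purely for illustration.

    import numpy as np

    def lqr_value_matrix(A, B, Q, R, iters=500):
        # Fixed-point iteration of the discrete-time Riccati equation;
        # the cost-to-go on the subspace is V(x) = x^T P x.
        P = Q.copy()
        for _ in range(iters):
            K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
            P = Q + A.T @ P @ (A - B @ K)
        return P

    def control_baseline(x_sub, P):
        # Negated quadratic cost-to-go, so higher is better,
        # matching the sign convention of a value-style baseline.
        return -float(x_sub @ P @ x_sub)

    # Assumed example: a double-integrator (position, velocity) subspace.
    dt = 0.1
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.0], [dt]])
    Q = np.eye(2)
    R = np.array([[0.1]])
    P = lqr_value_matrix(A, B, Q, R)

    x_sub = np.array([1.0, 0.0])   # subspace projection of a full state
    episode_return = 3.2           # placeholder Monte Carlo return from a rollout
    advantage = episode_return - control_baseline(x_sub, P)

In this sketch the control-derived baseline plays the variance-reduction role that a learned critic usually plays; how the thesis actually constructs and integrates its control-based value functions is detailed in the document itself.
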
Document
Extent
90 pages.
Identifier
etd22928
Copyright statement
Copyright is held by the author(s).
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Chen, Mo
Language
English
Member of collection
Download file
etd22928.pdf (7.86 MB)

Views & downloads - as of June 2023

Views: 45
Downloads: 4