
Learning cooperation for partially observable multi-agent path finding

Resource type: Thesis
Thesis type: (Thesis) M.Sc.
Date created:
Author: Lin, Qiushi
In many real-world applications of multi-agent systems, decentralized agents have to cooperate to plan their collision-free paths under partial observation. Recently, many works have introduced multi-agent reinforcement learning (RL) to solve this partially observable variant of Multi-Agent Path Finding (MAPF) by learning homogeneous policies through centralized training and decentralized execution. However, complex multi-agent cooperation toward optimizing one or multiple objectives is hard to achieve with existing learning-based methods due to the curse of multiagency. In this thesis, we aim to design algorithms that learn multi-agent cooperation for path planning toward various objectives. To tackle single-objective cooperation for partially observable MAPF, we propose Soft Actor-Critic with Heuristic-Based Attention (SACHA), a novel multi-agent actor-critic RL framework that enables the learned model to generalize across multiple instances. Moreover, we investigate the decentralized variant of a similar problem, Moving Agents in Formation (MAiF), which combines path planning with formation control. To learn bi-objective cooperation, we propose Mean-Field Control with Envelope Q-learning (MFC-EQ), a scalable and adaptable learning framework for balancing two specific objectives among decentralized agents. We provide theoretical analyses showing the effectiveness of our methods and empirical evaluations demonstrating that they outperform baselines on numerous instances in various environments.
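The bi-objective balancing in MFC-EQ builds on envelope Q-learning, in which the Q-function is vector-valued (one component per objective) and actions are chosen by scalarizing that vector with a preference weight. A minimal sketch of the action-selection idea, assuming a tabular setting; the states, Q-values, and weights below are illustrative and not taken from the thesis:

```python
# Envelope-style action selection over a vector-valued Q-function:
# each action has one Q estimate per objective, a preference weight w
# scalarizes the vector, and the greedy action maximizes w . Q(s, a).

def scalarize(q_vec, w):
    """Dot product of a vector-valued Q estimate with a preference weight."""
    return sum(q * wi for q, wi in zip(q_vec, w))

def greedy_action(q_table, state, w):
    """Pick the action whose scalarized Q-value is largest."""
    return max(q_table[state], key=lambda a: scalarize(q_table[state][a], w))

# Illustrative two-objective Q-table: objective 0 = path efficiency,
# objective 1 = formation keeping (the numbers are made up).
q_table = {
    "s0": {"left": (1.0, 0.2), "right": (0.3, 0.9)},
}

print(greedy_action(q_table, "s0", (0.8, 0.2)))  # favors efficiency -> "left"
print(greedy_action(q_table, "s0", (0.2, 0.8)))  # favors formation -> "right"
```

Varying the preference weight at execution time is what lets a single learned model adapt to different trade-offs between the two objectives.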
42 pages.
Copyright statement
Copyright is held by the author(s).
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Ma, Hang
Download file: etd22853.pdf (3.17 MB)
