Deep neural networks for robot navigation using visual teach and repeat

Thesis type
(Thesis) M.Sc.
Abstract
Navigation is an integral component of any autonomous mobile robotic system. Typical approaches to navigation consist of metric localization based on sensory information and planning toward a metric target. Visual navigation is an instance of navigation that uses only visual information for localization. This thesis demonstrates: 1) sparse semantic object detections as robust features for visual teach and repeat; 2) deep convolutional neural network (CNN) models to extract high-level features for visual teach and repeat; 3) visual localization-space trails for multi-agent navigation. The first application demonstrates an autonomous unmanned aerial vehicle (UAV) equipped with a monocular camera that is capable of repeating a taught path using only sparse semantic object detections. This method employs a pre-trained deep model to detect semantic objects, and the model reliably detects objects in video at frame rate. We show that semantic objects are sufficient to serve as landmarks for visual teach and repeat: they are repeatable and highly invariant to changes in lighting and surface appearance. The second application extends the first and proposes deep CNN features for visual teach and repeat. This method utilizes salient features generated by a deep model, such as textures and patterns, rather than semantic features alone. Two deep neural networks are used in this system: 1) a pre-trained model capable of extracting salient features for place recognition; 2) a deep model trained from scratch that is responsible for short-range visual navigation. The third application demonstrates multiple robots, each equipped with a camera, collaboratively navigating in an unknown environment. This method is inspired by ant-foraging behaviour and employs the deep models proposed in the second application for salient feature extraction and short-range navigation.
The system is evaluated in a 3D simulation via a set of experiments that demonstrate the effectiveness of the proposed method for collaborative visual navigation.
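To illustrate the teach-and-repeat idea described above, the minimal sketch below (not taken from the thesis; the data structures, the Jaccard label-set similarity, and the bearing-based correction are all assumptions) matches a live frame's semantic detections against stored teach frames and derives a proportional heading correction from the bearings of shared landmarks:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Detection:
    label: str      # semantic class, e.g. "door" (hypothetical example)
    bearing: float  # horizontal bearing in radians, 0 = image centre

def frame_similarity(taught, observed):
    """Jaccard similarity over the sets of semantic labels in two frames."""
    a = {d.label for d in taught}
    b = {d.label for d in observed}
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def localize(teach_path, observed):
    """Return the index of the teach frame whose detections best match `observed`."""
    scores = [frame_similarity(frame, observed) for frame in teach_path]
    return max(range(len(teach_path)), key=scores.__getitem__)

def steering_correction(taught, observed, gain=1.0):
    """Proportional heading correction from bearing errors of shared labels."""
    taught_by_label = {d.label: d.bearing for d in taught}
    errors = [taught_by_label[d.label] - d.bearing
              for d in observed if d.label in taught_by_label]
    return gain * sum(errors) / len(errors) if errors else 0.0
```

In a real system the detections would come from a pre-trained object detector and the matching would be restricted to a window around the previous estimate; this sketch only shows why sparse, repeatable semantic labels are enough to localize along a taught path.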
Copyright statement
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Vaughan, Richard