Global Structure-from-Motion and Its Application

Date created: 
3D reconstruction
Video alignment

Structure-from-motion (SfM) is a fundamental problem in 3D computer vision that aims to recover camera poses and 3D scene structure simultaneously from a set of 2D images. SfM methods can be broadly divided into incremental and global methods according to how they register cameras. Incremental methods register cameras one by one, while global SfM methods solve for all cameras simultaneously from all available relative motions. As a result, global SfM has greater potential than incremental SfM in both reconstruction accuracy and computational efficiency. In this thesis, we address two challenges of global SfM. Our goal is to propose a robust and efficient global SfM system that is applicable to all kinds of motions and datasets. The first challenge is that translation averaging in global SfM is difficult, since the input relative motion between two cameras does not encode scale information. Therefore, many existing global SfM methods fail on data whose measurement graph is not parallel rigid, for example when all cameras lie on the same line. To tackle this challenge, we propose a global SfM method based on a novel linear relationship within camera triplets. Our formulation encodes the scale information through the baseline length ratios within each camera triplet, which helps handle collinear camera motion. We further extend the linear relationship within camera triplets to linear constraints for cameras seeing a common scene point, which can improve global translation estimation for data with weak image association. The second challenge is that global SfM methods are fragile on noisy data: because global SfM considers all relative relationships together, a single incorrect pairwise relationship may distort the result greatly. To deal with this challenge, we propose a novel global SfM pipeline in which camera registration is formulated as a well-posed similarity averaging problem solved robustly with L1 optimization.
Moreover, the novel pipeline makes the filtering of noisy relative poses simple and effective, which further improves the robustness of global SfM. We show the effectiveness of our global SfM system by applying it to the video alignment problem, which aims to find per-pixel correspondences between two video sequences in both the spatial and temporal dimensions. Guided by the 3D information from global SfM, the proposed video registration method can align videos taken at different times with substantially different appearances, in the presence of moving objects and moving cameras with slightly different trajectories.
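
The abstract motivates L1 optimization by noting that one incorrect pairwise relationship can distort a global solution. The following is a minimal sketch of that general effect only, not the thesis's similarity averaging solver, whose details the abstract does not give. It assumes a toy setting in which several noisy estimates of a single 3D position are combined, and it compares a least-squares (L2) average with an L1 estimate computed by Weiszfeld iterations. The function names, the data, and the choice of Weiszfeld's algorithm are all illustrative assumptions.

```python
import math


def l2_mean(points):
    """Least-squares (L2) average: the ordinary component-wise mean."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]


def l1_median(points, iterations=100, eps=1e-9):
    """L1 average: minimise the sum of Euclidean distances to the points.

    Uses Weiszfeld iterations, i.e. repeated weighted means where each
    point is weighted by the inverse of its distance to the current estimate.
    """
    estimate = l2_mean(points)
    dim = len(estimate)
    for _ in range(iterations):
        # eps guards against division by zero when the estimate hits a point.
        weights = [1.0 / max(math.dist(p, estimate), eps) for p in points]
        total = sum(weights)
        estimate = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ]
    return estimate


# Hypothetical data: four consistent estimates of one position and one
# grossly wrong estimate, standing in for an incorrect pairwise relationship.
truth = [1.0, 2.0, 3.0]
estimates = [
    [1.00, 2.00, 3.00],
    [1.02, 1.98, 3.01],
    [0.98, 2.01, 2.99],
    [1.01, 2.02, 3.00],
    [9.00, -4.00, 7.00],  # outlier
]

mean_error = math.dist(l2_mean(estimates), truth)
median_error = math.dist(l1_median(estimates), truth)
print(f"L2 mean error:   {mean_error:.3f}")
print(f"L1 median error: {median_error:.3f}")
```

In this toy case the single outlier pulls the L2 mean far from the consistent estimates, while the L1 estimate stays close to them. Robustness of this kind is the usual reason for choosing an L1 cost when some input measurements may be wrong.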

Document type: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Senior supervisor: 
Ping Tan
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.