Computing Science - Theses, Dissertations, and other Required Graduate Degree Essays

Autonomy, social agency, and the integration of human and robot environments

Author: 
Date created: 
2019-12-04
Abstract: 

There is a growing wave of new autonomous robots poised for use in human environments, from the self-driving car to the delivery drone. By leaving the confinement of factories and warehouses for city streets and office buildings, the field of robotics is on a collision course with the wider public, and must prepare to contend with the reaction of society as a whole to the integration of human and robot spaces. In the research community, this is leading to a convergence of the fields of Autonomy and Human-Robot Interaction, producing new and emergent issues. This thesis proposes that one such issue is the problem of a robot’s “Social Agency”, whereby navigating among humans necessarily makes robotic agents part of society in the eyes of humans, and so robots must play a social role in order to achieve the acceptance they need to be effective. After grounding this idea within existing theory, we examine the Social Agency proposition over three parts, representing three research projects. In part one, we investigate the potential of an “incidental interface” for human-robot interaction that adapts an existing autonomous, multi-robot system to use audio for inter-robot communication, allowing human co-workers to supervise the robots’ work through casual overhearing. Part two re-implements another autonomous system for human-robot and robot-robot doorway navigation, where a user study finds a link between social acceptance of the robot, accepting the robot’s right of way, and performance. Insights gained from that study are leveraged in part three with a redesigned doorway system that makes proactive, self-confident determinations about right of way, leading to the discovery of a polarizing reaction among participants dubbed “Agency Alienation”. We close with an examination of what this development arc demonstrates about the potential and pitfalls of developing a robot’s Social Agency, and what this may mean for the future of robotics in public spaces.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Richard Vaughan
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Learning deep structured models for visual understanding

Author: 
Date created: 
2019-09-16
Abstract: 

Visual perception is one of the core building blocks of general machine intelligence. Deep learning methods have accelerated progress in this field and advanced the performance of many visual recognition systems. However, there is still a clear gap between humans and machine models in both the understanding of visual data and the complexity of tasks that can be performed. One important observation is that humans seem adept at generalizing from finite data through reasoning, rather than simply memorizing information. This concept can be applied to various domains and leads to further important modeling properties, such as compositional modeling, symbolic methods, and structured models. In this thesis, we mainly contribute to two broad classes of tasks, recognition and generation on visual data, and show progress in modeling and application based on those concepts. First, we propose unified graph-based models that integrate graphs into deep neural networks, and further develop a meta-reasoning model for more flexible inference. Second, we bring symbolic methods into generative modeling and show how the concise and powerful representations of symbolic methods can help in complex scene generation. Lastly, we propose a model that tries to address highly complex graphical distributions with a flow-based method.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Greg Mori
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Cloud-edge collaboration for cost-effective video service provisioning

Author: 
Date created: 
2019-12-10
Abstract: 

Advances in personal computing devices and the prevalence of high-speed Internet access have pushed video streaming services into a new era. A representative example is crowdsourced livecast services, where numerous amateur broadcasters stream live video content to viewers around the world. For video service providers, processing this multimedia content is inherently resource-intensive, time-consuming, and consequently expensive. The demand for low latency to guarantee interactivity in these emerging services further challenges the prevalent cloud-based solutions. In this thesis, we start by revealing the potential for offering cost-effective, low-latency video services at both the cloud and the edge side by analyzing traces collected from real-world applications. We then examine the feasibility of an instance subletting service at the cloud side, where idle cloud resources can be traded. The performance of such a service is examined from both theoretical and practical perspectives. To satisfy the low-latency requirement of emerging interaction-rich video services, we propose a crowd transcoding solution, which relies fully on powerful users to perform transcoding. To further improve the stability of such a distributed computing system, we then propose a cloud-crowd collaborative solution, which combines redundant end viewers with the cloud to perform video processing tasks cost-effectively. Novel probabilistic auction mechanisms are designed to facilitate this solution with desirable economic properties guaranteed.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Jiangchuan Liu
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Design and evaluation of in-context retrieval exercises for informational videos

Author: 
Date created: 
2019-11-25
Abstract: 

Learners increasingly refer to online videos for learning new technical concepts, but often overlook or forget key details. We investigated how retrieval practice, a learning strategy commonly used in education, could be designed to reinforce key concepts in online videos. We began with a formative study to understand users’ perceptions of cued and free-recall retrieval techniques. We next developed REMIND, a new in-context flashcard-based technique that provides expert-curated retrieval exercises in the context of a video’s playback. We evaluated this technique with 14 learners and investigated how learners engage with flashcards that are prompted automatically at pre-defined intervals or flashcards that appear on demand. Overall, our results showed that learners perceived automatically prompted flashcards as requiring less effort and that these flashcards made them feel more confident about grasping key concepts in the video. However, learners found that on-demand flashcards gave them more control over their learning and allowed them to personalize their review of content. Building upon findings from the design and evaluation of REMIND, we developed HYBREID to explore the design space for hybrid retrieval exercise techniques that include the favorable aspects of REMIND’s automatic and on-demand interactions. Our evaluation of this initial hybrid technique provides further implications for designing hybrid retrieval exercise techniques. We discuss the potential for hybrid retrieval techniques in which automatic exercises are combined with on-demand interactions to help learners gain control over their study, and in which community support is leveraged for curating content.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Parmit Chilana
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Machine and deep learning techniques applied to retail telecommunication data

Author: 
Date created: 
2019-11-07
Abstract: 

Telecommunication service providers have franchise dealers to sell their services and products to a wide range of customers. These franchise dealers are small businesses working with a small financial budget and limited human resources for analyzing the performance of the business. There are numerous commercial business intelligence (BI) tools to monitor data and generate business insights. However, most retail entrepreneurs still use manual and/or simple techniques, having little time to dedicate to sophisticated BI tools. In this work, we investigate machine and deep learning techniques to analyze several retail telecommunication business datasets. Specifically, we examine how nearest neighbor techniques, feed forward artificial neural networks, Bayesian classifiers, and support vector machines can be used with retail telecommunication data. Our initial results show precision, recall, and F-measure values of 95% for the classification task, demonstrating that we can categorize retail telecommunication data based on gross profit. We also developed variants of recurrent neural networks (RNN), specifically Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BiLSTM) deep neural network models. Based on our initial results, the developed univariate models achieve a root mean square error of 191 (training) and 281 (testing). A feed forward artificial neural network is applied to perform binary classification, where we obtain an accuracy of 85% when categorizing the dataset based on product type.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Fred Popowich
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Automated Program Hardening via Hoisted Privilege Reductions

Author: 
Date created: 
2019-09-24
Abstract: 

Privilege-based security policies for programs are effective as a first line of defense against attacks. They are able to mitigate broad classes of attacks against programs, potentially saving the costs of searching for and mitigating specific vulnerabilities. Deploying such techniques, however, requires expert knowledge and manual analysis of programs. We propose Passive Privilege Inference and Reducer (PPIR), a technique driven by a novel static analysis that automates the process of inferring the privileges required by a program. We develop a tool that uses this technique to infer the privileges required by a program and instrument it with a security policy to enforce the Principle of Least Privilege. We show that PPIR performs on par with handcrafted security measures while eliminating the manual burden of investigating and inserting privileges. PPIR further enables the potential to progressively reduce privileges as a program executes.

Document type: 
Thesis
File(s): 
Senior supervisor: 
William Sumner
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Distance oracles for planar graphs

Author: 
Date created: 
2019-11-22
Abstract: 

The shortest distance/path problems in planar graphs are among the most fundamental problems in graph algorithms and have numerous new applications in areas such as intelligent transportation systems, which expect a real-time answer to a distance query in large networks. A major new approach to address these challenges is distance oracles, which keep pre-computed distance information in a data structure (called an oracle) and answer a distance query with the assistance of the oracle. The preprocessing time, oracle size, and query time are the major criteria for evaluating such two-phase algorithms. In this thesis, we first briefly review previous work and introduce some preliminary results on exact and approximate distance oracles. Then we present our research contributions, which include improving the preprocessing time of exact distance oracles for planar graphs with small branchwidth and providing the first constant query time (1+\epsilon)-approximate distance oracle with nearly linear size and preprocessing time for planar graphs.
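To make the two-phase idea concrete, here is a minimal Python sketch of a naive exact distance oracle. It is not the method of the thesis: it simply runs Dijkstra's algorithm from every vertex during preprocessing and stores a full table, so its size is quadratic rather than nearly linear. The class name `NaiveDistanceOracle`, the adjacency-list format, and the example graph are assumptions made for illustration.

```python
import heapq


def dijkstra(adj, source):
    """Single-source shortest distances for non-negative edge weights."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


class NaiveDistanceOracle:
    """Hypothetical baseline: quadratic-size table, constant-time lookup."""

    def __init__(self, adj):
        # Phase 1 (preprocessing): one Dijkstra run per vertex.
        self.table = {u: dijkstra(adj, u) for u in adj}

    def query(self, u, v):
        # Phase 2 (query): a dictionary lookup, no graph traversal.
        return self.table[u].get(v, float("inf"))


# Illustrative planar graph: a weighted 4-cycle a-b-c-d.
adj = {
    "a": [("b", 1.0), ("d", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 1.0)],
    "d": [("c", 1.0), ("a", 4.0)],
}
oracle = NaiveDistanceOracle(adj)
print(oracle.query("a", "c"))
```

The sketch shows the three quantities the abstract names as evaluation criteria: preprocessing time (the loop in `__init__`), oracle size (the `table`), and query time (the lookup in `query`).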

Document type: 
Thesis
File(s): 
Senior supervisor: 
Qianping Gu
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Improved deep semantic medical image segmentation

Date created: 
2019-11-27
Abstract: 

The image semantic segmentation challenge consists of classifying each pixel of an image (or just a subset of pixels) into an instance, where each instance (or category) corresponds to an object. This task is part of scene understanding, or better explaining the global context of an image. In the medical image analysis domain, image segmentation can be used for image-guided interventions, radiotherapy, or improved radiological diagnostics. Following a comprehensive review of state-of-the-art deep learning-based medical and non-medical image segmentation solutions, we make the following contributions. A typical deep learning-based (medical) image segmentation pipeline includes designing layers (A), designing an architecture (B), and defining a loss function (C). A clean, modified (D), or adversarially perturbed (E) image is fed into a model (consisting of layers and a loss function) to predict a segmentation mask for scene understanding and similar tasks. In cases where the number of segmentation annotations is limited, weakly supervised approaches (F) are leveraged. For applications where further analysis is needed, e.g., predicting volumes and object burden, the segmentation mask is fed into a post-processing step (G). In this thesis, we tackle each of the steps (A-G).
I) As for steps (A) and (E), we studied the effect of adversarial perturbation on image segmentation models and proposed a method that improves segmentation performance via a non-linear radial basis convolutional feature mapping, learning a Mahalanobis-like distance function on both adversarially perturbed and unperturbed images. Our method then maps the convolutional features onto a linearly well-separated manifold, which prevents small adversarial perturbations from forcing a sample to cross the decision boundary.
II) As for step (B), we propose light, learnable skip connections that first learn to select the most discriminative channels and then aggregate the selected ones into a single channel attending to the most discriminative regions of the input. Compared to heavy classical skip connections, our method reduces computation cost and memory usage while improving segmentation performance.
III) As for step (C), we examined the critical choice of a loss function in order to handle the notorious imbalance problem that plagues both the input and output of a learning model. In order to tackle both types of imbalance during training and inference, we introduce a new curriculum learning-based loss function. Specifically, we leverage the Dice similarity coefficient to deter model parameters from being held at bad local minima and, at the same time, gradually learn better model parameters by penalizing false positives/negatives using a cross-entropy term.
IV) As for step (D), we propose a new segmentation performance-boosting paradigm that relies on optimally modifying the network's input instead of the network itself. In particular, we leverage the gradients of a trained segmentation network with respect to the input to transfer it into a space where the segmentation accuracy improves.
V) As for step (F), we propose a weakly supervised image segmentation model with a learned spatial masking mechanism to filter out irrelevant background signals from attention maps. The proposed method minimizes mutual information between a masked variational representation and the input while maximizing the information between the masked representation and class labels.
VI) Although many semi-automatic segmentation-based methods have been developed, as for step (G), we introduce a method that completely eliminates the segmentation step and directly estimates the volume and activity of the lesions from positron emission tomography scans.
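The loss described for step (C) combines a Dice term with a cross-entropy term under a curriculum. The following Python sketch illustrates one way such a combination could look for a binary mask. The abstract does not give the actual schedule or weighting, so the linear `alpha` schedule and the function names `dice_loss`, `cross_entropy`, and `combined_loss` are assumptions, not the thesis's formulation.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """1 minus the soft Dice coefficient over a flattened binary mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped for stability."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def combined_loss(probs, targets, epoch, total_epochs):
    """Blend the two terms; alpha grows linearly with the epoch.

    The linear schedule is a hypothetical choice made for this sketch.
    """
    alpha = min(1.0, epoch / max(1, total_epochs - 1))
    return (1.0 - alpha) * dice_loss(probs, targets) + alpha * cross_entropy(
        probs, targets
    )


# Illustrative flattened 2x2 mask and two candidate predictions.
targets = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]
bad = [0.2, 0.8, 0.1, 0.9]
print(combined_loss(good, targets, epoch=5, total_epochs=10))
print(combined_loss(bad, targets, epoch=5, total_epochs=10))
```

With this schedule, early epochs are dominated by the overlap-based Dice term, which is less sensitive to class imbalance, and later epochs put more weight on the per-pixel cross-entropy penalty.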

Document type: 
Thesis
File(s): 
Senior supervisor: 
Ghassan Hamarneh
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Robust neural inertial navigation in the wild

Author: 
Date created: 
2019-11-22
Abstract: 

Data-driven inertial navigation is the task of estimating the positions and orientations of a moving subject from a sequence of Inertial Measurement Unit (IMU) sensor measurements. Inertial navigation is a quintessential technology due to the low cost, low energy consumption, and low operating constraints of IMU sensors. However, sensor errors have forced research on inertial navigation to be limited to highly constrained use cases. We leverage the power of machine learning and big data to loosen such constraints and estimate natural human motion in the wild. More concretely, we define our problem as the estimation of the relative horizontal positions and heading direction of a moving subject using IMU sensor measurements from the subject's phone. This research proposes 1) a new benchmark containing more than 40 hours of IMU sensor data from 100 human subjects with ground-truth 3D trajectories under natural human motions; 2) novel neural inertial navigation architectures, making significant improvements for challenging motion cases; and 3) qualitative and quantitative evaluations of the competing methods over three inertial navigation benchmarks. We share the code and data to promote further research on our project website http://ronin.cs.sfu.ca

Document type: 
Thesis
File(s): 
Senior supervisor: 
Yasutaka Furukawa
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

A flying grey ball multi-illuminant image dataset for colour research

Author: 
Date created: 
2019-09-30
Abstract: 

For research in the field of illumination estimation and colour constancy, there is a need for many ground truth measurements of the illumination colour at numerous locations within multi-illuminant scenes. A practical approach to obtaining such ground truth illumination data is presented here. The proposed method involves using a drone to carry a grey ball of known percent surface spectral reflectance throughout a scene while photographing it frequently during the flight using a calibrated camera. The captured images are then post-processed. In the post-processing step, machine vision techniques are used to detect the grey ball in each frame. The colour of the grey ball then provides the illumination colour at that location. In total, the dataset contains 30 scenes with 100 illumination measurements on average per scene. The dataset has been made publicly available for download.
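As a rough illustration of the final post-processing step, the Python sketch below averages the pixels that fall on the grey ball and normalises the result to a chromaticity. It assumes linear RGB values and a ball mask that has already been produced by the detection step; the function name `illuminant_from_grey_ball` and the tiny example image are hypothetical and not taken from the thesis.

```python
def illuminant_from_grey_ball(pixels, mask):
    """Estimate illuminant chromaticity from the pixels on a grey ball.

    pixels: rows of (r, g, b) tuples, assumed to be linear RGB.
    mask:   rows of booleans, True where the pixel lies on the ball
            (assumed to come from a separate detection step).
    """
    sums = [0.0, 0.0, 0.0]
    count = 0
    for pixel_row, mask_row in zip(pixels, mask):
        for (r, g, b), on_ball in zip(pixel_row, mask_row):
            if on_ball:
                sums[0] += r
                sums[1] += g
                sums[2] += b
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mean = [s / count for s in sums]
    total = sum(mean)
    if total == 0:
        raise ValueError("ball pixels are all black")
    # A neutral grey surface reflects all channels equally, so the mean
    # colour is proportional to the illuminant; normalise to sum to 1.
    return tuple(c / total for c in mean)


# Illustrative 2x2 frame in which the top row lies on the ball.
pixels = [
    [(0.4, 0.2, 0.2), (0.6, 0.3, 0.3)],
    [(0.0, 0.9, 0.1), (0.1, 0.1, 0.8)],
]
mask = [
    [True, True],
    [False, False],
]
estimate = illuminant_from_grey_ball(pixels, mask)
print(estimate)
```

Normalising to chromaticity discards overall brightness, which depends on exposure and on the ball's reflectance, and keeps only the relative colour of the light at that location.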

Document type: 
Thesis
File(s): 
Senior supervisor: 
Brian Funt
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.