Computing Science - Theses, Dissertations, and other Required Graduate Degree Essays

User-assisted video reflection removal

Author: 
Date created: 
2020-04-21
Abstract: 

Reflections in videos are obstructions that often occur when videos are taken behind reflective surfaces like glass. These reflections reduce the quality of such videos, lead to information loss, and degrade the accuracy of many computer vision algorithms. A video containing reflections is a combination of background and reflection layers. Thus, reflection removal is equivalent to decomposing the video into two layers. This problem is ill-posed as there is an infinite number of valid decompositions. To address this problem, we propose a user-assisted approach for video reflection removal. We rely on both spatial and temporal information and utilize sparse user hints to help improve separation. The key idea of the proposed method is to use motion cues to separate the background layer from the reflection layer with minimal user assistance. We show that user assistance significantly improves the layer separation results. We implement and validate the proposed method through quantitative and qualitative results on real and synthetic videos. Our experiments show that the proposed method successfully removes reflections from video sequences, does not introduce visual distortions, and significantly outperforms the state-of-the-art reflection removal methods in the literature.
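
The ill-posedness mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple additive mixing model (observed = background + reflection) on integer pixel intensities; the abstract does not state the thesis's actual image formation model, and the function names and pixel values below are hypothetical.

```python
# Assumption: additive mixing, observed = background + reflection.
# The thesis may use a different model; this only shows the ambiguity.

def reflection_for(observed, background):
    """Return the reflection layer implied by a chosen background layer."""
    return [o - b for o, b in zip(observed, background)]

def recombine(background, reflection):
    """Mix two layers back into a single observed signal."""
    return [b + r for b, r in zip(background, reflection)]

# Hypothetical row of three observed pixel intensities.
observed = [230, 150, 180]

# Two different candidate backgrounds, both consistent with `observed`.
candidate_a = [200, 100, 90]
candidate_b = [120, 140, 170]

for background in (candidate_a, candidate_b):
    reflection = reflection_for(observed, background)
    print(background, reflection, recombine(background, reflection) == observed)
```

Both candidates reproduce the observed pixels exactly, so the observation alone cannot single out one decomposition. Extra cues are needed to do that, such as the motion information and user hints the abstract describes.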

Document type: 
Thesis
File(s): 
Senior supervisor: 
Mohamed Hefeeda
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

MCS2: Minimal coordinated supports for fast enumeration of minimal cut sets in metabolic networks

Date created: 
2019-08-21
Abstract: 

Constraint-based modeling of metabolic networks helps researchers gain insight into the metabolic processes of many organisms, both prokaryotic and eukaryotic. Minimal Cut Sets (MCSs) are minimal sets of reactions whose inhibition blocks a target reaction in a metabolic network. Most approaches for finding the MCSs in constraint-based models require, either as an intermediate step or as a byproduct of the calculation, the computation of the set of elementary flux modes (EFMs), a convex basis for the valid flux vectors in the network. Recently, Ballerstein et al. proposed a method for computing the MCSs of a network without first computing its EFMs, by creating a dual network whose EFMs are a superset of the MCSs of the original network. However, their dual network is always larger than the original network and depends on the target reaction. Here we propose the construction of a different dual network, which is typically smaller than the original network and is independent of the target reaction, for the same purpose. We prove the correctness of our approach, MCS2, and describe how it can be modified to compute the few smallest MCSs for a given target reaction.
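
To make the MCS definition concrete, the sketch below uses the classical EFM-based route that the abstract contrasts with: MCSs are taken as the minimal sets of reactions that hit every EFM passing through the target. It is not the MCS2 dual-network method. The function name, the brute-force search, and the toy EFM supports are all hypothetical and only practical for very small examples.

```python
from itertools import combinations

def minimal_cut_sets(efm_supports, target, max_size=None):
    """Brute-force MCS enumeration as minimal hitting sets of target EFMs."""
    # Keep only the modes that carry flux through the target reaction.
    target_modes = [frozenset(m) for m in efm_supports if target in m]
    reactions = sorted(set().union(*target_modes)) if target_modes else []
    limit = len(reactions) if max_size is None else max_size
    found = []
    for size in range(1, limit + 1):
        for candidate in combinations(reactions, size):
            cut = frozenset(candidate)
            # Supersets of an already-found cut set are not minimal.
            if any(prev <= cut for prev in found):
                continue
            # A cut set must disable every mode that uses the target.
            if all(cut & mode for mode in target_modes):
                found.append(cut)
    return [sorted(c) for c in found]

# Hypothetical toy network: three EFMs given by their reaction supports.
efms = [{"R1", "R2", "T"}, {"R1", "R3", "T"}, {"R4", "R5"}]
print(minimal_cut_sets(efms, "T"))
# → [['R1'], ['T'], ['R2', 'R3']]
```

The result includes the trivial cut set consisting of the target itself. The optional `max_size` bound restricts the search to the smallest cut sets, loosely mirroring the "few smallest MCSs" use case in the abstract.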

Document type: 
Thesis
File(s): 
Senior supervisor: 
Leonid Chindelevitch
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Genotyping and copy number analysis of immunoglobin heavy chain variable genes using long reads

Author: 
Date created: 
2019-08-21
Abstract: 

One of the remaining challenges to describing an individual's genetic variation lies in the highly heterogeneous and complex genomic regions which impede the use of classical reference-guided mapping and assembly approaches. One such region is the immunoglobulin heavy chain locus (IGH), which is critical for the development of antibodies and the immune system. Presented is ImmunoTyper, the first PacBio-based genotyping and copy-number calling tool specifically designed for IGH V genes (IGHV). ImmunoTyper's multi-stage clustering and combinatorial optimization approach is demonstrated to be the most comprehensive IGHV genotyping approach published to date, through validation using gold-standard IGH reference sequence. This preliminary work establishes the feasibility of fine-grained genotype and copy number analysis using error-prone long reads in complex multi-gene loci, and opens the door for in-depth investigation into IGHV heterogeneity using accessible and increasingly common whole genome sequence

Document type: 
Thesis
File(s): 
Senior supervisor: 
Maxwell Libbrecht
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Face style transfer and removal with generative adversarial network

Author: 
Date created: 
2020-05-04
Abstract: 

Style transfer plays a vital role in image manipulation and creates new artistic works in different artistic styles from existing photographs. While style transfer has been widely studied, recovering photo-realistic images from corresponding artistic works has not been fully investigated. Moreover, all previous work considers style transfer and removal as separate problems. In this thesis, we present a method that simultaneously transfers the style of a stylized face to a different, non-stylized face and recovers a photo-realistic face from the same stylized face image. Here, style refers to the local patterns or textures of the stylized images. Style transfer offers a new avenue for artistic creation, while style removal can be beneficial for face verification, photo-realistic content editing, or facial analysis. Our approach contains two components: the Style Transfer Network (STN) and the Style Removal Network (SRN). The STN renders the style of the stylized image onto the non-stylized image, and the SRN is designed to remove the style of a stylized photo. By applying the two networks successively to an original input photo, the output should match the input photo. Experimental results on a variety of portraits and styles demonstrate our approach's effectiveness.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Ze-Nian Li
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

WiFi-based activity recognition with deep learning

Author: 
Date created: 
2020-05-26
Abstract: 

Human activity recognition has drawn increasing attention in recent years in both academia and industry due to its potential to support a broad range of Internet of Things (IoT) applications, such as health diagnosis, human-machine interaction, and safety surveillance. Among the many sensing technologies, e.g., cameras, wearable sensors, and RFIDs, WiFi-based activity recognition is of particular interest given its ubiquity, low cost, device-free experience, and low dependence. Generally, people's motions affect the reflected WiFi signals and produce specific radio patterns. By profiling these patterns, we are able to recognize the original activities. Many existing works have reported relatively good activity recognition performance in dedicated scenarios; yet their performance degrades considerably in complex practical applications with various impact factors, such as co-channel interference, spatial diversity, and diverse environments, leaving existing WiFi-based solutions far from satisfactory. In this thesis, we aim to address the key existing challenges and develop accurate, reliable, and adaptive WiFi-based human activity recognition systems. We argue that integrating advanced deep learning techniques into activity recognition brings new opportunities towards our goal. To this end, we first propose CSAR, a channel selective activity recognition framework that overcomes the channel quality problem by active channel hopping and channel combination. We then develop WiSDAR, which constructs multiple separated antenna pairs and obtains features from multiple spatial dimensions to solve the spatial diversity problem. Finally, we investigate activity recognition in a more compact in-car scenario and present WiCAR, a WiFi-based in-car activity recognition framework that leverages domain adaptation to remove the environment-specific information in the received signals while retaining the activity-related features for adaptive recognition. We have conducted extensive evaluations, and the performance results demonstrate the superiority of our frameworks over the state-of-the-art solutions.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Jiangchuan Liu
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Learning shape-to-shape transformation

Author: 
Date created: 
2020-05-14
Abstract: 

Many problems in computer graphics and geometric modeling, e.g., skeletonization, surface completion, and shape style transfer, can be posed as a problem of shape-to-shape transformation. In this thesis, we are interested in learning general-purpose shape transforms, e.g., between 3D objects and their skeletons, between chairs and tables, and between letters of two different font styles. With a point-based shape representation, we explore the problem of learning general-purpose shape-to-shape transformation under two different settings: (i) with shape-level supervision and (ii) unsupervised. We present P2P-NET, a deep neural network for learning shape transforms under shape-level supervision. It is trained on paired shapes from the source and target domains, but without relying on point-to-point correspondences between the source and target point sets (i.e., point-level supervision). The architecture of P2P-NET is that of a bi-directional point displacement network, which transforms a source point set to a prediction of the target point set with the same cardinality, and vice versa, by applying point-wise displacement vectors learned from data. For the unsupervised setting, we introduce LOGAN, a deep neural network aimed at learning general-purpose shape transforms from unpaired shape domains. It consists of an autoencoder to encode shapes from the two input domains into a common latent space, where the latent codes are overcomplete representations for shapes. The translator is based on a generative adversarial network (GAN), operating in the latent space, where an adversarial loss enforces cross-domain translation while a feature preservation loss ensures that the right shape features are preserved for a natural shape transform. We conduct ablation studies to validate each of our key designs and demonstrate superior capabilities in shape transforms on a variety of examples over baselines and state-of-the-art approaches. Several different applications enabled by our general-purpose shape transform solutions are presented to highlight the effectiveness, versatility, and potential of our networks in solving a variety of shape-to-shape transformation problems.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Hao (Richard) Zhang
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Computational methods for analysis of single molecule sequencing data

Author: 
Date created: 
2020-03-26
Abstract: 

Next-generation sequencing (NGS) technologies paved the way to a significant increase in the number of sequenced genomes, both prokaryotic and eukaryotic. This increase provided an opportunity for considerable advancement in genomics and precision medicine. Although NGS technologies have proven their power in many applications such as de novo genome assembly and variation discovery, computational analysis of the data they generate is still far from perfect. The main limitation of NGS technologies is their short read length relative to the lengths of (common) genomic repeats. Today, newer sequencing technologies (known as single-molecule sequencing or SMS) such as Pacific Biosciences and Oxford Nanopore are producing significantly longer reads, making it theoretically possible to overcome the difficulties imposed by repeat regions. For instance, for the first time, a complete human chromosome was fully assembled using ultra-long reads generated by Oxford Nanopore. Unfortunately, long reads generated by SMS technologies are characterized by a high error rate, which prevents their direct utilization in many of the standard downstream analysis pipelines and poses new computational challenges. This motivates the development of new computational tools specifically designed for SMS long reads. In this thesis, we present three computational methods that are tailored for SMS long reads. First, we present lordFAST, a fast and sensitive tool for mapping noisy long reads to a reference genome. Mapping sequenced reads to their potential genomic origin is the first fundamental step for many computational biology tasks. As an example, in this thesis, we show that lordFAST can be successfully employed in structural variation discovery. Next, we present the second tool, CoLoRMap, which tackles the high level of base-level errors in SMS long reads by providing a means to correct them using a complementary set of NGS short reads. This integrative use of SMS and NGS data is known as a hybrid technique. Finally, we introduce HASLR, an ultra-fast hybrid assembler that uses reads generated by both technologies to efficiently generate accurate genome assemblies. We demonstrate that HASLR is not only the fastest assembler but also the one with the lowest number of misassemblies on all the samples compared to other tested assemblers. Furthermore, the generated assemblies are on par with the other tools in terms of contiguity and accuracy on most of the samples.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Binay Bhattacharya
S. Cenk Sahinalp; Cedric Chauve; Faraz Hach
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.

Deconstructing supertagging into multi-task sequence prediction

Author: 
Date created: 
2020-04-07
Abstract: 

Supertagging is a sequence prediction task where each word is assigned a complex syntactic structure called a supertag. In this thesis, we propose a novel multi-task learning approach for Tree Adjoining Grammar (TAG) supertagging by deconstructing these complex supertags into a set of related but auxiliary sequence prediction tasks, which can best represent the structural information of each supertag. Our multi-task prediction framework is trained over the same training data used to train the original supertagger, where each auxiliary task provides an alternative view of the original prediction task. Our experimental results show that our multi-task approach significantly improves TAG supertagging with a new state-of-the-art accuracy score of 91.39% on the Penn treebank supertagging dataset. We also show consistent improvement of around 0.4% in tagging accuracy by applying our multi-task prediction framework to various neural supertagging models without using any additional data resources.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Anoop Sarkar
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Implementing Belnap-logical conflation and implication operators in answer set programming

Author: 
Date created: 
2020-03-19
Abstract: 

Two types of negation are allowed in answer set programming (ASP), default negation and classical negation. When using two-valued logic as its basis, the presence of classical negation in ASP can lead to gluts (both true and false) and gaps (neither true nor false), which are handled in unintuitive ways. Belnap’s four-valued logic, with gluts and gaps as truth values, is a more intuitive basis for ASP. This thesis examines the intuition behind Belnap logic, showing that the conflation operator, which has no obvious intuitive meaning, is central to the representation of default negation in Belnap logic. There is no single correct implication operator in Belnap logic that can be used in ASP rules, so we examine a number of different implication operators in Belnap logic, before presenting a new implication operator that generalizes them and showing how this implication operator can be implemented in ASP without changing its specifications.
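
The four truth values and the two unary operators mentioned in the abstract can be sketched as follows. The encoding of each value as a pair (evidence for, evidence against) is a common textbook presentation of Belnap logic and is an assumption here, not necessarily the formulation used in the thesis. The names `Belnap`, `negation`, and `conflation` are hypothetical.

```python
from enum import Enum

class Belnap(Enum):
    # Each value is a pair (evidence_for, evidence_against).
    TRUE = (1, 0)
    FALSE = (0, 1)
    BOTH = (1, 1)      # glut: both true and false
    NEITHER = (0, 0)   # gap: neither true nor false

def negation(v):
    """Swap evidence for and evidence against."""
    evidence_for, evidence_against = v.value
    return Belnap((evidence_against, evidence_for))

def conflation(v):
    """Complement both components, then swap them."""
    evidence_for, evidence_against = v.value
    return Belnap((1 - evidence_against, 1 - evidence_for))

for v in Belnap:
    print(v.name, negation(v).name, conflation(v).name)
```

Under this encoding, negation exchanges TRUE and FALSE and leaves gluts and gaps fixed, while conflation exchanges BOTH and NEITHER and leaves TRUE and FALSE fixed. How the thesis uses conflation to represent default negation is not specified in the abstract and is not modeled here.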

Document type: 
Thesis
File(s): 
Senior supervisor: 
James Delgrande
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.

Unsupervised continuous feature annotation of the human genome

Date created: 
2020-04-14
Abstract: 

Genome annotation methods are widely used to understand the function of the genome. For example, they can be used to identify the activity of a genomic position that is associated with a disease. Existing genome annotation methods produce discrete annotations that assign a single label to each genomic position. However, these discrete annotation methods have several limitations. For example, these methods cannot easily represent varying strengths of genomic elements, and they cannot easily represent combinatorial elements that simultaneously exhibit multiple types of activity. To remedy these limitations, an annotation strategy is proposed that instead outputs a vector of chromatin state features at each position. A method, epigenome-ssm, is also proposed to annotate the genome with chromatin state features. It is shown that chromatin state features from epigenome-ssm are more useful for several downstream applications than both continuous and discrete alternatives, including their ability to identify expressed genes and enhancers.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Kay C Wiese
Maxwell Libbrecht
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.