Computing Science, School of


Uncovering the Subtype-Specific Temporal Order of Cancer Pathway Dysregulation

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-11-11
Abstract: 

Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and their dysregulation order is key to understanding and treating cancer. However, the heterogeneity of mutations between different individuals makes this challenging and requires that cancer progression be studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called the Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes, pathways, and the order of pathway dysregulation within each subtype. Experiments with synthetic data indicate that SPM is more robust than an existing method to problem specifics, including noise. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show the consistency of SPM’s results with existing knowledge and its superiority to an existing method in certain cases. The implementation of our method is available at https://github.com/Dalton386/SPM.

Document type: 
Article

Datasets for Face and Object Detection in Fisheye Images

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-11-02
Abstract: 

We present two new fisheye image datasets for training object and face detection models: VOC-360 and Wider-360. The fisheye images are created by post-processing regular images collected from two well-known datasets, VOC2012 and Wider Face, using a model for mapping regular to fisheye images implemented in Matlab. VOC-360 contains 39,575 fisheye images for object detection, segmentation, and classification. Wider-360 contains 63,897 fisheye images for face detection. These datasets will be useful for developing face and object detectors as well as segmentation modules for fisheye images while the efforts to collect and manually annotate true fisheye images are underway.

Document type: 
Article

Caveolae and Scaffold Detection from Single Molecule Localization Microscopy Data Using Deep Learning

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-08-26
Abstract: 

Caveolae are plasma membrane invaginations whose formation requires caveolin-1 (Cav1) and the adaptor protein polymerase I and transcript release factor (PTRF or CAVIN1). Caveolae have an important role in cell functioning, signaling, and disease. In the absence of CAVIN1/PTRF, Cav1 forms non-caveolar membrane domains called scaffolds. In this work, we train machine learning models to automatically distinguish between caveolae and scaffolds from single molecule localization microscopy (SMLM) data. Our work is the first to leverage machine learning approaches (including deep learning models) to automatically identify biological structures from SMLM data. In particular, we develop and compare three binary classification methods to identify whether or not a given 3D cluster of Cav1 proteins is a caveola. The first uses a random forest classifier applied to 28 hand-designed features, the second uses a convolutional neural net (CNN) applied to a projection of the point clouds onto three planes, and the third uses a PointNet model, a recent development that can directly take point clouds as its input. We validate our methods on a dataset of super-resolution microscopy images of PC3 prostate cancer cells labeled for Cav1. Specifically, we have images from two cell populations: 10 PC3 and 10 CAVIN1/PTRF-transfected PC3 cells (PC3-PTRF cells) that form caveolae. We obtained a balanced set of 1714 different cellular structures. Our results show that both the random forest on hand-designed features and the deep learning approach achieve high accuracy in distinguishing the intrinsic features of the caveolae and non-caveolae biological structures. More specifically, both the random forest and deep CNN classifiers achieve classification accuracy reaching 94% on our test set, while the PointNet model only reached 83% accuracy. We also discuss the pros and cons of the different approaches.

Document type: 
Article

A Multi-Labeled Tree Dissimilarity Measure for Comparing “Clonal Trees” of Tumor Progression

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-07-27
Abstract: 

We introduce a new dissimilarity measure between a pair of “clonal trees”, each representing the progression and mutational heterogeneity of a tumor sample, constructed by the use of single cell or bulk high throughput sequencing data. In a clonal tree, each vertex represents a specific tumor clone, and is labeled with one or more mutations in a way that each mutation is assigned to the oldest clone that harbors it. Given two clonal trees, our multi-labeled tree dissimilarity (MLTD) measure is defined as the minimum number of mutation/label deletions, (empty) leaf deletions, and vertex (clonal) expansions, applied in any order, to convert each of the two trees to the maximum common tree. We show that the MLTD measure can be computed efficiently in polynomial time and it captures the similarity between trees of different clonal granularity well.

Document type: 
Article

A Cubic Algorithm for the Generalized Rank Median of Three Genomes

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-09-08
Abstract: 

The area of genome rearrangements has given rise to a number of interesting biological, mathematical and algorithmic problems. Among these, one of the most intractable ones has been that of finding the median of three genomes, a special case of the ancestral reconstruction problem. In this work we re-examine our recently proposed way of measuring genome rearrangement distance, namely, the rank distance between the matrix representations of the corresponding genomes, and show that the median of three genomes can be computed exactly in polynomial time $O(n^{\omega})$, where $\omega \le 3$, with respect to this distance, when the median is allowed to be an arbitrary orthogonal matrix.

We define the five fundamental subspaces depending on three input genomes, and use their properties to show that a particular action on each of these subspaces produces a median. In the process we introduce the notion of M-stable subspaces. We also show that the median found by our algorithm is always orthogonal, symmetric, and conserves any adjacencies or telomeres present in at least 2 out of 3 input genomes.

We test our method on both simulated and real data. We find that the majority of the realistic inputs result in genomic outputs, and for those that do not, our two heuristics perform well in terms of reconstructing a genomic matrix attaining a score close to the lower bound, while running in a reasonable amount of time. We conclude that the rank distance is not only theoretically intriguing, but also practically useful for median-finding, and potentially ancestral genome reconstruction.
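
As a small illustration of the rank distance mentioned in this abstract, the sketch below assumes the distance between two genome matrices A and B is rank(A - B), and uses toy symmetric permutation matrices as stand-ins for genomes. The function names and the example matrices are hypothetical, and this is not the paper's median algorithm, only the distance it is measured against.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def rank_distance(a, b):
    """Assumed definition: d(A, B) = rank(A - B)."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return matrix_rank(diff)


# Toy 4x4 symmetric permutation matrices (hypothetical example genomes).
A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
B = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
C = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

print(rank_distance(A, B), rank_distance(B, C), rank_distance(A, C))
```

In this toy case B lies "between" A and C, in the sense that the distances from A to B and from B to C add up to the distance from A to C.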

Document type: 
Article

jViz.RNA 4.0—Visualizing Pseudoknots and RNA Editing Employing Compressed Tree Graphs

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-05-06
Abstract: 

Previously, we introduced an improved version of jViz.RNA which enabled faster and more stable RNA visualization by employing compressed tree graphs. However, the new RNA representation and visualization method required a sophisticated mechanism of pseudoknot visualization. In this work, we present our novel pseudoknot classification and implementation of pseudoknot visualization in the context of the new RNA graph model. We then compare our approach with other RNA visualization software, and demonstrate jViz.RNA 4.0’s benefits compared to other software. Additionally, we introduce interactive editing functionality into jViz.RNA and demonstrate its benefits in exploring and building RNA structures. The results presented highlight the high degree of utility jViz.RNA 4.0 now offers. Users are now able to visualize pseudoknotted RNA, manipulate the resulting automatic layouts to suit their individual needs, and change both positioning and connectivity of the RNA molecules examined. Care was taken to limit overlap between structural elements, particularly in the case of pseudoknots, to ensure an intuitive and informative layout of the final RNA structure.

Document type: 
Article

Integrative Inference of Subclonal Tumour Evolution from Single-Cell and Bulk Sequencing Data

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-06-21
Abstract: 

Understanding the clonal architecture and evolutionary history of a tumour is one of the key challenges in overcoming treatment failure due to resistant cell populations. Previously, studies on subclonal tumour evolution have been primarily based on bulk sequencing and in some recent cases on single-cell sequencing data. Either data type alone has shortcomings with regard to this task, but methods integrating both data types have been lacking. Here, we present B-SCITE, the first computational approach that infers tumour phylogenies from combined single-cell and bulk sequencing data. Using a comprehensive set of simulated data, we show that B-SCITE systematically outperforms existing methods with respect to tree reconstruction accuracy and subclone identification. B-SCITE provides high-fidelity reconstructions even with a modest number of single cells and in cases where bulk allele frequencies are affected by copy number changes. On real tumour data, B-SCITE-generated mutation histories show high concordance with expert-generated trees.

Document type: 
Article

Colourization of Dichromatic Images

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-09
Abstract: 

This paper explores the colour information dichromatic vision provides in terms of its potential for colourization. Given a greyscale image as input, colourization generates an RGB image as output. Since colourization works well for luminance images, how well might it work for dichromatic images? Dichromatic images are colourized using a modification of the colourization method of Iizuka et al. (Proc. SIGGRAPH 2016, 35(4):110:1-110:11). In particular, an sRGB image is converted to cone LMS and M is discarded to yield an LS image. During training, the colourization neural network is provided with LS images and their corresponding LMS images, and it adjusts its weights so that M is predicted from L and S. One does not easily recognize that a colourized dichromatic image is, in fact, based on only L and S, and is not a regular full-colour image. This is in stark contrast to the dichromatic simulations of Brettel et al. (Brettel, Viénot, Mollon, JOSA A 14, 2647-2655, 1997).

Document type: 
Conference presentation

Colour Discrimination Ellipses Explained by Metamer Mismatching

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-09
Abstract: 

Many psychophysical experiments have shown that colour discrimination varies substantially with the region of colour space in which the colours reside. Many models of the experimental data have been proposed, and many uniform colour spaces have been developed that attempt to represent colour in a coordinate system such that equally discriminable colours are equal distances apart, but all of them are based on fits to the experimental data. Many provide good fits to the data, but they remain data models and do not explain why colour discrimination varies in the way it does. In contrast, this paper outlines a theory of colour discrimination based on the uncertainties reflected in the extent of metamer mismatching. The greater its extent, the more finely a colour needs to be discriminated.

Document type: 
Conference presentation

Spectral Gamut Mapping and Gamut Concavity

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2007-11
Abstract: 

A spectral gamut-mapping algorithm is introduced that works well for printers with a large number of inks. It finds the best mapping onto the convex hull of the printer spectral gamut while preserving color defined in CIE XYZ as much as possible. The technique employs a non-negative least-squares fit. Since the gamut-mapping algorithm depends on the common assumption that the gamut is convex, an experimental study of the degree of gamut concavity is conducted. It finds that there is a significant amount of concavity, and that the degree does not appear to change much as the number of inks is increased. Finally, the performance of the gamut-mapping algorithm and the gamut coverage in spectral space are compared for 3-, 4-, 5- and 6-ink printers using both synthetic ink models and real ink data.
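
To make the non-negative least-squares ingredient concrete, here is a minimal sketch that fits non-negative weights x so that a x approximates a target spectrum b. It uses plain projected gradient descent, which is a swapped-in generic technique rather than the solver or the full gamut-mapping procedure from the paper. The function name, the two-wavelength "ink" matrix, and the target values are all made-up assumptions for illustration.

```python
def nnls_projected_gradient(a, b, iterations=500):
    """Approximately minimise ||a x - b||^2 subject to x >= 0."""
    n_rows, n_cols = len(a), len(a[0])
    # The squared Frobenius norm bounds the largest eigenvalue of a^T a,
    # so 1 / frob_sq is a safe (if conservative) step size.
    frob_sq = sum(v * v for row in a for v in row)
    step = 1.0 / frob_sq
    x = [0.0] * n_cols
    for _ in range(iterations):
        # residual = a x - b
        residual = [
            sum(a[i][j] * x[j] for j in range(n_cols)) - b[i]
            for i in range(n_rows)
        ]
        # gradient of 0.5 * ||a x - b||^2 is a^T residual
        grad = [
            sum(a[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_cols)]
    return x


# Hypothetical example: two "inks" (columns) sampled at two wavelengths (rows).
inks = [[1.0, 1.0],
        [0.0, 1.0]]
target = [0.0, 1.0]

weights = nnls_projected_gradient(inks, target)
print(weights)
```

An unconstrained least-squares fit of this toy target would need a negative amount of the first ink; the non-negativity constraint instead sets that weight to zero and settles on the best achievable amount of the second ink.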

Document type: 
Conference presentation