Mathematics, Department of

Receive updates for this collection

The Distance and Median Problems in the Single-Cut-Or-Join Model with Single-Gene Duplications

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2020-05-04
Abstract: 

Background.

In the field of genome rearrangement algorithms, models accounting for gene duplication lead often to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model.

Results.

We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendant are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data.

Conclusion.

Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems allow to solve efficiently these problems on large instances.

Document type: 
Article
File(s): 

A Fast Integral Equation Method for the Two-dimensional Navier-Stokes Equations

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2020-02-24
Abstract: 

The integral equation approach to partial differential equations (PDEs) provides significant advantages in the numerical solution of the incompressible Navier-Stokes equations. In particular, the divergence-free condition and boundary conditions are handled naturally, and the ill-conditioning caused by high order terms in the PDE is preconditioned analytically. Despite these advantages, the adoption of integral equation methods has been slow due to a number of difficulties in their implementation. This work describes a complete integral equation-based flow solver that builds on recently developed methods for singular quadrature and the solution of PDEs on complex domains, in combination with several more well-established numerical methods. We apply this solver to flow problems on a number of geometries, both simple and challenging, studying its convergence properties and computational performance. This serves as a demonstration that it is now relatively straightforward to develop a robust, efficient, and flexible Navier-Stokes solver, using integral equation methods.

Document type: 
Article

Breaking the Coherence Barrier: A New Theory for Compressed Sensing

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2017-02-15
Abstract: 

This paper presents a framework for compressed sensing that bridges a gap between existing theory and the current use of compressed sensing in many real-world applications. In doing so, it also introduces a new sampling method that yields substantially improved recovery over existing techniques. In many applications of compressed sensing, including medical imaging, the standard principles of incoherence and sparsity are lacking. Whilst compressed sensing is often used successfully in such applications, it is done largely without mathematical explanation. The framework introduced in this paper provides such a justification. It does so by replacing these standard principles with three more general concepts: asymptotic sparsity, asymptotic incoherence and multilevel random subsampling. Moreover, not only does this work provide such a theoretical justification, it explains several key phenomena witnessed in practice. In particular, and unlike the standard theory, this work demonstrates the dependence of optimal sampling strategies on both the incoherence structure of the sampling operator and on the structure of the signal to be recovered. Another key consequence of this framework is the introduction of a new structured sampling method that exploits these phenomena to achieve significant improvements over current state-of-the-art techniques.

Document type: 
Article
File(s): 

Social and Structural Factors Associated With Substance Use within the Support Network of Adults Living In Precarious Housing in A Socially Marginalized Neighborhood of Vancouver, Canada

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-09-23
Abstract: 

Background The structure of a social network as well as peer behaviours are thought to affect personal substance use. Where substance use may create health risks, understanding the contribution of social networks to substance use may be valuable for the design and implementation of harm reduction or other interventions. We examined the social support network of people living in precarious housing in a socially marginalized neighborhood of Vancouver, and analysed associations between social network structure, personal substance use, and supporters’ substance use.

Methods An ongoing, longitudinal study recruited 246 participants from four single room occupancy hotels, with 201 providing social network information aligned with a 6-month observation period. Use of tobacco, alcohol, cannabis, cocaine (crack and powder), methamphetamine, and heroin was recorded at monthly visits. Ego- and graph-level measures were calculated; the dispersion and prevalence of substances in the network was described. Logistic mixed effects models were used to estimate the association between ego substance use and peer substance use. Permutation analysis was done to test for randomness of substance use dispersion on the social network.

Results The network topology corresponded to residence (Hotel) with two clusters differing in demographic characteristics (Cluster 1 –Hotel A: 94% of members, Cluster 2 –Hotel B: 95% of members). Dispersion of substance use across the network demonstrated differences according to network topology and specific substance. Methamphetamine use (overall 12%) was almost entirely limited to Cluster 1, and absent from Cluster 2. Different patterns were observed for other substances. Overall, ego substance use did not differ over the six-month period of observation. Ego heroin, cannabis, or crack cocaine use was associated with alter use of the same substances. Ego methamphetamine, powder cocaine, or alcohol use was not associated with alter use, with the exception for methamphetamine in a densely using part of the network. For alters using multiple substances, cannabis use was associated with lower ego heroin use, and lower ego crack cocaine use. Permutation analysis also provided evidence that dispersion of substance use, and the association between ego and alter use was not random for all substances.

Conclusions In a socially marginalized neighborhood, social network topology was strongly influenced by residence, and in turn was associated with type(s) of substance use. Associations between personal use and supporter’s use of a substance differed across substances. These complex associations may merit consideration in the design of interventions to reduce risk and harms associated with substance use in people living in precarious housing.

Document type: 
Article
File(s): 

A Parallel Non-Uniform Fast Fourier Transform Library Based on an "Exponential of Semicircle" Kernel

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2019-09-19
Abstract: 

The nonuniform fast Fourier transform (NUFFT) generalizes the FFT to off-grid data. Its many applications include image reconstruction, data analysis, and the numerical solution of differential equations. We present FINUFFT, an efficient parallel library for type 1 (nonuniform to uniform), type 2 (uniform to nonuniform), or type 3 (nonuniform to nonuniform) transforms, in dimensions 1, 2, or 3. It uses minimal RAM, requires no precomputation or plan steps, and has a simple interface to several languages. We perform the expensive spreading/interpolation between nonuniform points and the fine grid via a simple new kernel---the ``exponential of semicircle"" e \beta \surd 1 - x2 in x \in [ - 1, 1]---in a cache-aware load-balanced multithreaded implementation. The deconvolution step requires the Fourier transform of the kernel, for which we propose efficient numerical quadrature. For types 1 and 2, rigorous error bounds asymptotic in the kernel width approach the fastest known exponential rate, namely that of the Kaiser--Bessel kernel. We benchmark against several popular CPU-based libraries, showing favorable speed and memory footprint, especially in three dimensions when high accuracy and/or clustered point distributions are desired.

Document type: 
Article
File(s): 

A Crime Aggregation Model on Street Networks (CAMOSNet)

Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2019-09-12
Abstract: 

In this article, we develop a mathematical framework to model crime and crime concentrations on a city road network. The model proposed is an advancement to similar frameworks inspired by a model introduced by Short et al. (2008). A significant modification introduced in our model is the use of spectral graph theory to represent the road network and to simulate diffusion throughout the network. The techniques discussed are tested in a simulation model of crime applied to the city of Vancouver, BC, Canada. The simulations presented are based off of empirical data of crime in Vancouver along with its street network. Results of the simulations present crime patterns that are consistent with crime patterns observed in the city.

Document type: 
Technical Report
File(s): 

The Rooted SCJ Median with Single Gene Duplications

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-10-01
Abstract: 

The median problem is a classical problem in genome rearrangements. It aims to compute a gene order that minimizes the sum of the genomic distances to  k>=3  given gene orders. This problem is intractable except in the related Single-Cut-or-Join and breakpoint rearrangement models. Here we consider the rooted median problem, where we assume one of the given genomes to be ancestral to the median, which is itself ancestral to the other genomes. We show that in the Single-Cut-or-Join model with single gene duplications, the rooted median problem is NP-hard. We also describe an Integer Linear Program for solving this problem, which we apply to simulated data, showing high accuracy of the reconstructed medians.

Document type: 
Book chapter
File(s): 

Counting, Generating, Analyzing and Sampling Tree Alignments

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2018-03-03
Abstract: 

Pairwise ordered tree alignment are combinatorial objects that appear unimportant applications, such as RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by mean of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees SS and TT. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door for a probabilistic analysis of the space of suboptimal alignments.

Document type: 
Article
File(s): 

Detecting Introgression in Anopheles Mosquito Genomes using a Reconciliation-Based Approach

Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2018-08-03
Abstract: 

Introgression is an important evolutionary mechanism in insects and animals evolution. Current methods for detecting introgression rely on the analysis of phylogenetic incongruence, using either statistical tests based on expected phylogenetic patterns in small phylogenies or probabilistic modeling in a phylogenetic network context. Introgression leaves a phylogenetic signal similar to horizontal gene transfer, and it has been suggested that its detection can also be approached through the gene tree / species tree reconciliation framework, which accounts jointly for other evolutionary mechanisms such as gene duplication and gene loss. However so far the use of a reconciliation-based approach to detect introgression has not been investigated in large datasets. In this work, we apply this principle to a large dataset of Anopheles mosquito genomes. Our reconciliation-based approach recovers the extensive introgression that occurs in the gambiae complex, although with some variations compared to previous reports. Our analysis also suggests a possible ancient introgression event involving the ancestor of An. christyi.

Document type: 
Conference presentation
File(s): 

Clustered Maximum Weight Clique Problem Instances - Data set

Peer reviewed: 
Yes, item is peer reviewed.
Date created: 
2017-03
Document type: 
Dataset