On the Rank-Distance Median of 3 Permutations

Peer reviewed: 
Yes, item is peer reviewed.
Scholarly level: 
Faculty/Staff
Final version published as: 

Chindelevitch, L., Pereira Zanetti, J. & Meidanis, J. On the rank-distance median of 3 permutations. BMC Bioinformatics 19, 142 (2018). DOI: 10.1186/s12859-018-2131-4.

Date created: 
2018-05-08
Keywords: 
Genome rearrangements
Median problem
Polynomial-time solvability
Abstract: 

Background

Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, the rank distance is equal to the minimum total weight of a series of weighted operations that leads from one genome to the other, including inversions, translocations, transpositions, and others. The computational complexity of the median-of-three problem according to this distance is currently unknown. The genome matrices are a special kind of permutation matrices, which we study in this paper.

In their paper, the authors provide an O(n3)">O(n3)O(n3) algorithm for determining three candidate medians, prove the tight approximation ratio 43">4343, and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data.

Results

In this paper, we extend their results and provide the following:

  • Three invariants characterizing the problem of finding the median of 3 matrices

  • A sufficient condition for uniqueness of medians that can be checked in O(n)

  • A faster, O(n2)">O(n2)O(n2) algorithm for determining the median under this condition

  • A new heuristic algorithm for this problem based on compressed sensing

  • A O(n4)">O(n4)O(n4) algorithm that exactly solves the problem when the inputs are orthogonal matrices, a class that includes both permutations and genomes as special cases.

Conclusions

Our work provides the first proof that, with respect to the rank distance, the problem of finding the median of 3 genomes, as well as the median of 3 permutations, is exactly solvable in polynomial time, a result which should be contrasted with its NP-hardness for the DCJ (double cut-and-join) distance and most other families of genome rearrangement operations. This result, backed by our experimental tests, indicates that the rank distance is a viable alternative to the DCJ distance widely used in genome comparisons.

Language: 
English
Document type: 
Article
File(s): 
Statistics: