Author: Mahmoody Ghaidary, Ahmad
Genomes can be modeled by sets of adjacencies between genomic markers. There are different ways of measuring the dissimilarities between pairs of genomes, in term of genomic rearrangements. The most widely used dissimilarities are distance functions on genomes. In the present work, we consider the Double-Cut-and-Join (DCJ) distance model, denoted by dDCJ . A DCJ median of three genomes G1 , G2 , and G3 is a genome M which minimizes the sum dDCJ (M, G1 ) + dDCJ (M, G2 ) + dDCJ (M, G3 ). The problem of computing a DCJ median has been shown to be NP-hard. Currently, very few tractability results exist for this problem. The breakpoint graph is a fundamental combinatorial object for modeling and studying genomes. For example, the DCJ distance of two genomes can be obtained from the following parameters of their breakpoint graph: (i) number of vertices, (ii) number of cycles, and (iii) number of odd paths (paths with an odd number of vertices). Also ﬁnding a DCJ median for three genomes is equivalent to ﬁnding a matching in their breakpoint graph which maximizes the total number of alternating cycles and half number of odd paths. The maximum degree of a breakpoint graph of three genomes is at most 3. So ﬁnding such matching is NP-hard. We prove in this thesis that if the maximum degree is 2, the DCJ median problem is tractable. Therefore, hardness of the problem is due to the vertices of degree 3. Additionally, we introduce an FPT algorithm for the problem when the number of vertices of degree 3 is bounded.
Copyright is held by the author.
The author granted permission for the file to be printed, but not for the text to be copied and pasted.
Supervisor or Senior Supervisor
Thesis advisor: Chauve, Cedric
Thesis advisor: Stacho, Ladislav
Member of collection