Genome rearrangement problems accounting for duplicate genes

Author: Mane, Aniket
We investigate genome rearrangement problems that arise in the study of genome evolution. We introduce the SCJ-TD-FD rearrangement model to explain the directed evolution from an ancestor A to a descendant D, where D may contain multiple copies of genes from A. Under this model, we first study the pairwise genome distance problem, which aims to find a most parsimonious sequence of cuts, joins and single-gene duplications that transforms A into D. Next, we study the rooted median problem under the SCJ-TD-FD model, which is shown to be NP-hard. We provide an Integer Linear Program (ILP) that, on simulated data, predicts an optimal median with high accuracy. Finally, we study the Small Parsimony Problem under the SCJ-TD-FD model, which aims to find the gene orders at the internal nodes of a given species tree. We define an ILP-based approach to reconstruct the ancestral gene orders and present experiments on a dataset of Anopheles mosquito genomes.
Thesis advisor: Chauve, Cedric
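
For orientation only: the abstract does not spell out the SCJ-TD-FD operations, so the sketch below covers just the classical single-cut-or-join (SCJ) distance without duplications, for two genomes with identical, duplicate-free gene content. Under that assumption, the distance is the size of the symmetric difference of the two adjacency sets. The genome representation (linear chromosomes as lists of signed integers) and the function names are illustrative choices, not the notation or method of the thesis.

```python
def adjacencies(genome):
    """Return the adjacency set of a genome.

    Assumed representation: a genome is a list of linear chromosomes,
    each a list of nonzero signed integers (sign = gene orientation).
    Each gene g has two extremities, tail (g, "t") and head (g, "h").
    """
    adj = set()
    for chrom in genome:
        for left, right in zip(chrom, chrom[1:]):
            # Right end of the left gene: head if forward, tail if reversed.
            a = (abs(left), "h" if left > 0 else "t")
            # Left end of the right gene: tail if forward, head if reversed.
            b = (abs(right), "t" if right > 0 else "h")
            adj.add(frozenset((a, b)))
    return adj


def scj_distance(genome_a, genome_d):
    """Classical SCJ distance, assuming equal gene content and no duplicates.

    Adjacencies only in A must be cut; adjacencies only in D must be joined.
    """
    adj_a = adjacencies(genome_a)
    adj_d = adjacencies(genome_d)
    return len(adj_a - adj_d) + len(adj_d - adj_a)


ancestor = [[1, 2, 3]]
descendant = [[1, -2, 3]]  # gene 2 inverted
print(scj_distance(ancestor, descendant))
```

Here inverting gene 2 replaces both adjacencies of the ancestor, so two cuts and two joins are needed. Handling duplicated genes, as the SCJ-TD-FD model does, requires more than this set difference and is not attempted in this sketch.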
