Algorithms for structural variation discovery and protein-protein interaction prediction

Resource type
Thesis type
(Thesis/Dissertation) Ph.D.
Date created
This thesis has two main parts. The first part introduces human genomic sequences, next-generation sequencing technologies, the structural differences among the genomes of different individuals, and the 1000 Genomes Project. We then discuss the problems of finding novel sequence insertions and mobile element insertions (e.g., Alu elements) in sequenced genomes. To identify these genomic variations with much higher accuracy than is currently possible, we propose to move away from the current model of (1) detecting genomic variations in individual next-generation sequenced (NGS) donor genomes independently and (2) checking whether two or more donor genomes agree or disagree on the variations; we call this model the independent structural variation detection and merging (ISV&M) framework. As an alternative, we propose a new model in which genomic variation is detected among multiple genomes simultaneously. The second part of the thesis focuses on a different project, concerned with gene tree alignment. The aim is to present the first efficient approach to the problem of determining the interaction partners among protein/domain families. This is a hard computational problem, particularly in the presence of paralogous proteins. We devise a deterministic algorithm that directly maximizes the similarity between two leaf-labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question.
Copyright statement
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Sahinalp, Cenk
Member of collection
Download file Size
etd7442_IHajirasouliha.pdf 3.11 MB

Views & downloads - as of June 2023

Views: 0
Downloads: 1