Haplotype inferring via Galled-Tree Networks

Date created: 
Haplotype inferring
Galled-tree networks
Graph theory.
Human genetic variation
Haplotype inferring
Galled-tree networks
Graph theory
Complexity theory.

Despite the great progress biomedical research has made, the root causes of common human diseases remain largely unknown. It is widely believed that mutations in DNA sequences, especially those interrupting the composition of genes, can cause illness. Searches for DNA variants for some rare monogenic diseases have been successful. However, it is more complicated to determine the genetic basis of common ``multi-factorial'' diseases, which are the result of multiple genetic variants and environmental factors. In such cases, each individual causative variant may have modest effect on the phenotype and therefore is hard to detect. Haplotypes are believed to be promising genetic markers to tackle the problem. Understanding of haplotype structures will provide fundamental insights into the pathogenesis, diagnosis and treatment of common diseases. Nevertheless, current in vitro techniques for direct haplotype sequencing are prohibitive due to their high cost and low throughput. So research has turned to inferring haplotypes computationally. In this thesis, we use the galled-tree network approach on the haplotyping problem -- the computational problem of inferring haplotypes from genotypes assuming the evolutionary history for the original haplotypes satisfies the galled-tree network model. The model is an extension to the perfect phylogeny model, which was widely utilized in haplotype inferring. While galled-tree network (GTN) model is promising in its ability to handle broader range of biological data, it also increases the complexity of the corresponding haplotyping problem. It has been an open problem as to how hard this problem is, and whether there is efficient algorithm to solve it. In this thesis work, we characterized both the problem of galled-tree network's existence for binary matrices and the problem of utilizing GTN to infer haplotypes from genotypes. Furthermore, we developed a polynomial algorithm for a special case of the galled-tree network haplotyping problem. The evaluation results are promising.

The author has placed restrictions on the PDF copy of this thesis. The PDF is not printable nor copyable. If you would like the SFU Library to attempt to contact the author to get permission to print a copy, please email your request to summit-permissions@sfu.ca.
Document type: 
Copyright remains with the author
Senior supervisor: 
School of Computing Science - Simon Fraser University
Thesis type: 
Thesis (Ph.D.)