One of the biggest challenges in the diagnosis, prognosis, and treatment of complex diseases such as cancer is the heterogeneity of the underlying disease mechanisms. This challenge has rendered conventional, evidence-based medicine ineffective, as a common remedy does not cure every patient with the same complex disease. The new paradigm in medicine, called precision or personalized medicine, aims to use new data collection technologies, such as high-throughput DNA sequencing, together with computational resources and algorithms, such as machine learning, to enable scientists and physicians to understand the specifics of a disease in each individual and to provide treatment strategies based on that individual's characteristics. In this thesis, we provide probabilistic graphical models to decipher the heterogeneity of diseases, with an emphasis on cancer, using recently available omics data from patients. We model heterogeneity at two levels. First, we propose unsupervised and supervised biclustering methods for detecting heterogeneity at the level of a population of patients, based on their genomic, transcriptomic, and clinical characteristics. The proposed frameworks are, in principle, also applicable to other omics data types. Second, we provide a phylogenetic analysis method to analyze the heterogeneity of the population of cells within a tumor, i.e., intra-tumor heterogeneity, based on genomic data. By transferring evolutionary information across different tumors, this method leverages inter-tumor heterogeneity to infer the intra-tumor heterogeneity of individual tumors with more certainty. The proposed methods show promising performance compared with the state of the art on both synthetic and real data.
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Ester, Martin