A key challenge in cancer genomics is the identification and prioritization of genomic aberrations that potentially act as drivers of cancer. HIT’nDRIVE is a combinatorial method that identifies aberrant genes which can collectively influence possibly distant “outlier” genes. It is based on the “random-walk facility location” (RWFL) problem on an interaction network. RWFL uses the “multi-hitting time”, the expected minimum length of a random walk originating from any aberrant gene towards an outlier. HIT’nDRIVE aims to find the smallest set of aberrant genes from which one can reach the outliers within a desired multi-hitting time. It estimates the multi-hitting time from the independent hitting times and reduces RWFL to a weighted multi-set cover problem, which it solves as an integer linear program (ILP). We apply HIT’nDRIVE to identify aberrant genes that potentially act as drivers in a cancer data set, and we show that phenotype predictions made using only these potential drivers are more accurate than those of alternative approaches.
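The two ingredients named in the abstract, hitting times on a network and a covering problem over aberrant genes, can be illustrated with a minimal Python sketch. It is not the HIT’nDRIVE implementation. It assumes an unweighted, connected toy graph, treats an aberrant gene as covering an outlier when its individual hitting time is at most a threshold (a plain set cover rather than the weighted multi-set cover), and uses brute-force search in place of an ILP solver. The names `hitting_times`, `smallest_cover`, and `max_time` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def hitting_times(adj, target):
    """Expected number of steps for a simple random walk to first reach
    `target` from every node. Assumes an undirected, connected graph."""
    nodes = [u for u in adj if u != target]
    idx = {u: i for i, u in enumerate(nodes)}
    n = len(nodes)
    # Augmented system for h(u) - (1/deg u) * sum_{v ~ u, v != target} h(v) = 1
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for u in nodes:
        i = idx[u]
        A[i][i] += 1
        A[i][n] = Fraction(1)
        for v in adj[u]:
            if v != target:
                A[i][idx[v]] -= Fraction(1, len(adj[u]))
    # Gauss-Jordan elimination with exact rational arithmetic
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    h = {u: A[idx[u]][n] for u in nodes}
    h[target] = Fraction(0)
    return h


def smallest_cover(adj, aberrant, outliers, max_time):
    """Smallest subset of `aberrant` such that every outlier is within
    `max_time` expected steps of some chosen gene (simplified stand-in
    for the weighted multi-set cover; brute force instead of an ILP)."""
    h = {o: hitting_times(adj, o) for o in outliers}
    covers = {g: {o for o in outliers if h[o][g] <= max_time} for g in aberrant}
    for k in range(1, len(aberrant) + 1):
        for subset in combinations(aberrant, k):
            if set().union(*(covers[g] for g in subset)) == set(outliers):
                return set(subset)
    return None  # no subset covers all outliers at this threshold


# Toy interaction network: the path 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
aberrant, outliers = [0, 2, 4], [1, 3]
loose = smallest_cover(adj, aberrant, outliers, max_time=5)
tight = smallest_cover(adj, aberrant, outliers, max_time=1)
```

On this toy path, gene 2 reaches each outlier in 5 expected steps, so it alone suffices at the looser threshold, while the tighter threshold requires both genes 0 and 4. Brute force is only workable for a handful of genes, which is why a formulation solvable as an ILP matters at genome scale.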
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Thesis advisor: Sahinalp, Cenk