SigTools: An exploratory visualization tool for genomic signals

Author: 
Date created: 
2021-01-20
Identifier: 
etd21288
Keywords: 
Genomic Signals
Data Visualization
Epigenomics
Histone Modifications
Chromatin State Features
Abstract: 

With the advancement of sequencing technologies, genomic data sets are constantly being expanded by high volumes of different data types. One recently introduced data type in genomic science is genomic signals, in which genomic coordinates are associated with a score or probability indicating some form of biological activity. One example is epigenomic marks, which represent short-read coverage measurements over the genome and are used to locate functional and nonfunctional elements in genome annotation studies. To understand and evaluate the results of such studies, one needs to explore and analyze the characteristics of the input data. Information visualization is an effective approach that leverages human visual ability in data analysis. Several visualization applications have been deployed for this purpose, such as the UCSC genome browser, Deeptools, and Segtools. However, we believe there is room for improvement in terms of the programming skills they require and the visualizations they offer. Sigtools is an R-based exploratory visualization package designed to enable users with limited programming experience to produce statistical plots of continuous genomic data. It consists of several statistical visualizations, such as value distribution, correlation, and autocorrelation, that provide insights into the behavior of a group of signals in larger regions (such as a chromosome or the whole genome), as well as visualizing them around a specific point or short region. To demonstrate the use of Sigtools, we first visualize five histone modifications downloaded from the Roadmap Epigenomics data portal and show that Sigtools accurately captures their characteristics. Then, we visualize five chromatin state features, which are probabilistically generated genome annotations, to show how Sigtools can assist in the interpretation of new and unknown signals.

Document type: 
Thesis
Rights: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Supervisor(s): 
Kay C. Wiese
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.