Resource type
Thesis type
(Thesis) M.Sc.
Date created
2018-11-30
Authors/Contributors
Author: Douglas, Matthew
Abstract
Annotating the genome of the nematode Caenorhabditis elegans has been an ongoing challenge for the last twenty years. Studies have leveraged high-throughput RNA sequencing (RNA-Seq) to uncover evidence for thousands of novel splicing events, indicating that the current annotations are far from complete. Yet it remains uncertain whether the many rare events represent functional transcripts or simply biological noise. We developed a method that leverages the wealth of publicly available RNA-Seq data to quantitatively evaluate the completeness of the current C. elegans genome annotation. We identified 134,949 novel high-quality introns and 204,812 novel high-quality exons. We found that many introns and exons are rarely expressed overall but strongly expressed at specific developmental stages, suggesting a functional role. We assembled a high-quality set of 72,274 protein-coding transcripts to show that only a fraction of the coding transcriptome of C. elegans is represented in the current genome annotation.
Identifier
etd20050
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Chen, Jack
Member of collection
Download file | Size |
---|---|
etd20050.pdf | 4.53 MB |