Recently, the coverage of non-protein-coding RNA in the scientific literature has expanded dramatically. While the functions of many non-coding RNAs are unknown, strong interest in this aspect of cellular biology is driving the development of methods for detecting non-coding genes and transcripts. During the same period, RNA sequencing's high throughput and high spatial resolution have established it as the preferred method for characterising transcriptomes. Many groups are now sequencing transcriptomes, and de novo transcriptome assembly methods are being developed for situations in which no reference genome is available. We propose a methodology that is compatible with de novo transcriptome assembly and that uses sequence, structural and genomic features to classify transcripts as non-coding or protein-coding RNA, and to distinguish between different types of non-coding RNA. We have applied our technique to a variety of known RNA sequences and have explored its use on contigs from the Trans-ABySS assembly pipeline for RNA-Seq data from normal mouse tissues.
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Thesis advisor: Ester, Martin