Improving the search efficiency of a covariance model for RNA homology search

Resource type
Thesis type
(Thesis) M.Sc.
Date created
2011-10-27
Authors/Contributors
Author: Jiang, Wenbo
Abstract
In RNA gene finding, the covariance model, a probabilistic model based on a context-free grammar, provides excellent accuracy. However, its high computational complexity has limited its usefulness. This research improves the covariance model's search efficiency by building combined models for groups of different RNA families, with each group selected using a hierarchical clustering strategy. Two approaches for building combined models are proposed and implemented. The first approach uses a greedy algorithm to select base pairs from each original family's secondary structure to form a new structure, from which a combined covariance model is then built. The second approach constructs a series of combined partial covariance models, which are built from the stem-loop structural elements and are less complicated than complete models. Experimental results suggest that, for most RNA gene families investigated, the combination search method improves run time while retaining acceptable accuracy. Although limitations remain, such as recall loss for a few RNA gene families, this novel combination approach has implications for future studies on reducing the search complexity of covariance models.
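The abstract does not spell out the greedy base-pair selection, so the following is only a minimal sketch of one way such a step could look, not the thesis's actual algorithm. It assumes that each family's secondary structure is given as a list of (i, j) base pairs in a shared coordinate system, that pairs are ranked by how many families contain them, and that a pair is rejected if it shares a position with, or crosses, a pair already chosen. The names `crosses`, `greedy_combined_structure`, and `example_families` are hypothetical.

```python
from collections import Counter


def crosses(p, q):
    """Return True if base pairs p and q share a position or cross each other."""
    (i, j), (k, l) = p, q
    if len({i, j, k, l}) < 4:
        return True  # a position cannot pair twice
    return i < k < j < l or k < i < l < j  # interleaved pairs (pseudoknot)


def greedy_combined_structure(family_pairs):
    """Greedily pick base pairs, most widely shared first (illustrative assumption)."""
    # Count in how many families each base pair occurs.
    support = Counter(pair for pairs in family_pairs for pair in set(pairs))
    # Highest support first; ties broken by position so the result is deterministic.
    candidates = sorted(support, key=lambda p: (-support[p], p))
    selected = []
    for pair in candidates:
        # Keep the pair only if it is compatible with everything chosen so far.
        if all(not crosses(pair, chosen) for chosen in selected):
            selected.append(pair)
    return sorted(selected)


# Made-up toy input: three families with partly overlapping stems.
example_families = [
    [(1, 20), (2, 19), (3, 18)],
    [(1, 20), (2, 19), (5, 25)],
    [(2, 19), (3, 18), (4, 17)],
]
combined = greedy_combined_structure(example_families)
```

In this toy input, the pair (5, 25) is dropped because it crosses the stem shared by the other families, while the nested pairs are kept. A real implementation would also need to align the families and weigh the accuracy cost of discarded pairs, which this sketch ignores.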
Document
Identifier
etd6891
Copyright statement
Copyright is held by the author.
Permissions
The author granted permission for the file to be printed, but not for the text to be copied and pasted.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Wiese, Kay
Member of collection
Attachment Size
etd6891_WJiang.pdf 1.23 MB