This dissertation presents SARNA-Predict, a novel algorithm for ribonucleic acid (RNA) secondary structure prediction based on Simulated Annealing (SA). SA is known to be effective for many types of minimization problems and for locating the global minimum in a solution space. Based on free energy minimization techniques, SARNA-Predict heuristically searches for a structure whose free energy is close to the minimum free energy ΔG for a strand of RNA, within given constraints. SARNA-Predict has also been extended to predict RNA secondary structures with pseudoknots. Although dynamic programming algorithms are guaranteed to return the minimum free energy structure, the lowest free energy structure is not always the correct native structure, mostly because of imperfections in the currently available thermodynamic models. Because SARNA-Predict can incorporate different thermodynamic models (INN-HB, efn2, and HotKnots) during free energy evaluation, it has an advantage over algorithms such as mfold, which can only predict pseudoknot-free structures and cannot readily be extended to use other thermodynamic models. SARNA-Predict encodes an RNA secondary structure as a permutation of pre-computed helices. A novel swap mutation operator and different annealing schedules were incorporated into the algorithm. The prediction accuracy of the new algorithm is evaluated by comparison with several state-of-the-art prediction algorithms. Sensitivity and specificity were measured for nine prediction algorithms. Four are dynamic programming algorithms: mfold, Pseudoknot (pknotsRE), NUPACK, and pknotsRG-mfe. The other five are heuristic algorithms: P-RnaPredict, SARNA-Predict, HotKnots, ILM, and STAR.
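The combination of a helix permutation, a swap mutation, and an annealing schedule can be illustrated with a minimal sketch. It is not the dissertation's implementation. The helix record layout, the greedy decoding rule (helices are kept in permutation order and any helix that reuses an already paired base is skipped), the additive toy energy, the geometric cooling schedule, and every function and parameter name below are assumptions made for illustration. A real run would replace `energy` with a thermodynamic model such as INN-HB or efn2.

```python
import math
import random

# Hypothetical helix record: (start_i, end_j, length, energy).
# A helix of length n pairs base i+k with base j-k for k = 0..n-1.

def helix_bases(helix):
    """Return the set of base positions that a helix occupies."""
    i, j, n, _ = helix
    return set(range(i, i + n)) | set(range(j - n + 1, j + 1))

def decode(perm, helices):
    """Keep helices in permutation order, skipping any that reuse a base."""
    used, kept = set(), []
    for idx in perm:
        bases = helix_bases(helices[idx])
        if used.isdisjoint(bases):
            used |= bases
            kept.append(idx)
    return kept

def energy(perm, helices):
    """Toy energy: sum of per-helix energies (stand-in for a real model)."""
    return sum(helices[idx][3] for idx in decode(perm, helices))

def anneal(helices, t_start=10.0, t_end=0.01, alpha=0.95,
           steps_per_temp=50, seed=0):
    """Simulated annealing over helix permutations (needs >= 2 helices)."""
    rng = random.Random(seed)
    perm = list(range(len(helices)))
    rng.shuffle(perm)
    current = energy(perm, helices)
    best_perm, best = perm[:], current
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            # Swap mutation: exchange two positions in the permutation.
            a, b = rng.sample(range(len(perm)), 2)
            perm[a], perm[b] = perm[b], perm[a]
            candidate = energy(perm, helices)
            delta = candidate - current
            # Metropolis criterion: always accept improvements, and accept
            # worse states with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if current < best:
                    best_perm, best = perm[:], current
            else:
                perm[a], perm[b] = perm[b], perm[a]  # undo rejected swap
        t *= alpha  # geometric cooling (one possible annealing schedule)
    return best_perm, best

# Toy input: helices 0 and 1 share bases, so at most one of them is kept.
helices = [(0, 20, 4, -5.0), (2, 18, 3, -3.0),
           (6, 14, 3, -4.0), (30, 40, 3, -2.0)]
best_perm, best_energy = anneal(helices)
print(sorted(decode(best_perm, helices)), best_energy)
```

Because the decoder only rejects helices that share bases, this sketch does not distinguish nested from crossing helices. Whether pseudoknotted structures are allowed would be decided by the decoding rule and the energy model used.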
The prediction accuracy of the new algorithm was verified against native structures. Experiments were performed on thirty-three individual known structures from eleven RNA classes (tRNA, viral RNA, anti-genomic HDV, telomerase RNA, tmRNA, rRNA, RNaseP, 5S rRNA, Group I intron 23S rRNA, Group I intron 16S rRNA, and 16S rRNA). The results presented in this dissertation demonstrate that SARNA-Predict can outperform other state-of-the-art algorithms in terms of prediction accuracy.
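As a rough illustration of how a predicted structure can be scored against a native one, the sketch below computes sensitivity and specificity from sets of base pairs. It assumes the definitions commonly used in the RNA structure prediction literature: sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP). The abstract does not state which definitions the dissertation uses, and the function name and the toy base pairs are hypothetical.

```python
def pair_accuracy(predicted_pairs, native_pairs):
    """Return (sensitivity, specificity) for base pairs given as (i, j) tuples."""
    predicted, native = set(predicted_pairs), set(native_pairs)
    tp = len(predicted & native)   # predicted pairs that are in the native structure
    fp = len(predicted - native)   # predicted pairs that are not native
    fn = len(native - predicted)   # native pairs that were missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity

# Toy example (made-up pairs, not data from the dissertation).
native = [(1, 10), (2, 9), (3, 8), (15, 20)]
predicted = [(1, 10), (2, 9), (4, 7)]
sens, spec = pair_accuracy(predicted, native)
print(sens, spec)
```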
Copyright is held by the author.
The author has not granted permission for the file to be printed nor for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact firstname.lastname@example.org.