Counting, Generating, Analyzing and Sampling Tree Alignments

Peer reviewed: 
Yes, item is peer reviewed.
Scholarly level: 
Faculty/Staff
Final version published as: 

Counting, Generating, Analyzing and Sampling Tree Alignments Cedric Chauve, Julien Courtiel and Yann Ponty International Journal of Foundations of Computer ScienceVol. 29, No. 05, pp. 741-767 (2018) Doi: 10.1142/S0129054118420030

Date created: 
2018-03-03
Keywords: 
Bioinformatics
Combinatorics
RNA
Secondary structure
Abstract: 

Pairwise ordered tree alignment are combinatorial objects that appear unimportant applications, such as RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by mean of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees SS and TT. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door for a probabilistic analysis of the space of suboptimal alignments.

Language: 
English
Document type: 
Article
Rights: 
The full text of this paper will be available in March 3 2019 due to the embargo policies of World Scientific. Contact summit@sfu.ca to enquire if the full text of the accepted manuscript can be made available to you.
Statistics: