Ranking Top-$k$ Trees in Tree-Based Phylogenetic Networks
/ Abstract
Tree-based phylogenetic networks provide a powerful model for representing complex data or non-tree-like evolution. Such networks consist of an underlying evolutionary tree called a “support tree” (also known as a “subdivision tree”) together with extra arcs added between the edges of that tree. However, a tree-based network can have exponentially many support trees, and this leads to a variety of computational problems. Recently, Hayamizu established a theory called the structure theorem for rooted binary phylogenetic networks and provided linear-time and linear-delay algorithms for different problems, such as counting, optimization, and enumeration of support trees. However, in practice, it is often more useful to search for both optimal and near-optimal solutions than to calculate only an optimal solution. In the present paper, we thus consider the following problem: Given a tree-based phylogenetic network $N$ where each arc is weighted by its probability, compute the ranking of top-$k$ support trees of $N$ according to their likelihood values. We provide a linear-delay (and hence optimal) algorithm for this problem.
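To make the problem statement concrete, the following is a minimal brute-force sketch of what "ranking the top-$k$ support trees by likelihood" asks for. It is not the linear-delay algorithm of the paper. It assumes, for illustration only, that the likelihood of a tree is the product of the probabilities of its arcs and that the candidate support trees are already listed explicitly; the names `likelihood`, `top_k_trees`, and the toy input are hypothetical.

```python
import heapq
import math


def likelihood(arc_probs):
    """Product of the arc probabilities of one candidate tree (assumed model)."""
    return math.prod(arc_probs)


def top_k_trees(candidates, k):
    """Return the k (name, likelihood) pairs with the highest likelihood, best first."""
    scored = ((name, likelihood(probs)) for name, probs in candidates.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Hypothetical toy input: three candidate support trees,
# each described only by made-up probabilities of its arcs.
candidates = {
    "T1": [0.9, 0.8],
    "T2": [0.9, 0.2],
    "T3": [0.5, 0.5],
}

ranking = top_k_trees(candidates, 2)
for name, value in ranking:
    print(name, value)
```

Explicit enumeration like this is only feasible for tiny inputs, since a tree-based network can have exponentially many support trees; that gap is what motivates an algorithm whose delay between consecutive outputs is linear.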
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics