Mathematical phylogenetics

Developing the mathematical framework and algorithms to untangle the web of life is the central focus of this group.

Many popular software packages that reconstruct phylogenetic trees and networks (e.g. BEAST and Dendroscope) have their foundations in deep mathematical results. In mathematical phylogenetics, we develop new mathematical tools and algorithms to solve problems related to the reconstruction and analysis of phylogenetic trees and networks.

Such trees and networks represent ancestral relationships between living entities and have applications in evolutionary biology, linguistics, cancer research, and epidemiology. While phylogenetic trees are traditionally used to represent ancestral relationships between organisms, recent investigations into horizontal gene transfer and hybridization, processes that result in mosaic patterns of relationships, challenge the model of a phylogenetic tree.

It is now widely acknowledged that phylogenetic networks, graphs that unlike trees may contain cycles (when edge directions are ignored), are better suited to represent such evolutionary histories because they provide a more accurate picture of the relationships between organisms. From a mathematical perspective, phylogenetic networks pose many challenging questions: they are much more entangled than trees, which makes the underlying problems more complicated to address.
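To make the difference between a tree and a network concrete, here is a minimal sketch in Python. It assumes a rooted network stored as a list of directed (parent, child) edges and counts reticulation events as the sum of (in-degree - 1) over all vertices with more than one parent, a quantity often called the hybridization number of the network. The function name, the edge-list representation, and the example taxa are illustrative assumptions, not part of any software mentioned above.

```python
from collections import defaultdict


def reticulation_number(edges):
    """Count reticulation events in a rooted network.

    `edges` is a list of (parent, child) pairs. Each vertex with
    in-degree d >= 2 contributes d - 1 to the total. A tree has no
    such vertex, so its reticulation number is 0.
    """
    indegree = defaultdict(int)
    for _parent, child in edges:
        indegree[child] += 1
    return sum(d - 1 for d in indegree.values() if d >= 2)


# Hypothetical tree on taxa a, b, c: every vertex has at most one parent.
tree_edges = [
    ("root", "u"), ("root", "c"),
    ("u", "a"), ("u", "b"),
]

# Hypothetical network: vertex "r" has two parents ("u" and "v"),
# modelling taxon b as a hybrid of two lineages.
network_edges = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "r"),
    ("v", "r"), ("v", "c"),
    ("r", "b"),
]

print(reticulation_number(tree_edges))
print(reticulation_number(network_edges))
```

Counting reticulations in a given network is easy. The hard problems arise when a network with as few reticulations as possible has to be found from data.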

Calculating the minimum hybridization number under time constraints

Dr Simone Linz
School of Computer Science

Computational and geometric aspects of phylogenetic time-trees

It has recently become apparent that Bayesian phylogenetic tree inference methods have fundamental limitations on the size of the data sets they can handle. These limitations are due to the complexity of tree space. Phylogenetic tree space possesses unique geometric properties, in the sense that its geometry differs in many aspects from classical and well-studied geometries. New theory is needed to understand the structure of tree space, to properly estimate the complexity of phylogenetic algorithms, and to develop inference methods that are both efficient and statistically consistent, and capable of handling contemporary genomic data sets.

We have developed an efficient approach for statistical analysis of finite sets of phylogenetic time-trees with contemporaneous taxa. This approach uses a novel geometry and algorithms to efficiently estimate various statistics of a posterior sample from the space of ultrametric trees. Furthermore, this work led to the notion of a discrete time-tree. In joint work with Chris Whidden and Erick Matsen, we explored a construction of the space of discrete time-trees analogous to the classical NNI space on phylogenetic gene trees. The geometry of this space is (surprisingly) fundamentally different from that of NNI. Our goal now is to determine the complexity of the discrete time-tree space. Depending on this complexity, the result might change the way people do Bayesian inference of phylogenetic time-trees.
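As a small illustration of what "ultrametric" means for time-trees with contemporaneous taxa, the following Python sketch checks whether all leaves of a rooted tree lie at the same distance from the root. The dictionary-based tree representation, the function names, and the example branch lengths are assumptions made for this sketch only. It does not reproduce the geometry or the algorithms described above.

```python
def leaf_depths(tree, node, depth=0.0):
    """Return the root-to-leaf distances of all leaves below `node`.

    `tree` maps each internal node to a list of (child, branch_length)
    pairs. Nodes that do not appear as keys are treated as leaves.
    """
    children = tree.get(node, [])
    if not children:
        return [depth]
    depths = []
    for child, length in children:
        depths.extend(leaf_depths(tree, child, depth + length))
    return depths


def is_ultrametric(tree, root, tol=1e-9):
    """True if every leaf is equally far from the root (up to `tol`)."""
    depths = leaf_depths(tree, root)
    return max(depths) - min(depths) <= tol


# Hypothetical time-tree: leaves a, b, c are all sampled at the same time.
time_tree = {
    "root": [("u", 1.0), ("c", 3.0)],
    "u": [("a", 2.0), ("b", 2.0)],
}

# Same topology, but leaf c is closer to the root than a and b.
skewed_tree = {
    "root": [("u", 1.0), ("c", 2.5)],
    "u": [("a", 2.0), ("b", 2.0)],
}

print(is_ultrametric(time_tree, "root"))
print(is_ultrametric(skewed_tree, "root"))
```

In an ultrametric tree the internal nodes can be ordered by their time, which is one way to arrive at a discrete description of a time-tree.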