Species tree inference from multi-labeled gene trees
Reconstructing the tree of life, which depicts the evolutionary history of extant species, has been a central goal of evolutionary biology ever since Darwin. The genome of each species contains many genes, which can be grouped into families across species. The DNA sequences of the genes in a family can be used to infer a gene tree for that family, and the species tree can be seen as a “summary” of these gene histories. One way of inferring a species tree is therefore to merge multiple gene trees into a “supertree”. Many existing programs can combine gene trees into a single tree, but they assume that each gene tree contains at most one gene per species. This assumption is unrealistic, since genes undergo duplication events during the course of evolution, which leads to gene trees with multiple leaves labeled by the same species. The aim of this project is to devise efficient algorithms that can combine such multi-labeled gene trees and thus infer better species trees.
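To make the notion of a multi-labeled gene tree concrete, the following minimal Python sketch checks whether a gene tree carries the same species label on more than one leaf. It is an illustration only and not one of the algorithms the project aims to develop. It assumes that trees are given as Newick strings and that every leaf label is exactly a species name; the function names and the example trees are hypothetical.

```python
import re
from collections import Counter


def leaf_labels(newick):
    """Return the leaf labels of a Newick string, in order of appearance.

    Assumption: a leaf label directly follows "(" or ",", whereas an
    internal-node label would follow ")". Branch lengths (":0.1") are skipped.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def repeated_species(newick):
    """Map each species that labels more than one leaf to its leaf count."""
    counts = Counter(leaf_labels(newick))
    return {species: n for species, n in counts.items() if n > 1}


def is_multi_labeled(newick):
    """True if at least one species labels two or more leaves."""
    return bool(repeated_species(newick))


# Hypothetical gene trees: the second one contains two human genes,
# as could happen after a gene duplication.
single_copy_tree = "((human,chimp),(mouse,rat));"
duplicated_tree = "((human,chimp),((human,mouse),rat));"

print(is_multi_labeled(single_copy_tree))
print(repeated_species(duplicated_tree))
```

A tree for which `is_multi_labeled` returns `True` violates the "at most one gene per species" assumption described above, so it would have to be pruned or discarded before being passed to a method that relies on that assumption.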