Phylogenomic coestimation of gene trees and species trees

Genomes are often analyzed in a historical, evolutionary framework. To establish this framework, we have developed a new method that reconstructs both gene trees and species trees with high accuracy.

HFSP Long-Term Fellow Bastien Boussau and colleagues
authored on Tue, 30 April 2013

Extant genomes are the product of billions of years of evolution. To understand how they work, this evolutionary history needs to be taken into account. Unfortunately, this history cannot be observed directly and must therefore be inferred. We propose a new method that infers the evolutionary history of extant genomes by jointly reconstructing the phylogeny of species and the phylogenies of the many genes that make up their genomes.

Our method improves upon previous approaches in its principle, its requirements, and its results. It is based on a probabilistic model and allows a fully statistical reconstruction of the species tree and of the gene trees. It does not require the species tree or the dates of speciation to be known in advance; gene alignments are the only input. Analyses of simulated data show that the reconstructed gene trees are of higher quality than those obtained by state-of-the-art methods, and that the species tree is accurately recovered.
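To make the idea of joint reconstruction concrete, here is a minimal toy sketch in Python. It is not the published method or its software. All numbers, family names, and function names are hypothetical, and the term linking a gene tree to the species tree is a crude placeholder for a real probabilistic model. The sketch only illustrates the general principle: each candidate species tree is scored by combining, for every gene family, the support from the alignment with the fit of the gene tree to that species tree.

```python
import math

# Hypothetical log P(alignment | gene tree) values for two gene families.
# These numbers are invented for illustration only.
SEQ_LOGLIK = {
    "family1": {"((A,B),C)": -100.0, "((A,C),B)": -101.0},
    "family2": {"((A,B),C)": -200.0, "((A,C),B)": -199.5},
}


def gene_tree_given_species_loglik(gene_tree, species_tree):
    """Placeholder for log P(gene tree | species tree).

    Assumption: a gene tree identical to the species tree is simply given
    a higher probability. A real model would be far richer than this.
    """
    return math.log(0.8) if gene_tree == species_tree else math.log(0.2)


def joint_loglik(species_tree):
    """Return the joint log-likelihood and the best gene tree per family."""
    total = 0.0
    best_gene_trees = {}
    for family, seq_scores in SEQ_LOGLIK.items():
        # Combine sequence support with agreement to the species tree.
        scored = {
            g: s + gene_tree_given_species_loglik(g, species_tree)
            for g, s in seq_scores.items()
        }
        best = max(scored, key=scored.get)
        best_gene_trees[family] = best
        total += scored[best]
    return total, best_gene_trees


def coestimate(candidate_species_trees):
    """Pick the candidate species tree with the highest joint score."""
    results = []
    for s in candidate_species_trees:
        total, gene_trees = joint_loglik(s)
        results.append((s, total, gene_trees))
    return max(results, key=lambda r: r[1])


best_species, best_total, best_gene_trees = coestimate(
    ["((A,B),C)", "((A,C),B)", "((B,C),A)"]
)
print(best_species, round(best_total, 3), best_gene_trees)
```

In this toy setup, the alignment of family2 on its own slightly favours a different topology, but the joint score selects the same topology for both families and for the species tree. This is the sense in which information shared across gene families can inform each individual gene tree.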

Analysis of mammalian genomes using our method shows that it improves upon widely used methods to reconstruct gene trees, while at the same time reconstructing a species tree in agreement with current views on mammalian classification. We also show that our method provides ways to reconstruct ancestral genomes more accurately than competing approaches.


Genome-scale coestimation of species and gene trees. Boussau B, Szöllosi GJ, Duret L, Gouy M, Tannier E, Daubin V. Genome Res. 2013 Feb;23(2):323-30. doi: 10.1101/gr.141978.112.
