Chalmers Conferences, 9th European Conference on Mathematical and Theoretical Biology

Measures of natural selection in protein coding genes
Carina Farah Mugal, Ingemar Kaj

Last modified: 2014-04-01


The extent to which natural selection modulates protein sequence evolution has long been a matter of debate in evolutionary genetics. In recent years, there has been strong interest in identifying genes under positive selection, particularly in the lineage leading to human or chimpanzee. In this quest, the ratio of non-synonymous to synonymous divergence, dN/dS, is routinely used as an indicator of the mode and efficacy of selection. The model used for its estimation rests on the simplifying assumption that nucleotide substitutions happen instantaneously, and it draws its inference from stereotypic protein-coding sequences. We formulated a novel, extended model that is firmly anchored in population genetic theory and provides the missing link between population genetics and phylogenetics. Based on this model, we demonstrate that the dN/dS ratio is a biased estimator of natural selection because of the contribution of ancestral and lineage-specific polymorphisms. We further suggest how to overcome this bias by combining divergence and polymorphism data, an approach that appears feasible in light of current progress in genome sequencing technology and emerging population genomic studies for a growing number of species.
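
For readers unfamiliar with the statistic, the conventional dN/dS ratio discussed above can be sketched as follows. This is only an illustration of the standard counting approach with a Jukes-Cantor correction, not the extended model described in this abstract. The function names and the input counts are hypothetical, and the counts of sites and differences are assumed to have been obtained beforehand from a pairwise codon alignment.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dnds_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Conventional dN/dS from counts of differences and sites.

    All four arguments are assumed inputs (hypothetical names): the numbers
    of non-synonymous and synonymous differences between two sequences and
    the corresponding numbers of non-synonymous and synonymous sites.
    """
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n / d_s


# Illustrative, made-up counts: fewer non-synonymous than synonymous
# differences per site gives a ratio below 1.
omega = dnds_ratio(nonsyn_diffs=10, nonsyn_sites=300, syn_diffs=20, syn_sites=100)
print(f"dN/dS = {omega:.3f}")
```

A ratio below 1 is usually read as purifying selection and a ratio above 1 as positive selection. The sketch treats every observed difference as a fixed substitution, which is exactly the simplification the abstract identifies as a source of bias when ancestral or lineage-specific polymorphisms contribute to the counts.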


measures of natural selection; codon evolution