Chalmers Conferences, 9th European Conference on Mathematical and Theoretical Biology

Limit Theorems for Renyi Entropy and Divergence with Applications to DNA Diversity Analysis
Michal Tadeusz Seweryn

Last modified: 2014-03-27

Abstract


Measurements of diversity and overlap between populations have been of interest in many areas of the life sciences (in particular, in ecology) for decades. With emerging tools such as high-throughput sequencing technology, the same diversity questions may now be asked at the molecular level, for example, in many areas of immunology and genomics. However, even though sequencing experiments allow for the analysis of highly diverse classes of DNA and RNA molecules, the results of such studies are often obscured by a large number of sequencing errors.

In this work we develop new information-theoretic approaches to separating the true, highly diverse DNA signal from sequencing errors (noise). To this end, we study the asymptotic properties of Renyi entropy and divergence functionals for triangular arrays of row-wise independent random vectors and prove the relevant limit theorems via the usual projection and martingale methods. We apply our results to problems of molecular diversity analysis, such as the comparison of T-cell receptor populations studied earlier in [1], [2].
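For orientation, the two functionals named above can be illustrated with a minimal sketch that uses only the standard textbook definitions: Renyi entropy of order alpha, H_alpha(P) = log(sum_i p_i^alpha) / (1 - alpha), and Renyi divergence, D_alpha(P||Q) = log(sum_i p_i^alpha q_i^(1 - alpha)) / (alpha - 1). The code applies naive plug-in estimates to clonotype count vectors. It is not the estimator or the limit-theorem machinery developed in this work; the function names and the count-vector input format are assumptions made for illustration.

```python
import math


def renyi_entropy(counts, alpha):
    """Naive plug-in Renyi entropy (in nats) of order alpha > 0.

    `counts` is a hypothetical vector of per-clonotype read counts.
    For alpha == 1 the Shannon entropy (the limiting case) is returned.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if alpha == 1:
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def renyi_divergence(counts_p, counts_q, alpha):
    """Naive plug-in Renyi divergence D_alpha(P || Q) for alpha > 0, alpha != 1.

    Both vectors must list counts for the same clonotypes in the same order.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    n_p, n_q = sum(counts_p), sum(counts_q)
    s = 0.0
    for cp, cq in zip(counts_p, counts_q):
        if cp == 0:
            continue  # p^alpha = 0, so the term vanishes
        if cq == 0:
            if alpha > 1:
                return math.inf  # P puts mass where Q has none
            continue  # for alpha < 1 the term q^(1 - alpha) is 0
        s += (cp / n_p) ** alpha * (cq / n_q) ** (1.0 - alpha)
    if s == 0.0:
        return math.inf  # disjoint supports
    return math.log(s) / (alpha - 1.0)
```

With four equally abundant clonotypes, `renyi_entropy([5, 5, 5, 5], 2)` equals log 4, and the divergence of a count vector from itself is 0. Plug-in estimates of this kind are sensitive to rare, possibly erroneous sequences, which is the setting the abstract is concerned with.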

[1] G.A. Rempala, M. Seweryn, Methods for diversity and overlap analysis in T-cell receptor populations, Journal of Mathematical Biology (2013)

[2] A. Cebula, M. Seweryn, G.A. Rempala, (..) L. Ignatowicz, Thymus-derived regulatory T-cells control tolerance to commensal microbiota, Nature, 497 (2013), 258–262.


Keywords


next generation sequencing; T-cell receptor repertoire; diversity and overlap measurement; limit theorems; Renyi entropy and divergence