A Tsallis-Entropy Lens on Genetic Variation
cs.IT
/ Abstract
We introduce an information-theoretic generalization of the fixation statistic, the Tsallis-order $q$ F-statistic, $F_q$, which measures the fraction of Tsallis $q$-entropy lost within subpopulations relative to the pooled population. The family nests the classical variance-based fixation index $F_{\textbf{ST}}$ at $q{=}2$ and a Shannon-entropy analogue at $q{=}1$, whose absolute form equals the mutual information between alleles and population labels. By varying $q$, $F_q$ acts as a spectral differentiator that up-weights rare variants at low $q$, while $q{>}1$ increasingly emphasizes common variants, providing a more fine-grained view of differentiation than $F_{\textbf{ST}}$ when allele-frequency spectra are skewed. On real data (865 Oceanian genomes with 1,823,000 sites) and controlled genealogical simulations (seeded from 1,432 founders from HGDP and 1000 Genomes panels, with 322,216 sites), we show that $F_q$ in One-vs-Rest (OVR) and Leave-One-Out (LOO) modes provides clear attribution of which subpopulations drive regional structure, and sensitively timestamps isolation-migration events and founder effects. $F_q$ serves as finer-resolution complement for simulation audits and population-structure summaries.