A non-zero variance of Tajima's estimator for two sequences even for infinitely many unlinked loci

Léandra King, John Wakeley, Shai Carmi*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

The population-scaled mutation rate, θ, is informative about the effective population size and is thus widely used in population genetics. We show that for two sequences and n unlinked loci, the variance of Tajima's estimator (θ̂), which is the average number of pairwise differences, does not vanish even as n→∞. The non-zero variance of θ̂ results from a (weak) correlation between coalescence times even at unlinked loci, which, in turn, is due to the underlying fixed pedigree shared by gene genealogies at all loci. We derive the correlation coefficient under a diploid, discrete-time, Wright–Fisher model, and we also derive a simple, closed-form lower bound. We also obtain empirical estimates of the correlation of coalescence times under demographic models inspired by large-scale human genealogies. While the effect we describe is small (Var[θ̂]/θ² ≈ O(Nₑ⁻¹)), it is important to recognize this feature of statistical population genetics, which runs counter to commonly held notions about unlinked loci.
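
The step from a weak correlation of coalescence times to a non-vanishing variance can be sketched with a short calculation. This is only an illustration under simplifying assumptions that are not taken from the abstract: the n loci are exchangeable, the pairwise coalescence time T_i at locus i is scaled so that E[T_i] = 1 with variance σ_T² (roughly 1 under the standard coalescent), every pair of loci has the same correlation ρ, and the number of differences S_i at locus i is Poisson with mean θT_i, independently across loci given the times. The paper's own derivation under the diploid Wright–Fisher model may differ in detail.

```latex
% Assumed notation (illustrative, not from the abstract):
%   T_i       pairwise coalescence time at locus i, E[T_i] = 1, Var(T_i) = \sigma_T^2
%   \rho      Corr(T_i, T_j) for i \neq j
%   S_i       number of pairwise differences at locus i, S_i | T_i ~ Poisson(\theta T_i)
\begin{align*}
\hat{\theta} &= \frac{1}{n}\sum_{i=1}^{n} S_i, \\
\operatorname{Var}(S_i) &= \mathbb{E}[\theta T_i] + \operatorname{Var}(\theta T_i)
  = \theta + \theta^{2}\sigma_T^{2}, \\
\operatorname{Cov}(S_i, S_j) &= \theta^{2}\operatorname{Cov}(T_i, T_j)
  = \theta^{2}\rho\,\sigma_T^{2} \qquad (i \neq j), \\
\operatorname{Var}(\hat{\theta}) &= \frac{\theta + \theta^{2}\sigma_T^{2}}{n}
  + \frac{n-1}{n}\,\theta^{2}\rho\,\sigma_T^{2}
  \;\xrightarrow[n\to\infty]{}\; \theta^{2}\rho\,\sigma_T^{2}.
\end{align*}
```

Under these assumptions the first term decays as 1/n while the covariance term does not, so with σ_T² ≈ 1 the limiting ratio Var[θ̂]/θ² is approximately ρ, which matches the stated order of magnitude if ρ is of order Nₑ⁻¹.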

Original language: English
Pages (from-to): 22-29
Number of pages: 8
Journal: Theoretical Population Biology
Volume: 122
State: Published - Jul 2018

Bibliographical note

Publisher Copyright:
© 2017 Elsevier Inc.

Keywords

  • Coalescent theory
  • Effective population size
  • Genealogies
  • Heterozygosity
  • Pedigrees
  • Recombination
