TY - JOUR
T1 - Slicing and Dicing the Genome
T2 - A Statistical Physics Approach to Population Genetics
AU - Maruvka, Yosef E.
AU - Shnerb, Nadav M.
AU - Solomon, Sorin
AU - Yaari, Gur
AU - Kessler, David A.
PY - 2011/4
Y1 - 2011/4
AB - The inference of past demographic parameters from current genetic polymorphism is a fundamental problem in population genetics. The standard techniques utilize a reconstruction of the gene-genealogy, a cumbersome process that may be applied only to small numbers of sequences. We present a method that compares the total number of haplotypes (distinct sequences) with the model prediction. By chopping the DNA sequence into pieces we condense the immense information hidden in sequence space into a function for the number of haplotypes versus subsequence size. The details of this curve are robust to statistical fluctuations and are seen to reflect the process parameters. This procedure allows for a clear visualization of the quality of the fit and, crucially, the numerical complexity grows only linearly with the number of sequences. Our procedure is tested against both simulated data as well as empirical mtDNA data from China and provides excellent fits in both cases.
KW - Galton-Watson theory
KW - Haplotype statistics
KW - Population genetics
UR - http://www.scopus.com/inward/record.url?scp=79954628410&partnerID=8YFLogxK
U2 - 10.1007/s10955-010-0113-7
DO - 10.1007/s10955-010-0113-7
M3 - Article
AN - SCOPUS:79954628410
SN - 0022-4715
VL - 142
SP - 1302
EP - 1316
JO - Journal of Statistical Physics
JF - Journal of Statistical Physics
IS - 6
ER -