This function reads Fst from a directory with precomputed f2-statistics, and turns per-block data
into estimates and standard errors for each population pair. See details
for how Fst is computed.
fst(data, pop1 = NULL, pop2 = NULL, boot = FALSE, verbose = FALSE, ...)
data: Input data in one of three forms:
A 3d array of blocked Fst, output of f2_from_precomp with fst = TRUE
A directory which contains pre-computed Fst
The prefix of genotype files
pop1: One of the following four:
NULL: all possible population combinations will be returned
A vector of population labels. All combinations with the other pop arguments will be returned
A matrix with population combinations to be tested, with one population per column and one combination per row. Other pop arguments will be ignored.
The location of a file (poplistname or popfilename) which specifies the populations or population combinations to be tested. Other pop arguments will be ignored.
pop2: A vector of population labels
boot: If FALSE (the default), block-jackknife resampling will be used to compute standard errors. Otherwise, block-bootstrap resampling will be used. If boot is an integer, it specifies the number of bootstrap resamplings. If boot = TRUE, the number of bootstrap resamplings will be equal to the number of SNP blocks.
verbose: Print progress updates
...: Additional arguments passed to f2_from_geno when data is a genotype prefix
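The two most common ways of specifying population pairs through pop1 and pop2 can be illustrated in base R. This is only a sketch of the semantics described above, not the function's internal code; the labels are taken from the example at the end of this page and the variable names are hypothetical.

```r
# Population labels used only for illustration
pop1 <- c("Denisova.DG")
pop2 <- c("Altai_Neanderthal.DG", "Vindija.DG")

# Vector form: every label in pop1 is combined with every label in pop2
pairs_from_vectors <- expand.grid(pop1 = pop1, pop2 = pop2,
                                  stringsAsFactors = FALSE)

# Matrix form: one population per column, one combination per row
pairs_from_matrix <- rbind(c("Denisova.DG", "Vindija.DG"))
colnames(pairs_from_matrix) <- c("pop1", "pop2")
```

With the vector form, one population in pop1 and two in pop2 give two pairs. With the matrix form, only the listed row is tested.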
The Hudson Fst estimator used here is described in the two publications below.
For two populations with estimated allele frequency vectors p1 and p2, and allele count vectors n1 and n2, it is calculated as follows:
num = (p1 - p2)^2 - p1*(1-p1)/(n1-1) - p2*(1-p2)/(n2-1)
denom = p1 + p2 - 2*p1*p2
fst = mean(num)/mean(denom)
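The formula can be written as a small self-contained R function. This is a minimal sketch: hudson_fst is a hypothetical helper name, not a function exported by the package, and the input values are made up.

```r
# Sketch of the formula above; hudson_fst is a hypothetical helper name.
# p1, p2: estimated allele frequencies per SNP
# n1, n2: allele counts per SNP
hudson_fst <- function(p1, p2, n1, n2) {
  num <- (p1 - p2)^2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
  denom <- p1 + p2 - 2 * p1 * p2
  # Ratio of averages across SNPs, not the average of per-SNP ratios
  mean(num) / mean(denom)
}

# Made-up frequencies and allele counts for three SNPs
p1 <- c(0.1, 0.5, 0.9)
p2 <- c(0.3, 0.4, 0.6)
n1 <- c(20, 20, 18)
n2 <- c(30, 28, 30)
hudson_fst(p1, p2, n1, n2)
```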
This is done independently for each SNP block, and is stored on disk for each population pair.
Jackknifing or bootstrapping across these per-block estimates yields the overall estimates and standard errors.
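The resampling step can be sketched on a vector of per-block estimates. This is a simplified, unweighted illustration with made-up values: it assumes all blocks count equally, which may differ from how the package weights blocks, and all variable names are hypothetical.

```r
# Hypothetical per-block Fst estimates for one population pair
block_est <- c(0.12, 0.10, 0.15, 0.11, 0.13, 0.09)
n_blocks <- length(block_est)

# Delete-one block jackknife: recompute the estimate with each block left out
loo <- sapply(seq_len(n_blocks), function(i) mean(block_est[-i]))
jack_est <- mean(loo)
jack_se <- sqrt((n_blocks - 1) / n_blocks * sum((loo - jack_est)^2))

# Block bootstrap: resample blocks with replacement and recompute the estimate
set.seed(1)
n_boot <- n_blocks  # mirrors boot = TRUE; an integer boot would set this directly
boot_est <- replicate(n_boot,
                      mean(sample(block_est, n_blocks, replace = TRUE)))
boot_se <- sd(boot_est)
```

In this unweighted case the jackknife standard error equals sd(block_est) / sqrt(n_blocks), which gives a quick sanity check on the resampling code.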
Reich, D. (2009). Reconstructing Indian population history. Nature.
Bhatia, G. (2013). Estimating and interpreting Fst: the impact of rare variants. Genome Research.
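if (FALSE) {
# f2_dir is a placeholder for a directory with precomputed Fst; it is not defined here
pop1 = 'Denisova.DG'
pop2 = c('Altai_Neanderthal.DG', 'Vindija.DG')
# Fst between pop1 and each population in pop2
fst(f2_dir, pop1, pop2)
}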