qpwave compares two sets of populations (left and right) to each other. It estimates a lower bound on the number of admixture waves that went from left into right by comparing a matrix of f4-statistics to low-rank approximations. For a rank of 0, this is equivalent to testing whether left and right form clades relative to each other.
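The idea of comparing a matrix to low-rank approximations can be sketched in base R. This is only an illustration: it uses a plain truncated SVD on a made-up matrix, whereas qpwave fits the approximations while accounting for the covariance of the f4-statistics. The names f4_toy and low_rank_approx are hypothetical and not part of the package.

```r
# Toy stand-in for an f4 matrix: 3 rows by 4 columns (values are made up).
# Every row is a multiple of the same vector, so the matrix has rank 1.
base_row <- c(1, 2, -1, 0.5)
f4_toy <- rbind(1 * base_row, 2 * base_row, -3 * base_row)

# Best rank-r approximation (in the least-squares sense) from a truncated SVD.
# Assumption: unweighted fit, unlike the covariance-weighted fit in qpwave.
low_rank_approx <- function(m, r) {
  if (r == 0) return(matrix(0, nrow(m), ncol(m)))
  s <- svd(m)
  s$u[, 1:r, drop = FALSE] %*% diag(s$d[1:r], r, r) %*% t(s$v[, 1:r, drop = FALSE])
}

# Squared residual at each candidate rank.
resid_rank0 <- sum((f4_toy - low_rank_approx(f4_toy, 0))^2)
resid_rank1 <- sum((f4_toy - low_rank_approx(f4_toy, 1))^2)
```

Here the rank 0 approximation leaves a large residual while the rank 1 approximation fits almost exactly, which is the pattern that would point to rank 1 being sufficient for this toy matrix.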
qpwave(
data,
left,
right,
fudge = 1e-04,
boot = FALSE,
constrained = FALSE,
cpp = TRUE,
verbose = TRUE
)
data: The input data in the form of:
- A 3d array of blocked f2 statistics, output of f2_from_precomp or extract_f2
- A directory with f2 statistics
- The prefix of a genotype file
left: Left populations (sources)
right: Right populations (outgroups)
fudge: Value added to diagonal matrix elements before inverting
boot: If FALSE (the default), block-jackknife resampling will be used to compute standard errors. Otherwise, block-bootstrap resampling will be used to compute standard errors. If boot is an integer, that number will specify the number of bootstrap resamplings. If boot = TRUE, the number of bootstrap resamplings will be equal to the number of SNP blocks.
constrained: Constrain admixture weights to be non-negative
cpp: Use C++ functions. Setting this to FALSE will be slower but can help with debugging.
verbose: Print progress updates
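The difference between the two resampling schemes selected by boot can be sketched in base R. This is a simplified illustration with made-up per-block values and equal-sized blocks, and the statistic is a plain mean. It is not the package's implementation, and the variable names are hypothetical.

```r
# Hypothetical per-block estimates of one statistic (equal-sized blocks assumed).
block_est <- c(0.10, 0.12, 0.08, 0.11, 0.09)
n <- length(block_est)

# Block jackknife: re-estimate the statistic leaving out one block at a time.
loo <- sapply(seq_len(n), function(i) mean(block_est[-i]))
se_jack <- sqrt((n - 1) / n * sum((loo - mean(loo))^2))

# Block bootstrap: resample blocks with replacement and re-estimate each time.
# n_boot plays the role of an integer passed as boot.
set.seed(1)
n_boot <- 1000
boot_est <- replicate(n_boot, mean(sample(block_est, n, replace = TRUE)))
se_boot <- sd(boot_est)
```

For a simple mean, the delete-one jackknife standard error equals the usual standard error of the mean, which makes this toy case easy to check by hand.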
qpwave returns a list with up to two data frames describing the model fit:
f4: A data frame with estimated f4-statistics
rankdrop: A data frame describing model fits with different ranks, including p-values for the overall fit and for nested models (comparing two models with a rank difference of one). A model with L left populations and R right populations has an f4-matrix of dimensions (L-1) x (R-1). If no two left populations form a clade with respect to all right populations, this matrix will have full rank, min(L-1, R-1). rankdrop has the following columns:
f4rank: Tested rank
dof: Degrees of freedom of the chi-squared null distribution: (L-1-f4rank)*(R-1-f4rank)
chisq: Chi-squared statistic, obtained as E'QE, where E is the difference between estimated and fitted f4-statistics, and Q is the inverse of the f4-statistic covariance matrix.
p: p-value obtained from chisq as pchisq(chisq, df = dof, lower.tail = FALSE)
dofdiff: Difference in degrees of freedom between this model and the model with one less rank
chisqdiff: Difference in chi-squared statistics
p_nested: p-value testing whether the difference between two models of rank difference 1 is significant
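The degrees of freedom and p-values described above can be reproduced by hand in base R. The helper rank_dof is hypothetical and not part of the package. The chi-squared values in the nested comparison are made up, and computing p_nested from chisqdiff and dofdiff with pchisq is an assumption based on the column descriptions, not a confirmed detail of the implementation.

```r
# Degrees of freedom for a tested rank: (L-1-f4rank)*(R-1-f4rank).
rank_dof <- function(n_left, n_right, f4rank) {
  (n_left - 1 - f4rank) * (n_right - 1 - f4rank)
}

# Two left and four right populations, tested at rank 0.
dof <- rank_dof(2, 4, 0)

# Overall p-value for a chi-squared statistic of 11.9 (rounded, for illustration).
p <- pchisq(11.9, df = dof, lower.tail = FALSE)

# Nested comparison of rank 0 and rank 1 for three left and five right
# populations, using made-up chi-squared statistics.
dofdiff <- rank_dof(3, 5, 0) - rank_dof(3, 5, 1)
chisqdiff <- 20 - 4
# Assumption: the nested test is a chi-squared test on the two differences.
p_nested <- pchisq(chisqdiff, df = dofdiff, lower.tail = FALSE)
```

With two left and four right populations the f4-matrix has a single row, so rank 0 is the only rank below full rank and dof comes out as 3.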
Patterson, N. et al. (2012) Ancient admixture in human history. Genetics
Haak, W. et al. (2015) Massive migration from the steppe was a source for Indo-European languages in Europe. Nature (SI 10)
left = c('Altai_Neanderthal.DG', 'Vindija.DG')
right = c('Chimp.REF', 'Mbuti.DG', 'Russia_Ust_Ishim.DG', 'Switzerland_Bichon.SG')
qpwave(example_f2_blocks, left, right)
#> ℹ Computing f4 stats...
#> ℹ Computing number of admixture waves...
#>
#> $f4
#> # A tibble: 3 × 8
#> pop1 pop2 pop3 pop4 est se z p
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Altai_Neanderthal.DG Vindija.DG Chimp.REF Mbuti… 1.24e-4 1.35e-4 0.920 0.358
#> 2 Altai_Neanderthal.DG Vindija.DG Chimp.REF Russi… 4.45e-4 1.64e-4 2.72 0.00653
#> 3 Altai_Neanderthal.DG Vindija.DG Chimp.REF Switz… 4.22e-4 1.72e-4 2.45 0.0144
#>
#> $rankdrop
#> # A tibble: 1 × 7
#> f4rank dof chisq p dofdiff chisqdiff p_nested
#> <int> <int> <dbl> <dbl> <int> <dbl> <dbl>
#> 1 0 3 11.9 0.00768 NA NA NA
#>