qpwave compares two sets of populations (left and right) to each other. It estimates a lower bound on the number of admixture waves that went from left into right, by comparing a matrix of f4-statistics to low-rank approximations. For a rank of 0 this is equivalent to testing whether left and right form clades relative to each other.
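To make the idea of "low-rank approximations" concrete, here is a minimal base R sketch. It builds a toy f4-matrix with known rank 1 and inspects its singular values. The matrix entries and the names a, b and f4mat are made up for illustration. The sketch uses an unweighted SVD, whereas qpwave fits ranks while accounting for the covariance of the f4-statistics, so this is not the package's actual procedure.

```r
# Toy f4-matrix: rows stand for left populations (relative to the first
# left population), columns for right populations (relative to the first
# right population). Built as an outer product, so it has rank 1.
a <- c(1, 2, -1)             # hypothetical loadings for 3 rows
b <- c(0.5, -0.2, 0.1, 0.3)  # hypothetical loadings for 4 columns
f4mat <- outer(a, b) * 1e-3

# A rank-k matrix has exactly k non-zero singular values
s <- svd(f4mat)
numeric_rank <- sum(s$d > max(s$d) * 1e-8)

# The best rank-0 approximation is the all-zero matrix; the best rank-1
# approximation (in the unweighted least-squares sense) keeps only the
# leading singular triplet.
rank1 <- s$d[1] * tcrossprod(s$u[, 1], s$v[, 1])
resid_rank0 <- sum(f4mat^2)
resid_rank1 <- sum((f4mat - rank1)^2)
```

Here the rank-0 residual is clearly non-zero while the rank-1 residual vanishes up to floating-point error, which is the pattern one would expect when a single wave of admixture relates left to right.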

qpwave(
  data,
  left,
  right,
  fudge = 1e-04,
  boot = FALSE,
  constrained = FALSE,
  cpp = TRUE,
  verbose = TRUE
)

Arguments

data

The input data in the form of:

  • A 3d array of blocked f2 statistics, output of f2_from_precomp or extract_f2

  • A directory with f2 statistics

  • The prefix of a genotype file

left

Left populations (sources)

right

Right populations (outgroups)

fudge

Value added to diagonal matrix elements before inverting

boot

If FALSE (the default), standard errors are computed by block-jackknife resampling. Otherwise, block-bootstrap resampling is used. If boot is an integer, it sets the number of bootstrap resamplings. If boot = TRUE, the number of bootstrap resamplings equals the number of SNP blocks.

constrained

Constrain admixture weights to be non-negative

cpp

Use C++ functions. Setting this to FALSE will be slower but can help with debugging.

verbose

Print progress updates
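The two resampling schemes selected by boot can be sketched in base R for a single statistic. The per-block values below are made up, and the sketch assumes equal-sized blocks (so every block has the same weight), which real SNP blocks generally are not. block_bootstrap_se is a hypothetical helper, not a function of this package.

```r
# Hypothetical per-block estimates of one f4-statistic
block_est <- c(1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.0, 1.3, 0.6) * 1e-4
n <- length(block_est)

# Block jackknife: recompute the estimate with each block left out in turn
loo <- sapply(seq_len(n), function(i) mean(block_est[-i]))
se_jack <- sqrt((n - 1) / n * sum((loo - mean(loo))^2))

# Block bootstrap: resample whole blocks with replacement and take the
# standard deviation of the resampled estimates
block_bootstrap_se <- function(x, nboot = length(x)) {
  reps <- replicate(nboot, mean(sample(x, length(x), replace = TRUE)))
  sd(reps)
}
set.seed(1)
se_boot <- block_bootstrap_se(block_est, nboot = 1000)
```

With equal block weights the jackknife standard error of a mean reduces to sd(block_est) / sqrt(n). The bootstrap estimate should be of similar size, and it varies with the random seed and the number of resamplings.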

Value

qpwave returns a list with up to two data frames describing the model fit:

  1. f4: A data frame with estimated f4-statistics

  2. rankdrop: A data frame describing model fits with different ranks, including p-values for the overall fit and for nested models (comparing two models with a rank difference of one). A model with L left populations and R right populations has an f4-matrix of dimensions (L-1) x (R-1). If no two left populations form a clade with respect to all right populations, this matrix will have full rank, min(L-1, R-1).

    • f4rank: Tested rank

    • dof: Degrees of freedom of the chi-squared null distribution: (L-1-f4rank)*(R-1-f4rank)

    • chisq: Chi-squared statistic, obtained as E'QE, where E is the difference between estimated and fitted f4-statistics, and Q is the inverse of the f4-statistic covariance matrix.

    • p: p-value obtained from chisq as pchisq(chisq, df = dof, lower.tail = FALSE)

    • dofdiff: Difference in degrees of freedom between this model and the model with one less rank

    • chisqdiff: Difference in chi-squared statistics

    • p_nested: p-value testing whether the difference between two models of rank difference 1 is significant
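The dof and p columns can be reproduced by hand from the formulas above. The following base R sketch uses the dimensions of the example on this page (2 left and 4 right populations, rank 0) and the rounded chisq value printed there, so the p-value only approximately matches the printed one. nested_p is a hypothetical helper that illustrates the nested comparison, and the values passed to it are invented.

```r
# Dimensions from the example on this page
L <- 2; R <- 4; f4rank <- 0

# Degrees of freedom of the chi-squared null distribution
dof <- (L - 1 - f4rank) * (R - 1 - f4rank)

# Overall p-value, using the rounded chisq from the example output
chisq <- 11.9
p <- pchisq(chisq, df = dof, lower.tail = FALSE)

# Nested test between two fits whose ranks differ by one: the difference
# in chi-squared statistics is compared to a chi-squared distribution
# with dofdiff degrees of freedom
nested_p <- function(chisq_lo, dof_lo, chisq_hi, dof_hi) {
  pchisq(chisq_lo - chisq_hi, df = dof_lo - dof_hi, lower.tail = FALSE)
}

# Invented example with L = 3, R = 5: rank 0 has dof 2 * 4 = 8,
# rank 1 has dof 1 * 3 = 3
p_nested_example <- nested_p(chisq_lo = 25.0, dof_lo = 8,
                             chisq_hi = 2.1, dof_hi = 3)
```

A small p for a given rank means that rank fits poorly. A small nested p-value means the model with one more rank fits significantly better.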

References

Patterson, N. et al. (2012) Ancient admixture in human history. Genetics

Haak, W. et al. (2015) Massive migration from the steppe was a source for Indo-European languages in Europe. Nature (SI 10)

Examples

left = c('Altai_Neanderthal.DG', 'Vindija.DG')
right = c('Chimp.REF', 'Mbuti.DG', 'Russia_Ust_Ishim.DG', 'Switzerland_Bichon.SG')
qpwave(example_f2_blocks, left, right)
#> ℹ Computing f4 stats...
#> ℹ Computing number of admixture waves...
#> 
#> $f4
#> # A tibble: 3 × 8
#>   pop1                 pop2       pop3      pop4       est      se     z       p
#>   <chr>                <chr>      <chr>     <chr>    <dbl>   <dbl> <dbl>   <dbl>
#> 1 Altai_Neanderthal.DG Vindija.DG Chimp.REF Mbuti… 1.24e-4 1.35e-4 0.920 0.358  
#> 2 Altai_Neanderthal.DG Vindija.DG Chimp.REF Russi… 4.45e-4 1.64e-4 2.72  0.00653
#> 3 Altai_Neanderthal.DG Vindija.DG Chimp.REF Switz… 4.22e-4 1.72e-4 2.45  0.0144 
#> 
#> $rankdrop
#> # A tibble: 1 × 7
#>   f4rank   dof chisq       p dofdiff chisqdiff p_nested
#>    <int> <int> <dbl>   <dbl>   <int>     <dbl>    <dbl>
#> 1      0     3  11.9 0.00768      NA        NA       NA
#>