Read allele frequencies from PLINK files

plink_to_afs(
  pref,
  inds = NULL,
  pops = NULL,
  adjust_pseudohaploid = TRUE,
  first = 1,
  last = NULL,
  numblocks = 1,
  poly_only = FALSE,
  verbose = TRUE
)

Arguments

pref

Prefix of the PLINK files (the files must end in .bed, .bim, and .fam).

inds

Individuals from which to compute allele frequencies.

pops

Populations from which to compute allele frequencies. If NULL (default), populations will be extracted from the .fam file (the family ID in the first column). If population labels are provided, they should have the same length as inds, and will be matched to them by position.
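As a minimal sketch of what matching by position means, the following uses made-up sample IDs and population labels (they are assumptions for illustration and do not come from any real data set). Element i of pops is taken to be the population of element i of inds.

```r
# Hypothetical sample IDs and labels; pops[i] is the population of inds[i].
inds <- c("I001", "I002", "I003", "I004")
pops <- c("PopA", "PopA", "PopB", "PopB")

# Positional matching only makes sense if both vectors have the same length.
stopifnot(length(inds) == length(pops))

# Group the individuals by their population label.
groups <- split(inds, pops)
groups$PopB
```

Vectors prepared like this could then be passed as the inds and pops arguments.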

adjust_pseudohaploid

Genotypes of pseudohaploid samples are usually coded as 0 or 2, even though only one allele is observed. adjust_pseudohaploid ensures that the observed allele count increases only by 1 for each pseudohaploid sample, which leads to slightly more accurate estimates of f-statistics. If TRUE (default), samples that don't have any genotypes coded as 1 among the first 1000 SNPs are automatically identified as pseudohaploid. Setting adjust_pseudohaploid to an integer n will check the first n SNPs instead of the first 1000. Setting this parameter to FALSE is equivalent to the ADMIXTOOLS inbreed: NO option.
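The idea can be sketched in base R on a toy genotype matrix. This is only an illustration under stated assumptions, not the package's implementation: the helper is_pseudohaploid, the toy data, and the choice of which allele is counted are all hypothetical.

```r
# Toy genotypes: rows are SNPs, columns are samples, entries are 0/1/2 or NA.
geno <- cbind(
  dip = c(0, 1, 2, 1, NA, 2),   # diploid sample with heterozygous calls
  ph  = c(0, 2, 2, 0, 2, NA)    # pseudohaploid sample, coded 0 or 2 only
)

# Hypothetical helper: a sample without any genotype coded as 1 among the
# first n SNPs is treated as pseudohaploid.
is_pseudohaploid <- function(geno, n = 1000) {
  head_rows <- geno[seq_len(min(n, nrow(geno))), , drop = FALSE]
  colSums(head_rows == 1, na.rm = TRUE) == 0
}

ph <- is_pseudohaploid(geno)
ploidy <- ifelse(ph, 1, 2)

# Counted alleles per SNP: a pseudohaploid genotype of 2 contributes 1, not 2.
alt_count <- rowSums(sweep(geno, 2, ploidy / 2, `*`), na.rm = TRUE)

# Total observed alleles per SNP: 1 per pseudohaploid call, 2 per diploid call.
tot_count <- rowSums(sweep(!is.na(geno), 2, ploidy, `*`))

af <- alt_count / tot_count
```

Without the adjustment, every non-missing call would contribute two alleles to tot_count, so pseudohaploid samples would be weighted as if two alleles had been observed.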

numblocks

Number of blocks in which to read the genotype file. Setting this to a number greater than one is more memory efficient, but slower.
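The memory and speed trade-off can be illustrated with a small base R sketch. It assumes that blocks are contiguous ranges of SNPs, and it uses an in-memory toy matrix with a hypothetical read_block function standing in for reading part of the .bed file. The actual block layout used by the package may differ.

```r
# Toy genotype matrix: 10 SNPs (rows) by 4 samples (columns).
geno <- matrix(rep(c(0, 1, 2, 2, 1), 8), nrow = 10)
nsnp <- nrow(geno)
numblocks <- 3

# Assign each SNP index to one of numblocks contiguous blocks.
block_id <- ceiling(seq_len(nsnp) * numblocks / nsnp)
blocks <- split(seq_len(nsnp), block_id)

# Hypothetical stand-in for reading only the given SNP rows from disk.
read_block <- function(rows) geno[rows, , drop = FALSE]

# Only one block is held in memory at a time; per-block results are combined.
af_blocks <- lapply(blocks, function(rows) rowMeans(read_block(rows)) / 2)
af <- unlist(af_blocks, use.names = FALSE)

# The blockwise result matches the result of processing everything at once.
stopifnot(isTRUE(all.equal(af, rowMeans(geno) / 2)))
```

Each block costs an extra read, which is why more blocks tend to be slower while needing less memory at any one time.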

poly_only

Only keep SNPs whose mean allele frequency is not equal to 0 or 1 (default FALSE).
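A minimal sketch of this kind of filter on a toy allele frequency matrix follows. It assumes that the mean is taken across populations and that missing values are ignored; both are assumptions for illustration, and the package may handle these details differently.

```r
# Toy allele frequencies: rows are SNPs, columns are populations.
afs <- rbind(
  c(0,    0),    # fixed at 0 in all populations
  c(0.5,  1),    # polymorphic
  c(1,    1),    # fixed at 1 in all populations
  c(0.25, NA)    # polymorphic, missing in one population
)

# Mean allele frequency per SNP, ignoring missing values.
mean_af <- rowMeans(afs, na.rm = TRUE)

# Keep only SNPs whose mean allele frequency is strictly between 0 and 1.
keep <- mean_af > 0 & mean_af < 1
afs_poly <- afs[keep, , drop = FALSE]
```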

verbose

Print progress updates.

Value

A list with three items: allele frequency matrix, allele count matrix, and SNP metadata.

Examples

if (FALSE) {
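# inds and pops are vectors of equal length; pops[i] is the population of inds[i]
afdat = plink_to_afs(prefix, inds = inds, pops = pops)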
afs = afdat$afs
counts = afdat$counts
}