This is a wrapper around msprime_genome that lets the user create a random admixture graph and simulate it in msprime v1.x.
random_sim(
nleaf,
nadmix,
outpref = "random_sim",
max_depth = NULL,
ind_per_pop = 1,
mutation_rate = 1.25e-08,
admix_weights = 0.5,
neff = 1000,
time = 1000,
fix_leaf = FALSE,
outpop = NULL,
nchr = 1,
recomb_rate = 2e-08,
seq_length = 1000,
ghost_lineages = TRUE,
run = FALSE
)
nleaf - The number of leaf nodes
nadmix - The number of admixture events
outpref - A prefix for the output files
max_depth - A constraint specifying the maximum time depth of the admixture graph (in generations)
ind_per_pop - The number of individuals to simulate for each population. If a scalar value, it will be constant across all populations. Alternatively, it can be a named vector with a different value for each population.
mutation_rate - Mutation rate per base pair per generation. The default is 1.25e-8.
admix_weights - Admixture weights. If a float value (0 < value < 1), the admixture weights for each admixture event will be (value, 1-value). Alternatively, it can be a range, e.g. c(0.1, 0.4), specifying the lower and upper limits of a uniform distribution from which the admixture weight value will be drawn. By default, all admixture edges have a weight of 0.5.
neff - Effective population size (in diploid individuals). If a scalar value, it will be constant across all populations. Alternatively, it can be a range, e.g. c(500, 1000), specifying the lower and upper limits of a uniform distribution from which values will be drawn.
time - Time between nodes. Either a scalar value (1000 by default), in which case the dates are generated by pseudo_dates, or a range, e.g. c(500, 1000), specifying the lower and upper limits of a uniform distribution from which values will be drawn (see random_dates).
fix_leaf - A boolean specifying whether the dates of the leaf nodes will be fixed at time 0. If TRUE, all samples will be drawn at the end of the simulation (i.e., from “today”).
outpop - The name of the (optional) outgroup population.
nchr - The number of chromosomes to simulate
recomb_rate - A float value specifying the recombination rate along the chromosomes, per base pair per generation. The default is 2e-8.
seq_length - The sequence length of the chromosomes. If a scalar value, the sequence length will be constant across all chromosomes. Alternatively, it can be a vector with a length equal to the number of chromosomes, e.g. c(100, 50) to simulate 2 chromosomes with lengths of 100 and 50 base pairs.
ghost_lineages - A boolean value specifying whether ghost lineages will be allowed. If TRUE (the default according to the function signature), admixture happens at the time points defined by the y-axis that plot_graph generates when plotting the graph. If FALSE, admixture occurs at the time of the previous split event.
run - If FALSE, the function will terminate after writing the msprime script. If TRUE, it will try to execute the script with the default python installation. To use a different python installation, set run to its path, e.g. run = "/my/python".
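Several arguments above accept either a single value or a range (admix_weights, neff, time), and seq_length accepts either a single value or one value per chromosome. The following base R sketch illustrates those conventions only. The helpers draw_value, draw_admix_weights and expand_seq_length are hypothetical names introduced here for illustration; they are not part of the package, and the package's internal handling may differ.

```r
# Hypothetical helpers (not part of the package) that mimic the
# "scalar or range" and "scalar or vector" argument conventions.

# Scalar: reuse the value for every draw.
# Length-2 range: draw uniformly between the lower and upper limit.
draw_value <- function(x, n = 1) {
  if (length(x) == 1) return(rep(x, n))
  stopifnot(length(x) == 2, x[1] <= x[2])
  runif(n, min = x[1], max = x[2])
}

# Admixture weights for a single admixture event: (value, 1 - value).
draw_admix_weights <- function(admix_weights = 0.5) {
  w <- draw_value(admix_weights)
  stopifnot(w > 0, w < 1)
  c(w, 1 - w)
}

# Sequence lengths: a scalar is reused for every chromosome,
# otherwise exactly one length per chromosome is expected.
expand_seq_length <- function(seq_length, nchr) {
  if (length(seq_length) == 1) return(rep(seq_length, nchr))
  stopifnot(length(seq_length) == nchr)
  seq_length
}

set.seed(1)
fixed_w    <- draw_admix_weights(0.5)          # fixed weight
ranged_w   <- draw_admix_weights(c(0.1, 0.4))  # first weight drawn from [0.1, 0.4]
neff_draws <- draw_value(c(500, 1000), n = 3)  # three population sizes from a range
lens       <- expand_seq_length(c(100, 50), nchr = 2)
```

With a range, each call draws a new value, so setting a seed beforehand (as with set.seed above) is what makes a sketch like this reproducible.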
A list containing the path of the simulation script, a data frame of graph edges, the node dates, and the effective population sizes:
out - A file name and path of the simulation script
edges - An edge dataframe with admixture weights
dates - A named vector with a date for each node
neffs - A named vector with an effective population size for each node
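To show how the elements of the returned list can be accessed, here is a sketch built on a mock list with the structure documented above. All values, the script path and the column names of the edges data frame are made up for illustration; a real call to random_sim may return different columns and names.

```r
# Mock return value with the documented element names (values are invented).
out <- list(
  out   = "random_sim.py",  # hypothetical script path
  edges = data.frame(from = c("R", "R"), to = c("A", "B"),
                     weight = c(NA_real_, NA_real_)),  # hypothetical columns
  dates = c(R = 1000, A = 0, B = 0),
  neffs = c(R = 1000, A = 1000, B = 1000)
)

element_names <- names(out)        # "out" "edges" "dates" "neffs"
root_date     <- out$dates[["R"]]  # date of the node named "R"

# On lists, `$` matches element names partially, so `out$neff`
# resolves to the `neffs` element when no other name starts with "neff".
same_element <- identical(out$neff, out$neffs)
```

Writing the full name (out$neffs) avoids relying on partial matching.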
# Create a simulation script for 2 chromosomes, each 50 bases long,
# with a maximum tree depth of 5000 generations, and plot the resulting graph
if (FALSE) {
out = random_sim(nleaf=4, nadmix=0, max_depth=5000, nchr=2, seq_length=50)
plot_graph(out$edges, dates = out$dates, neff = out$neff, hide_weights = TRUE)
}