This is basically a wrapper function around the msprime_genome that allows user to create a random graph and simulate it in msprime v1.x.
random_sim(
nleaf,
nadmix,
outpref = "random_sim",
max_depth = NULL,
ind_per_pop = 1,
mutation_rate = 1.25e-08,
admix_weights = 0.5,
neff = 1000,
time = 1000,
fix_leaf = FALSE,
outpop = NULL,
nchr = 1,
recomb_rate = 2e-08,
seq_length = 1000,
ghost_lineages = TRUE,
run = FALSE
)The number of leaf nodes
The number of admixture events
A prefix of output files
A constraint specifying the maximum time depth of the admixture graph (in generations)
The number of individuals to simulate for each population. If a scalar value, it will be constant across all populations. Alternatively, it can be a named vector with a different value for each population.
Mutation rate per site per generation. The default is 1.25e-8 per base pair per generation.
Admixture weights. If a float value (0 < value < 1), admixture weights for each admixture event will be (value, 1-value).
Alternatively, it can be a range, i.e., c(0.1, 0.4) specifying lower and upper limits of a uniform distribution from which the admixture weight value will be drawn. By default, all admixture edges have a weight of 0.5.
Effective population size (in diploid individuals). If a scalar value, it will be constant across all populations. Alternatively, it can be a range, i.e., c(500, 1000) specifying lower and upper limits of an uniform distribution from which values will be drawn
Time between nodes. Either a scalar value (1000 by default) with the dates generated by pseudo_dates, or a range, i.e., c(500, 1000) specifying lower and upper limits of a uniform distribution from which values will be drawn (see random_dates)
A boolean specifying if the dates of the leaf nodes will be fixed at time 0. If TRUE, all samples will be drawn at the end of the simulation (i.e., from “today”).
A name of the (optional) outgroup population.
The number of chromosomes to simulate
A float value specifying recombination rate along the chromosomes. The default is 2e-8 per base pair per generation.
The sequence length of the chromosomes. If it is a scalar value, the sequence length will be constant for all chromosomes.
Alternatively, it can be a vector with a length equal to number of chromosomes (i.e., c(100,50) to simulate 2 chromosomes with the lengths of 100 and 50 base pairs).
A boolean value specifying whether ghost lineages will be allowed.
If TRUE, admixture happens at the time points defined by the y-axis generated while plotting the graph by plot_graph
If FALSE (default), admixture occurs at the time of the previous split event
If FALSE, the function will terminate after writing the msprime script. If TRUE, it will try and execute the function with the default python installation.
If you want to use some other python installation, you can set run = /my/python.
A list with the path of simulation script, a data frame of graph edges, dates and population sizes:
out - A file name and path of the simulation script
edges - An edge dataframe with admixture weights
dates - A named vector with a date for each node
neffs - A named vector with an effective population size for each node
# Create simulation script that simulates 2 chromosomes that are 50base long
# where maximum depth of the tree is 5000 generations, and plot the output graph
if (FALSE) {
out = random_sim(nleaf=4, nadmix=0, max_depth=5000, nchr=2, seq_length=50)
plot_graph(out$edges, dates = out$dates, neff = out$neff, hide_weights = TRUE)
}