Quick Start
We assume that msms is on your path, so every command line below
starts with msms. If that is not the case, you may need to prefix the
command line with the appropriate java command and options. We further
assume no familiarity with ms. This page is only meant to get you
started; please refer to the manual for more complete documentation, or
to the cheat sheet, which is also available in a plain html version
here.
We will start with a simple example and then explain each option
in turn. Here, we consider a single diploid population.
msms -N 10000 -ms 10 1000 -t 1
This command tells msms to use an effective population size of
10000 with the -N option. This option is unique to msms and is
important even when not considering selection. For now, it's important
to use a large number. While selection is not included, this parameter
does not affect run times in any way.
The -ms 10 1000 option takes the same values as the first two
arguments to ms: the first is the number of samples, the second is the
number of replicates. After this option, all the normal options of ms
can be used, and they have the same meanings as in ms.
The last option, -t 1, specifies the theta parameter. We typically
assume a diploid population, so theta is 4*N*mu, where mu is the
mutation rate. All parameters are scaled with N in some way.
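As a rough illustration of this scaling, the sketch below computes theta
from N and a mutation rate and assembles the command shown above. The
mutation rate value and the helper name scaled_theta are assumptions
made for this example; the rate is taken to be per generation for the
whole locus, following the ms convention.

```python
def scaled_theta(n_e, mu):
    """Return theta = 4 * N * mu for a diploid population."""
    return 4 * n_e * mu


N = 10000      # effective population size, passed to -N
MU = 2.5e-5    # hypothetical mutation rate per locus per generation

theta = scaled_theta(N, MU)

# Assemble the same command line as above. This only builds the string;
# it does not run msms.
command = f"msms -N {N} -ms 10 1000 -t {theta:g}"
print(command)
```

Doubling N while keeping the mutation rate fixed doubles theta, which is
why the value given to -t has to be chosen together with -N.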
Adding Selection
We now consider a selective sweep. Here we have a time-invariant
model; that is, the parameters and structure of the demographics do not
change over time.
msms -N 10000 -ms 10 1000 -t 1 -SAA 100 -SaA 50 -SF 0
The -SAA option sets the selection strength for homozygotes and the
-SaA option sets the strength for heterozygotes. Note that because we
use the -SF option, the heterozygote selection strength must be lower
than the homozygote strength; otherwise the forward simulation will
never go to fixation and will simply run forever. The selection strength
is again scaled by N, as 4Ns. The -SF 0 option tells the selection
simulation to run until fixation and places that fixation 0 time units
into the past. Time is measured in units of 4N generations.
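To make the scaling concrete, the sketch below converts between unscaled
quantities and the scaled values given on the command line, taking the
4Ns and 4N-generation conventions stated above at face value. The
function names and the example inputs are assumptions for illustration
only and are not part of msms.

```python
def scaled_selection(n_e, s):
    """Selection strength scaled as 4 * N * s (scaling as stated in the text)."""
    return 4 * n_e * s


def unscaled_selection(n_e, strength):
    """Invert the scaling: recover s from a scaled strength."""
    return strength / (4 * n_e)


def generations_to_time(n_e, generations):
    """Convert a number of generations into time units of 4N generations."""
    return generations / (4 * n_e)


N = 10000

# The -SAA 100 and -SaA 50 values from the example, as unscaled coefficients.
s_hom = unscaled_selection(N, 100)   # 100 / 40000
s_het = unscaled_selection(N, 50)    # 50 / 40000

# A hypothetical event 2000 generations ago, in units of 4N generations.
t = generations_to_time(N, 2000)     # 2000 / 40000

print(s_hom, s_het, t)
```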
It is important to note that the -SF option requires that the
forward simulation will go to fixation and that the model is time
invariant. The program tries to detect when these conditions are not
met, but detection cannot be guaranteed.
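Because a run that never fixes may not always be detected, it can help
to check the selection strengths before launching a simulation with
-SF. The following is a minimal sketch of such a check, based only on
the rule stated above; check_sweep_strengths is a hypothetical helper
and not part of msms.

```python
def check_sweep_strengths(saa, sa_a):
    """Raise ValueError unless the heterozygote strength is below the homozygote one.

    saa:  value intended for -SAA (homozygote strength)
    sa_a: value intended for -SaA (heterozygote strength)
    """
    if sa_a >= saa:
        raise ValueError(
            f"heterozygote strength {sa_a} must be lower than "
            f"homozygote strength {saa} when using -SF"
        )


# The values from the example pass the check silently.
check_sweep_strengths(100, 50)

# A stronger heterozygote is rejected before msms is ever started.
try:
    check_sweep_strengths(100, 150)
except ValueError as err:
    print("refusing to run:", err)
```

This only covers the strength rule; it does not verify that the model is
time invariant.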
Recombination
msms supports two models of recombination: an infinite sites model,
and a finite sites model in which you define the number of sites.
Regardless of which recombination model is used, the mutation model is
by default always an infinite sites model.
msms -N 10000 -ms 10 1000 -t 1 -r 10 1000
In this example we use the finite sites model of recombination by
giving the -r option two arguments. The first is the recombination rate
scaled with 4N. The second is the number of "sites" per unit of
sequence. To use infinite sites recombination, simply omit the second
argument.
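Both forms of -r can be generated from a single helper. This is a sketch
that follows the description above; recombination_args is a hypothetical
name, and the per-generation recombination rate used to derive the
scaled rate is an illustrative value.

```python
def recombination_args(rho, sites=None):
    """Build the -r arguments: finite sites if `sites` is given, else infinite sites."""
    args = ["-r", f"{rho:g}"]
    if sites is not None:
        args.append(str(sites))
    return args


N = 10000
R = 2.5e-4         # hypothetical recombination rate per generation
rho = 4 * N * R    # recombination rate scaled with 4N

base = ["msms", "-N", str(N), "-ms", "10", "1000", "-t", "1"]

# Finite sites model: scaled rate plus the number of sites.
finite = " ".join(base + recombination_args(rho, sites=1000))

# Infinite sites model: the second argument is left out.
infinite = " ".join(base + recombination_args(rho))

print(finite)
print(infinite)
```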