Software - msms

MSMS is a coalescent simulator that follows Hudson's ms in usage and adds selection. It is fast, often faster than ms, and portable, running on Mac OS X, Windows and Linux. With this tool, one can study patterns of selection in complicated demographic scenarios.

The easiest way to get started is to download one of the full prepackaged installers and try it. The installer includes the manual, with examples to help you get started. The program is generally command-line compatible with ms, so if you know ms you will find msms easy to use.

For those who are not familiar with ms, note that msms is a command-line-only program. There is no GUI, and you cannot use a mouse to set up simulations. The command line may look intimidating, but in practice it is quite easy to build up very complicated models if need be. The trick is to build the model up one step at a time, checking the output after each addition.

The program is written in the Java programming language, and the source is freely available under a GPL 3 license from GitHub. It was written to be easy to extend, so there is a significant amount of developer documentation as well as the normal user documentation. One can also make use of Java's dynamic loading abilities, so extensions or modules need not be compiled into the core version.

How it works

When simulating a normal coalescent without selection, this program works much like any other coalescent simulation tool. It keeps track of lineages and goes back in time, generating coalescent, migration and recombination events according to the appropriate probability distributions.
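To make the backward-in-time idea concrete, here is a minimal sketch of the simplest case only: one deme, no migration and no recombination. It is not taken from the msms source; the class and method names are hypothetical, and time is measured in units where a single pair of lineages coalesces at rate 1.

```java
import java.util.Random;

/** Illustrative only: a neutral, single-deme coalescent without recombination. */
class CoalescentSketch {

    /**
     * Returns the waiting times between successive coalescent events
     * for a sample of n lineages, going backward in time.
     */
    static double[] coalescentWaitingTimes(int n, Random rng) {
        if (n < 2) {
            throw new IllegalArgumentException("need at least two lineages");
        }
        double[] times = new double[n - 1];
        int k = n; // number of lineages currently tracked
        for (int i = 0; i < times.length; i++) {
            // Number of pairs that could coalesce; this is the total event rate.
            double rate = k * (k - 1) / 2.0;
            // Inverse-transform sample from an exponential distribution.
            times[i] = -Math.log(1.0 - rng.nextDouble()) / rate;
            k--; // one coalescence merges two lineages into one
        }
        return times;
    }

    public static void main(String[] args) {
        double[] t = coalescentWaitingTimes(10, new Random(42));
        double tmrca = 0.0;
        for (double x : t) {
            tmrca += x;
        }
        System.out.println("events: " + t.length + ", time to MRCA: " + tmrca);
    }
}
```

A full simulator would draw the next event from competing coalescent, migration and recombination rates in the same way, and would also record which lineages merge.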

However, when selection is specified, a forward simulation is run first to generate a frequency history of the beneficial allele. This step assumes Hardy-Weinberg equilibrium (HWE) in the diploid case, and includes genetic drift. The result is a frequency history, in each deme, of all considered alleles. When the simulation then performs the normal coalescent, the probabilities of the different events are conditioned on these allele frequencies. This can be somewhat tricky because the frequency varies stochastically over time. In many cases, however, performance close to that of a plain coalescent simulation is possible.
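The forward step can be pictured with a textbook Wright-Fisher model for one deme and two alleles. The sketch below is an assumption about the general technique, not the msms implementation: the class name, parameter names and the fitness scheme (1 + sAA, 1 + saA and 1 for the three genotypes) are chosen for illustration. Selection is applied to genotype frequencies in Hardy-Weinberg proportions, and drift is added by binomial sampling of 2N allele copies.

```java
import java.util.Random;

/** Illustrative only: forward Wright-Fisher trajectory with selection and drift. */
class FrequencyTrajectorySketch {

    /**
     * Simulates the frequency of allele A in a diploid population of size n.
     * Selection coefficients must be greater than -1.
     * Returns generations + 1 values, starting with p0.
     */
    static double[] simulate(int n, double p0, double sAA, double saA,
                             int generations, Random rng) {
        double[] freq = new double[generations + 1];
        freq[0] = p0;
        int copies = 2 * n; // allele copies in a diploid population
        for (int g = 1; g <= generations; g++) {
            double p = freq[g - 1];
            double q = 1.0 - p;
            double wAA = 1.0 + sAA;
            double wAa = 1.0 + saA;
            // Mean fitness under Hardy-Weinberg genotype proportions.
            double meanW = p * p * wAA + 2.0 * p * q * wAa + q * q;
            // Expected frequency of A after selection.
            double pSel = (p * p * wAA + p * q * wAa) / meanW;
            // Drift: binomial sample of the next generation's allele copies.
            int count = 0;
            for (int i = 0; i < copies; i++) {
                if (rng.nextDouble() < pSel) {
                    count++;
                }
            }
            freq[g] = (double) count / copies;
        }
        return freq;
    }

    public static void main(String[] args) {
        double[] f = simulate(1000, 0.05, 0.02, 0.01, 200, new Random(7));
        System.out.println("final frequency: " + f[f.length - 1]);
    }
}
```

The coalescent phase would then read such a trajectory backward in time, using the frequency at each point to decide how many lineages can coalesce within each allelic class.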

Citing

Please use the following reference when citing msms:

Ewing G. and Hermisson J. (2010). MSMS: A coalescent simulation program including recombination, demographic structure, and selection at a single locus. Bioinformatics 26: 2064-2065. (doi: 10.1093/bioinformatics/btq322)

Contributing

One of the design goals of msms was to have a code base that can be easily extended. Contributing means getting familiar with the code, and a good place to start is the developer documentation, which gives more details about how the code works and how it fits together.

The primary source management tool we use is git, a distributed version control system. It has a number of features that permit easy forking and merging of a project like this. For example, if you want to develop an extension, you probably want to work on it, get some results and then publish some kind of application note. With git you can easily keep up to date with the core, and once you have the publication ready, you only need to let me know. I can then review the changes and merge the patch, incorporating the extension into the core program.

Alternatively, you can use Java's late linking behavior to create an extension that does not need to be included in the core at all; it is loaded simply by specifying some special command line options.
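As a rough illustration of what late linking means in Java, the sketch below loads a class by name at run time using standard reflection. The interface SimulationExtension and both classes are hypothetical and exist only for this example; the actual msms extension interfaces and the command line options that select them are described in the developer documentation and may look quite different.

```java
/** Hypothetical extension point, defined here only for illustration. */
interface SimulationExtension {
    String describe();
}

/** A trivial extension that is never referenced directly by the loader. */
class ExampleExtension implements SimulationExtension {
    public String describe() {
        return "example extension";
    }
}

/** Illustrative only: resolve and instantiate an extension from its class name. */
class ExtensionLoaderSketch {

    static SimulationExtension load(String className)
            throws ReflectiveOperationException {
        // The class is looked up on the classpath at run time, not compile time.
        Class<?> raw = Class.forName(className);
        // Fails with ClassCastException if the class is not an extension.
        Class<? extends SimulationExtension> type =
                raw.asSubclass(SimulationExtension.class);
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // In a real tool the name would come from a command line option.
        String name = args.length > 0 ? args[0] : ExampleExtension.class.getName();
        System.out.println(load(name).describe());
    }
}
```

Because the loader only sees a string, an extension compiled separately and placed on the classpath can be used without rebuilding the core program.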

GitHub

You can find further information on msms on the GitHub project page and its issues page.