MUMmer
MUMmer is a system for rapidly aligning DNA and protein sequences. The nucmer
aligner in the current version (release 4.x) can align two mammalian genomes
in about 3 hours on a typical workstation with 32+ cores and 64+ GB of RAM; smaller
genomes such as bacteria or small eukaryotes are aligned in seconds or
minutes. The promer utility generates alignments based upon the six-frame
translations of both input sequences. promer permits the alignment of genomes
for which the proteins are similar but the DNA sequence is too divergent to
detect similarity. See the nucmer and promer readme files in the “docs/”
subdirectory for more details. MUMmer is open source, and we ask that you
cite our most recent paper in any publications that use this system:
(The latest Version 4.x citation)
Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: A fast and versatile genome alignment system. PLoS computational biology. 2018 Jan 26;14(1):e1005944.
(Version 3.x citation)
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. Versatile and open software for comparing large genomes. Genome biology. 2004 Jan 1;5(2):R12.
(Version 2.1 citation)
Delcher AL, Phillippy A, Carlton J, Salzberg SL. Fast algorithms for large-scale genome alignment and comparison. Nucleic acids research. 2002 Jun 1;30(11):2478-83.
(Version 1.0 citation)
Delcher AL, Kasif S, Fleischmann RD, Peterson J, White O, Salzberg SL. Alignment of whole genomes. Nucleic acids research. 1999 Jan 1;27(11):2369-76.
Note
This release (4.x) is a significant departure from MUMmer version 3.
The version 3 executable is at /local/cluster/MUMmer3.23/
Location and version
$ which mummer
/local/cluster/mummer/bin/mummer
$ mummer --version
4.0.0rc1
Help message
$ mummer --help
mummer: unrecognized option '--help'
Invalid parameters.
Usage: mummer [options] <reference-file> <query file1> . . . [query file32]
Implemented MUMmer v3 options:
-mum compute maximal matches that are unique in both sequences
-mumreference compute maximal matches that are unique in the reference-
sequence but not necessarily in the query-sequence (default)
-mumcand same as -mumreference
-maxmatch compute all maximal matches regardless of their uniqueness
-l set the minimum length of a match
if not set, the default value is 20
-b compute forward and reverse complement matches
-F force 4 column output format regardless of the number of
reference sequence inputs
-n match only the characters a, c, g, or t
-L print length of query sequence in header of matches
-r compute only reverse complement matches
-s print first 53 characters of the matching substring
-c Report the query position of a reverse complement match relative to the forward strand of the querysequence
Additional options:
-k sampled suffix positions (one by default)
-threads number of threads to use for -maxmatch, only valid k > 1
-qthreads number of threads to use for queries
-suflink use suffix links (1=yes or 0=no) in the index and during search [auto]
-child use child table (1=yes or 0=no) in the index and during search [auto]
-skip sparsify the MEM-finding algorithm even more, performing jumps of skip*k [auto (l-10)/k]
this is a performance parameter that trade-offs SA traversal with checking of right-maximal MEMs
-kmer use kmer table containing sa-intervals (speeds up searching first k characters) in the index and during search [int value, auto]
-save (string) save index to file to use again later (string)
-load (string) load index from file
Example usage:
./mummer -maxmatch -l 20 -b -n -k 3 -threads 3 ref.fa query.fa
Find all maximal matches on forward and reverse strands
of length 20 or greater, matching only a, c, t, or g.
Index every 3rd position in the ref.fa and use 3 threads to find MEMs.
Fastest method for one long query sequence.
./mummer -maxmatch -l 20 -b -n -k 3 -qthreads 3 ref.fa query.fa
Same as above, but now use a single thread for every query sequence in
query.fa. Fastest for many small query sequences.
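The example usage in the help text can be wrapped in a small shell function. This is only a sketch under stated assumptions: run_maxmatch is a hypothetical helper, not part of MUMmer, and the default output name matches.txt is made up for illustration. It uses only options listed in the help message (-maxmatch, -l, -b, -n) and relies on mummer writing its matches to standard output.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of MUMmer): runs the -maxmatch example from
# the help text and saves the matches to a file.
run_maxmatch() {
    ref=$1                   # reference FASTA file
    query=$2                 # query FASTA file
    min_len=${3:-20}         # value for -l; 20 is the documented default
    out=${4:-matches.txt}    # assumed output file name

    # Fail early if an input file is missing or unreadable.
    for f in "$ref" "$query"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 2
        fi
    done

    # Fail if mummer is not on PATH.
    if ! command -v mummer >/dev/null 2>&1; then
        echo "mummer not found on PATH" >&2
        return 127
    fi

    # -maxmatch: all maximal matches; -b: forward and reverse complement;
    # -n: match only a, c, g, or t. Matches go to stdout, so redirect them.
    mummer -maxmatch -l "$min_len" -b -n "$ref" "$query" > "$out"
}
```

A call such as `run_maxmatch ref.fa query.fa 20 matches.txt` would then leave the matches in matches.txt. The function is written for plain POSIX sh, so its variables are not local to the function.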
software ref: https://github.com/mummer4/mummer
research ref: https://doi.org/10.1371/journal.pcbi.1005944