MaSuRCA 4.0.5

MaSuRCA Genome Assembly and Analysis Toolkit Quick Start Guide

The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, the QuORUM error corrector for Illumina data, the POLCA genome polishing software, the Chromosome scaffolder, the jellyfish mer counter, and the MUMmer aligner. Usage instructions for the additional tools that are exclusive to MaSuRCA, such as POLCA and the Chromosome scaffolder, are provided at the end of this Guide.

The MaSuRCA assembler combines the benefits of the de Bruijn graph and Overlap-Layout-Consensus assembly approaches. Since version 3.2.1, it supports hybrid assembly with short Illumina reads and long, high-error PacBio/MinION data.

Location and version:

$ which masurca
/local/cluster/MaSuRCA/bin/masurca
$ masurca -v
version 4.0.5
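
Since hybrid assembly is only supported from version 3.2.1 onward, it can be useful to check the installed version before starting a hybrid run. The following is a minimal sketch, not part of MaSuRCA: the helper name `version_ge` is hypothetical, the comparison relies on `sort -V` (assumed to be GNU coreutils), and the version string is the sample output shown above rather than a live call to `masurca -v`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Assumes `sort -V` (version sort, GNU coreutils) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample output line copied from the listing above.
# On a real system this could be replaced with: out="$(masurca -v)"
out="version 4.0.5"
ver="${out#version }"   # strip the leading "version " prefix

if version_ge "$ver" "3.2.1"; then
    echo "MaSuRCA $ver: hybrid assembly supported"
else
    echo "MaSuRCA $ver: hybrid assembly not supported"
fi
```

The prefix stripping assumes the output has exactly the `version X.Y.Z` form shown above; other output formats would need different parsing.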

Help message:

$ masurca -h
Create the assembly script from a MaSuRCA configuration file. A
sample configuration file can be generated with the -g switch. The
assembly script assemble.sh will run the assembly proper. For a
quick run without creating a configuration file, and with two Illumina
paired end reads files (forward/reverse) and (optionally) a
long reads (Nanopore/PacBio) file use -i switch, setting the number of threads with -t:

masurca -i paired_ends_fwd.fastq.gz -t 32
or
masurca -i paired_ends_fwd.fastq.gz,paired_ends_rev.fastq.gz -t 32

,and for a hybrid assembly you can also add the long Nanopore or PacBio reads with -r switch:

masurca -i paired_ends_fwd.fastq.gz,paired_ends_rev.fastq.gz -r nanopore.fa.gz -t 32

this will run paired-ends Illumina or hybrid assembly with CABOG contigger and default settings.
This is suitable for small assembly projects.

Options:
 -t, --threads             ONLY to use with -i option, number of threads
 -i, --illumina            Run assembly without creating configuration file, argument can be
                              illumina_paired_end_forward_reads
                                or
                              illumina_paired_end_forward_reads,illumina_paired_end_reverse_reads
                           if you only have single-end Illumina data. Reads file(s) could be fasta or fastq, can be gzipped.
 -r, --reads               ONLY to use with -i option, single long reads file for hybrid assembly, canbe Nanopore or PacBio,
                           fasta or fastq, can be gzipped

 -v, --version             Report version
 -o, --output              Assembly script (assemble.sh)
 -g, --generate            Generate example configuration file
 -p, --path                Prepend to PATH in assembly script
 -l, --ld-library-path     Prepend to LD_LIBRARY_PATH in assembly script
     --skip-checking       Skip checking availability of other executables
 -h, --help                This message
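
The quick-run mode described in the help message takes Illumina reads with -i, an optional long-read file with -r, and a thread count with -t. The sketch below shows one way to assemble such a command line from its parts. The function `build_masurca_cmd` is a hypothetical helper, not part of MaSuRCA, and it only prints the command instead of running it, so the result can be inspected first. The file names are the placeholders used in the help message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) a quick-run masurca command.
# Uses only the -i, -r and -t switches listed under Options above.
# Arguments: threads, forward reads, [reverse reads], [long reads]
build_masurca_cmd() {
    local threads="$1" fwd="$2" rev="${3:-}" long="${4:-}"
    local illumina="$fwd"
    # Paired-end data: forward and reverse files are joined with a comma.
    if [ -n "$rev" ]; then
        illumina="$fwd,$rev"
    fi
    local cmd="masurca -i $illumina"
    # Optional single long-read file (Nanopore or PacBio) for hybrid assembly.
    if [ -n "$long" ]; then
        cmd="$cmd -r $long"
    fi
    printf '%s\n' "$cmd -t $threads"
}

# Hybrid example with the placeholder file names from the help message.
build_masurca_cmd 32 paired_ends_fwd.fastq.gz paired_ends_rev.fastq.gz nanopore.fa.gz
```

Passing an empty string as the third argument skips the reverse-reads file while still allowing a long-read file in the fourth position. File names containing spaces are not handled by this sketch.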

software ref: https://github.com/alekseyzimin/masurca
research ref: https://doi.org/10.1093/bioinformatics/btt476
research ref: https://doi.org/10.1101/gr.213405.116