The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis
toolkit consists of the MaSuRCA genome assembler, the QuORUM error corrector for
Illumina data, the POLCA genome polishing software, the Chromosome scaffolder,
the jellyfish mer counter, and the MUMmer aligner. Usage instructions for the
additional tools that are exclusive to MaSuRCA, such as POLCA and the Chromosome
scaffolder, are provided at the end of this Guide.
The MaSuRCA assembler combines the benefits of the de Bruijn graph and
Overlap-Layout-Consensus assembly approaches. Since version 3.2.1 it supports
hybrid assembly with short Illumina reads and long, high-error PacBio/MinION
data.
Location and version:
$ which masurca
/local/cluster/MaSuRCA/bin/masurca
$ masurca -v
version 4.0.5
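Because hybrid assembly is only available from version 3.2.1 onward, a script may want to compare the installed version against that threshold before it starts a hybrid run. The following is a minimal sketch, not part of MaSuRCA: the helper name `version_ge` is hypothetical, it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and the version string is hard-coded here rather than parsed from `masurca -v`.

```shell
#!/usr/bin/env bash

# Return success if dotted version $1 is greater than or equal to $2.
# Assumption: sort supports -V (version sort), as GNU coreutils does.
version_ge() {
    # After a version sort, the smaller of the two comes first;
    # $1 >= $2 holds exactly when that smallest entry equals $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hard-coded for illustration; on a real system this could be taken from
# the second field of the "version 4.0.5" line printed by `masurca -v`.
installed="4.0.5"

if version_ge "$installed" "3.2.1"; then
    echo "hybrid assembly supported"
else
    echo "hybrid assembly not supported"
fi
```

With the version shown above (4.0.5), the check takes the first branch.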
Help message:
$ masurca -h
Create the assembly script from a MaSuRCA configuration file. A
sample configuration file can be generated with the -g switch. The
assembly script assemble.sh will run the assembly proper. For a
quick run without creating a configuration file, and with two Illumina
paired end reads files (forward/reverse) and (optionally) a
long reads (Nanopore/PacBio) file use -i switch, setting the number of threads with -t:
masurca -i paired_ends_fwd.fastq.gz -t 32
or
masurca -i paired_ends_fwd.fastq.gz,paired_ends_rev.fastq.gz -t 32
,and for a hybrid assembly you can also add the long Nanopore or PacBio reads with -r switch:
masurca -i paired_ends_fwd.fastq.gz,paired_ends_rev.fastq.gz -r nanopore.fa.gz -t 32
this will run paired-ends Illumina or hybrid assembly with CABOG contigger and default settings.
This is suitable for small assembly projects.
Options:
-t, --threads ONLY to use with -i option, number of threads
-i, --illumina Run assembly without creating configuration file, argument can be
illumina_paired_end_forward_reads
or
illumina_paired_end_forward_reads,illumina_paired_end_reverse_reads
if you only have single-end Illumina data. Reads file(s) could be fasta or fastq, can be gzipped.
-r, --reads ONLY to use with -i option, single long reads file for hybrid assembly, canbe Nanopore or PacBio,
fasta or fastq, can be gzipped
-v, --version Report version
-o, --output Assembly script (assemble.sh)
-g, --generate Generate example configuration file
-p, --path Prepend to PATH in assembly script
-l, --ld-library-path Prepend to LD_LIBRARY_PATH in assembly script
--skip-checking Skip checking availability of other executables
-h, --help This message
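The option descriptions above imply a simple rule for quick runs: Illumina reads go to `-i` (as `forward` or `forward,reverse`), an optional long-read file goes to `-r`, and `-t` sets the thread count. The sketch below only assembles and prints such a command line so it can be inspected before anything is launched. The function name `build_masurca_cmd` and all file names are hypothetical placeholders, and the function does not run `masurca` itself.

```shell
#!/usr/bin/env bash

# Print a quick-run masurca command line (dry run; nothing is executed).
#   $1 = number of threads
#   $2 = forward (or single-end) Illumina reads file
#   $3 = reverse Illumina reads file (optional, may be empty)
#   $4 = long reads file, Nanopore or PacBio (optional; hybrid assembly)
build_masurca_cmd() {
    local threads=$1
    local fwd=$2
    local rev=${3:-}
    local long=${4:-}

    # Paired-end input is passed to -i as "forward,reverse".
    local illumina=$fwd
    if [ -n "$rev" ]; then
        illumina="$fwd,$rev"
    fi

    local cmd="masurca -i $illumina"

    # Per the help text, -r is only used together with -i.
    if [ -n "$long" ]; then
        cmd="$cmd -r $long"
    fi

    echo "$cmd -t $threads"
}

# Hypothetical file names, matching the placeholders in the help text.
build_masurca_cmd 32 paired_ends_fwd.fastq.gz paired_ends_rev.fastq.gz nanopore.fa.gz
```

Printing the command first makes it easy to check the flag order and file names; once it looks right, the printed line can be run directly or submitted to the cluster scheduler.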
software ref: https://github.com/alekseyzimin/masurca
research ref: https://doi.org/10.1093/bioinformatics/btt476
research ref: https://doi.org/10.1101/gr.213405.116