# MEGAHIT 1.2.9

MEGAHIT
=======

MEGAHIT is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.

Usage
-----

### Basic usage

```sh
megahit -1 pe_1.fq -2 pe_2.fq -o out  # 1 paired-end library
megahit --12 interleaved.fq -o out  # 1 interleaved paired-end library
megahit -1 a1.fq,b1.fq,c1.fq -2 a2.fq,b2.fq,c2.fq -r se1.fq,se2.fq -o out  # 3 paired-end libraries + 2 SE libraries
megahit_core contig2fastg 119 out/intermediate_contigs/k119.contig.fa > k119.fastg  # get FASTG from the intermediate contigs of k=119
```

The contigs can be found in `final.contigs.fa` in the output directory.

### Advanced usage

* `--kmin-1pass`: use if sequencing depth is low and building the graph of k_min uses too much memory
* `--presets meta-large`: use if the metagenome is complex (i.e., biodiversity is high, as in soil metagenomes)
* `--cleaning-rounds 1 --disconnect-ratio 0`: get a less-pruned assembly (usually shorter contigs)
* `--continue -o out`: resume an interrupted job from `out`

To see the full manual, run `megahit` without parameters or with `-h`. Also, our [wiki](https://github.com/voutcn/megahit/wiki) may be helpful.
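With several paired-end libraries, the comma-separated lists passed to `-1` and `-2` have to name the mates in the same order. The sketch below is one way to build those lists without typing them by hand. It assumes a hypothetical naming scheme of `<sample>_1.fq` and `<sample>_2.fq`, and the helper `megahit_pe_cmd` is illustrative only, not part of MEGAHIT. It prints the command rather than running it, so the result can be reviewed first.

```sh
# Hypothetical helper (not part of MEGAHIT): print a megahit command for
# several paired-end libraries named <sample>_1.fq / <sample>_2.fq.
# Usage: megahit_pe_cmd OUT_DIR SAMPLE [SAMPLE...]
megahit_pe_cmd() {
    out_dir=$1
    shift
    r1=""
    r2=""
    for sample in "$@"; do
        # Append with a comma separator, keeping both lists in the same order.
        r1="${r1:+$r1,}${sample}_1.fq"
        r2="${r2:+$r2,}${sample}_2.fq"
    done
    printf 'megahit -1 %s -2 %s -o %s\n' "$r1" "$r2" "$out_dir"
}

# Print the command for three libraries a, b and c.
megahit_pe_cmd out a b c
```

Because the helper only prints text, it can be tried on a machine without MEGAHIT installed. Once the printed command looks right, paste it into the shell or a job script.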
Location and version:

```console
$ which megahit
/local/cluster/bin/megahit
$ megahit --version
MEGAHIT v1.2.9
```

Help message:

```console
$ megahit --help
MEGAHIT v1.2.9

contact: Dinghua Li

Usage:
  megahit [options] {-1 -2 | --12 | -r } [-o ]

  Input options that can be specified for multiple times (supporting plain text and gz/bz2 extensions)
    -1                       comma-separated list of fasta/q paired-end #1 files, paired with files in
    -2                       comma-separated list of fasta/q paired-end #2 files, paired with files in
    --12                     comma-separated list of interleaved fasta/q paired-end files
    -r/--read                comma-separated list of fasta/q single-end files

Optional Arguments:
  Basic assembly options:
    --min-count              minimum multiplicity for filtering (k_min+1)-mers [2]
    --k-list                 comma-separated list of kmer size
                             all must be odd, in the range 15-255, increment <= 28)
                             [21,29,39,59,79,99,119,141]

  Another way to set --k-list (overrides --k-list if one of them set):
    --k-min                  minimum kmer size (<= 255), must be odd number [21]
    --k-max                  maximum kmer size (<= 255), must be odd number [141]
    --k-step                 increment of kmer size of each iteration (<= 28), must be even number [12]

  Advanced assembly options:
    --no-mercy               do not add mercy kmers
    --bubble-level           intensity of bubble merging (0-2), 0 to disable [2]
    --merge-level            merge complex bubbles of length <= l*kmer_size and similarity >= s [20,0.95]
    --prune-level            strength of low depth pruning (0-3) [2]
    --prune-depth            remove unitigs with avg kmer depth less than this value [2]
    --disconnect-ratio       disconnect unitigs if its depth is less than this ratio times
                             the total depth of itself and its siblings [0.1]
    --low-local-ratio        remove unitigs if its depth is less than this ratio times
                             the average depth of the neighborhoods [0.2]
    --max-tip-len            remove tips less than this value [2*k]
    --cleaning-rounds        number of rounds for graph cleanning [5]
    --no-local               disable local assembly
    --kmin-1pass             use 1pass mode to build SdBG of k_min

  Presets parameters:
    --presets                override a group of parameters; possible values:
                             meta-sensitive: '--min-count 1 --k-list 21,29,39,49,...,129,141'
                             meta-large: '--k-min 27 --k-max 127 --k-step 10'
                             (large & complex metagenomes, like soil)

  Hardware options:
    -m/--memory              max memory in byte to be used in SdBG construction
                             (if set between 0-1, fraction of the machine's total memory) [0.9]
    --mem-flag               SdBG builder memory mode. 0: minimum; 1: moderate;
                             others: use all memory specified by '-m/--memory' [1]
    -t/--num-cpu-threads     number of CPU threads [# of logical processors]
    --no-hw-accel            run MEGAHIT without BMI2 and POPCNT hardware instructions

  Output options:
    -o/--out-dir             output directory [./megahit_out]
    --out-prefix             output prefix (the contig file will be OUT_DIR/OUT_PREFIX.contigs.fa)
    --min-contig-len         minimum length of contigs to output [200]
    --keep-tmp-files         keep all temporary files
    --tmp-dir                set temp directory

Other Arguments:
    --continue               continue a MEGAHIT run from its last available check point.
                             please set the output directory correctly when using this option.
    --test                   run MEGAHIT on a toy test dataset
    -h/--help                print the usage message
    -v/--version             print version
```

software ref:

research ref:
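The help text places three constraints on `--k-list`: every value is odd, lies in the range 15-255, and differs from the previous value by at most 28. A custom list can be checked against these rules before a long job is submitted. The function below is a minimal sketch of such a check. `check_k_list` is a hypothetical helper, not part of MEGAHIT, and it additionally assumes the list is written in strictly increasing order, which the help text does not state.

```sh
# Hypothetical pre-flight check (not part of MEGAHIT) for a --k-list value.
# Rules taken from the help text: each k is odd, within 15-255, and
# consecutive values differ by at most 28.
check_k_list() {
    prev=""
    for k in $(printf '%s' "$1" | tr ',' ' '); do
        case $k in
            # Reject empty fields, leading zeros and non-digits.
            ''|0*|*[!0-9]*) echo "not a positive integer: $k" >&2; return 1 ;;
        esac
        [ $((k % 2)) -eq 1 ] || { echo "even k: $k" >&2; return 1; }
        [ "$k" -ge 15 ] && [ "$k" -le 255 ] || { echo "out of range: $k" >&2; return 1; }
        if [ -n "$prev" ]; then
            step=$((k - prev))
            # Assumption: the list is written in strictly increasing order.
            [ "$step" -gt 0 ] && [ "$step" -le 28 ] || { echo "bad step: $prev -> $k" >&2; return 1; }
        fi
        prev=$k
    done
    [ -n "$prev" ] || { echo "empty k-list" >&2; return 1; }
    echo "ok"
}

# The default list from the help message passes the check.
check_k_list 21,29,39,59,79,99,119,141
```

A failed check returns a non-zero status and writes the reason to standard error, so the function can guard a job script, for example `check_k_list "$KLIST" && megahit --k-list "$KLIST" ...` with `KLIST` as a variable of your own choosing.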