w2rap contigger 0.1

w2rap-contigger

An Illumina paired-end (PE) genome contig assembler that can handle large (17 Gbp), complex (hexaploid) genomes.

Depends on the KAT analysis program.

To activate:

bash
source /local/cluster/kat/activate.sh

Location and version:

$ which w2rap-contigger
/local/cluster/kat/bin/w2rap-contigger
$ w2rap-contigger --version
w2rap-contigger --version

Welcome to w2rap-contigger

w2rap-contigger  version: 0.1
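
Before submitting a job, it can help to confirm that the activation step actually put the binary on the PATH. The following is a minimal sketch, assuming the activation script path shown above; the `check_tool` helper is a hypothetical name introduced here for illustration and is not part of w2rap-contigger or KAT.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of w2rap-contigger): report whether a
# command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# Load the environment only if the activation script exists on this machine.
# "." is the portable spelling of "source".
activate=/local/cluster/kat/activate.sh
if [ -f "$activate" ]; then
    . "$activate"
fi

check_tool w2rap-contigger || echo "run the activation step first"
```

If the last line reports the tool as missing, repeat the activation step in the same shell session before running the assembler.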

Help message:

$ w2rap-contigger --help
w2rap-contigger --help

Welcome to w2rap-contigger

USAGE:

   w2rap-contigger  [-h] [--version] [-t <int>] [-m <int>] -r <file1.fastq
                    ,file2.fastq> -o <string> -p <string> [-K <60|64|72|80
                    |84|88|96|100|108|116|128|136|144|152|160|168|172|180
                    |188|192|196|200|208|216|224|232|240|260|280|300|320
                    |368|400|440|460|500|544|640>] [--from_step <1|2|3|4|5
                    |6|7>] [--to_step <1|2|3|4|5|6|7>] [-d <int>]
                    [--tmp_dir <string>] [-s <int>] [--min_freq <int>]
                    [--min_qual <int>] [--pair_sample <int>]
                    [--extend_paths <bool>] [--path_finder <bool>]
                    [--dump_all <bool>] [--dump_perf <bool>] [--dump_pf
                    <bool>] [--dev_run_test <devel only>]


Where:

   -h,  --help
     Displays usage information and exits.

   --version
     Displays version information and exits.

   -t <int>,  --threads <int>
     Number of threads on parallel sections (default: 4)

   -m <int>,  --max_mem <int>
     Maximum memory in GB (soft limit, impacts performance, default 10000)

   -r <file1.fastq,file2.fastq>,  --read_files <file1.fastq,file2.fastq>
     (required)  Input sequences (reads) files

   -o <string>,  --out_dir <string>
     (required)  Output dir path

   -p <string>,  --prefix <string>
     (required)  Prefix for the output files

   -K <60|64|72|80|84|88|96|100|108|116|128|136|144|152|160|168|172|180|188
      |192|196|200|208|216|224|232|240|260|280|300|320|368|400|440|460|500
      |544|640>,  --large_k <60|64|72|80|84|88|96|100|108|116|128|136|144
      |152|160|168|172|180|188|192|196|200|208|216|224|232|240|260|280|300
      |320|368|400|440|460|500|544|640>
     Large k (default: 200)

   --from_step <1|2|3|4|5|6|7>
     Start on step (default: 1)

   --to_step <1|2|3|4|5|6|7>
     Stop after step (default: 7)

   -d <int>,  --disk_batches <int>
     number of disk batches for step2 (default: 0, 0->in memory)

   --tmp_dir <string>
     tmp dir for step2 disk batches (default: workdir)

   -s <int>,  --min_size <int>
     Min size of disconnected elements on large_k graph (in kmers, default:
     0=no min)

   --min_freq <int>
     minimum frequency for small k-mers on step 2 (default: 4)

   --min_qual <int>
     minimum quality for small k-mers on step 2 (default: 7)

   --pair_sample <int>
     max number of read pairs to use in local assemblies on step 5(default:
     200)

   --extend_paths <bool>
     Enable extend paths on repath (experimental)

   --path_finder <bool>
     Run PathFinder (experimental)

   --dump_all <bool>
     Dump all intermediate files

   --dump_perf <bool>
     Dump performance info (devel)

   --dump_pf <bool>
     Dump pathfinder info (devel)

   --dev_run_test <devel only>
     runs development tests
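
The options above can be combined into a single run along the following lines. This is only a sketch: the read file names, output directory, prefix, thread count, and memory limit are hypothetical placeholders, and the `is_valid_k` helper is introduced here for illustration. It uses only flags listed in the help message, and checks `-K` against the accepted values before building the command.

```shell
#!/usr/bin/env bash
# Hypothetical inputs: replace with your own files and resource limits.
reads_r1=sample_R1.fastq
reads_r2=sample_R2.fastq
out_dir=w2rap_out
prefix=sample
large_k=200   # the documented default

# Values accepted by -K / --large_k, copied from the help message.
valid_k="60 64 72 80 84 88 96 100 108 116 128 136 144 152 160 168 172 180 188 192 196 200 208 216 224 232 240 260 280 300 320 368 400 440 460 500 544 640"

# Hypothetical helper: succeed only if $1 is one of the accepted K values.
is_valid_k() {
    case " $valid_k " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if ! is_valid_k "$large_k"; then
    echo "unsupported -K value: $large_k" >&2
    exit 1
fi

# -r takes both read files as one comma-separated argument.
cmd="w2rap-contigger -t 16 -m 200 -r ${reads_r1},${reads_r2} -o ${out_dir} -p ${prefix} -K ${large_k}"
echo "$cmd"

# Run only when the tool and both read files are actually present.
if command -v w2rap-contigger >/dev/null 2>&1 && [ -f "$reads_r1" ] && [ -f "$reads_r2" ]; then
    mkdir -p "$out_dir"   # harmless if the directory already exists
    $cmd
fi
```

Printing the command before running it makes it easy to copy into a job submission script or to check the arguments on a machine where the assembler is not installed.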

software ref: https://github.com/bioinfologics/w2rap-contigger
research ref: