$ ribodetector_cpu --help
usage: ribodetector_cpu [-h] [-c CONFIG] -l LEN -i [INPUT [INPUT ...]] -o [OUTPUT [OUTPUT ...]]
[-r [RRNA [RRNA ...]]] [-e {rrna,norrna,both,none}] [-t THREADS]
[--chunk_size CHUNK_SIZE] [--log LOG] [-v]
rRNA sequence detector
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Path of config file
-l LEN, --len LEN Sequencing read length. Note: the accuracy reduces for reads shorter than 40.
-i [INPUT [INPUT ...]], --input [INPUT [INPUT ...]]
Path of input sequence files (fasta and fastq), the second file will be considered as second end if two files given.
-o [OUTPUT [OUTPUT ...]], --output [OUTPUT [OUTPUT ...]]
Path of the output sequence files after rRNAs removal (same number of files as input).
(Note: 2 times slower to write gz files)
-r [RRNA [RRNA ...]], --rrna [RRNA [RRNA ...]]
Path of the output sequence file of detected rRNAs (same number of files as input)
-e {rrna,norrna,both,none}, --ensure {rrna,norrna,both,none}
Ensure which classificaion has high confidence for paired end reads.
norrna: output only high confident non-rRNAs, the rest are clasified as rRNAs;
rrna: vice versa, only high confident rRNAs are classified as rRNA and the rest output as non-rRNAs;
both: both non-rRNA and rRNA prediction with high confidence;
none: give label based on the mean probability of read pair.
(Only applicable for paired end reads, discard the read pair when their predicitons are discordant)
-t THREADS, --threads THREADS
Number of threads to use. (default: 20)
--chunk_size CHUNK_SIZE
chunk_size * 1024 reads to load each time.
When chunk_size=1000 and threads=20, consumming ~20G memory, better to be multiples of the number of threads..
--log LOG Log file name
-v, --version show program's version number and exit
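As a hedged illustration of how the options above combine, the following Python sketch assembles a `ribodetector_cpu` command line using only flags listed in the help text. The helper name `build_cpu_command` and the file names are hypothetical, and the restriction to one or two input files is an assumption drawn from the `-i` description. The sketch only builds the argument list; it does not run the tool.

```python
from typing import List, Optional, Sequence

# Choices copied from the -e/--ensure entry in the help text.
ENSURE_CHOICES = {"rrna", "norrna", "both", "none"}


def build_cpu_command(
    read_len: int,
    inputs: Sequence[str],
    outputs: Sequence[str],
    rrna_outputs: Optional[Sequence[str]] = None,
    ensure: Optional[str] = None,
    threads: Optional[int] = None,
    chunk_size: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for ribodetector_cpu (hypothetical helper)."""
    # Assumption: one file means single-end, two files mean paired-end.
    if len(inputs) not in (1, 2):
        raise ValueError("expected one or two input files")
    # The help text asks for the same number of output files as input files.
    if len(outputs) != len(inputs):
        raise ValueError("-o needs the same number of files as -i")
    if rrna_outputs is not None and len(rrna_outputs) != len(inputs):
        raise ValueError("-r needs the same number of files as -i")
    if ensure is not None:
        if ensure not in ENSURE_CHOICES:
            raise ValueError(f"-e must be one of {sorted(ENSURE_CHOICES)}")
        # The help text says -e only applies to paired-end reads.
        if len(inputs) != 2:
            raise ValueError("-e only applies to paired-end reads")

    cmd = ["ribodetector_cpu", "-l", str(read_len), "-i", *inputs, "-o", *outputs]
    if rrna_outputs is not None:
        cmd += ["-r", *rrna_outputs]
    if ensure is not None:
        cmd += ["-e", ensure]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if chunk_size is not None:
        cmd += ["--chunk_size", str(chunk_size)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, paired-end 100 bp reads.
    print(" ".join(build_cpu_command(
        100,
        ["reads_1.fq", "reads_2.fq"],
        ["norrna_1.fq", "norrna_2.fq"],
        ensure="rrna",
        threads=20,
        chunk_size=256,
    )))
```

The returned list can be passed to `subprocess.run` if the tool is installed; keeping it as a list avoids shell quoting issues with file paths.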
# davised:Linux @ chrom1 in ~ [11:08:30]
$ ribodetector --help
usage: ribodetector [-h] [-c CONFIG] [-d DEVICEID] -l LEN -i
[INPUT [INPUT ...]] -o [OUTPUT [OUTPUT ...]]
[-r [RRNA [RRNA ...]]]
[-e {rrna,norrna,both,none}] [-t THREADS]
[-m MEMORY] [--chunk_size CHUNK_SIZE] [--log LOG]
[-v]
rRNA sequence detector
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Path of config file
-d DEVICEID, --deviceid DEVICEID
Indices of GPUs to enable. Quotated comma-separated device ID numbers. (default: all)
-l LEN, --len LEN Sequencing read length. Note: the accuracy reduces for reads shorter than 40.
-i [INPUT [INPUT ...]], --input [INPUT [INPUT ...]]
Path of input sequence files (fasta and fastq), the second file will be considered as second end if two files given.
-o [OUTPUT [OUTPUT ...]], --output [OUTPUT [OUTPUT ...]]
Path of the output sequence files after rRNAs removal (same number of files as input).
(Note: 2 times slower to write gz files)
-r [RRNA [RRNA ...]], --rrna [RRNA [RRNA ...]]
Path of the output sequence file of detected rRNAs (same number of files as input)
-e {rrna,norrna,both,none}, --ensure {rrna,norrna,both,none}
Ensure which classificaion has high confidence for paired end reads.
norrna: output only high confident non-rRNAs, the rest are clasified as rRNAs;
rrna: vice versa, only high confident rRNAs are classified as rRNA and the rest output as non-rRNAs;
both: both non-rRNA and rRNA prediction with high confidence;
none: give label based on the mean probability of read pair.
(Only applicable for paired end reads, discard the read pair when their predicitons are discordant)
-t THREADS, --threads THREADS
Number of threads to use. (default: 10)
-m MEMORY, --memory MEMORY
Amount (GB) of GPU RAM. (default: 12)
--chunk_size CHUNK_SIZE
Use this parameter when having low memory. Parsing the file in chunks.
Not needed when free RAM >=5 * your_file_size (uncompressed, sum of paired ends).
When chunk_size=256, memory=16 it will load 256 * 16 * 1024 reads each chunk (use ~20 GBfor 100bp paired end).
--log LOG Log file name
-v, --version show program's version number and exit
ribodetector --help 4.87s user 6.13s system 249% cpu 4.409 total
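The `--chunk_size` arithmetic differs between the two executables: the CPU help describes `chunk_size * 1024` reads per load, while the GPU help's example multiplies `chunk_size`, `memory`, and 1024. A small Python sketch of both follows. The function names are hypothetical, and the GPU formula is inferred from the single example in the help text, not from the tool's source code.

```python
def cpu_reads_per_chunk(chunk_size: int) -> int:
    """Reads loaded per chunk by ribodetector_cpu: chunk_size * 1024."""
    return chunk_size * 1024


def gpu_reads_per_chunk(chunk_size: int, memory_gb: int = 12) -> int:
    """Reads loaded per chunk by ribodetector, inferred from the help example.

    memory_gb mirrors the -m/--memory option, whose stated default is 12.
    """
    return chunk_size * memory_gb * 1024


if __name__ == "__main__":
    # CPU help example: chunk_size=1000
    print(cpu_reads_per_chunk(1000))      # 1024000
    # GPU help example: chunk_size=256, memory=16
    print(gpu_reads_per_chunk(256, 16))   # 4194304
```

Neither function estimates RAM use; the memory figures quoted in the help text (about 20 GB in both examples) are the only ones the source provides.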
$ ribodetector_cpu --help
usage: ribodetector_cpu [-h] [-c CONFIG] -l LEN -i
[INPUT [INPUT ...]] -o [OUTPUT [OUTPUT ...]]
[-r [RRNA [RRNA ...]]]
[-e {rrna,norrna,both,none}] [-t THREADS]
[--chunk_size CHUNK_SIZE] [--log LOG] [-v]
rRNA sequence detector
optional arguments:
-h, --help show this help message and exit
-c CONFIG, --config CONFIG
Path of config file
-l LEN, --len LEN Sequencing read length. Note: the accuracy reduces for reads shorter than 40.
-i [INPUT [INPUT ...]], --input [INPUT [INPUT ...]]
Path of input sequence files (fasta and fastq), the second file will be considered as second end if two files given.
-o [OUTPUT [OUTPUT ...]], --output [OUTPUT [OUTPUT ...]]
Path of the output sequence files after rRNAs removal (same number of files as input).
(Note: 2 times slower to write gz files)
-r [RRNA [RRNA ...]], --rrna [RRNA [RRNA ...]]
Path of the output sequence file of detected rRNAs (same number of files as input)
-e {rrna,norrna,both,none}, --ensure {rrna,norrna,both,none}
Ensure which classificaion has high confidence for paired end reads.
norrna: output only high confident non-rRNAs, the rest are clasified as rRNAs;
rrna: vice versa, only high confident rRNAs are classified as rRNA and the rest output as non-rRNAs;
both: both non-rRNA and rRNA prediction with high confidence;
none: give label based on the mean probability of read pair.
(Only applicable for paired end reads, discard the read pair when their predicitons are discordant)
-t THREADS, --threads THREADS
Number of threads to use. (default: 20)
--chunk_size CHUNK_SIZE
chunk_size * 1024 reads to load each time.
When chunk_size=1000 and threads=20, consumming ~20G memory, better to be multiples of the number of threads..
--log LOG Log file name
-v, --version show program's version number and exit