antiSMASH
antiSMASH allows the rapid genome-wide identification, annotation and analysis
of secondary metabolite biosynthesis gene clusters in bacterial and fungal
genomes. It integrates and cross-links with a large number of in silico
secondary metabolite analysis tools that have been published earlier.
To activate:
|
bash
source /local/cluster/antismash/activate.sh
|
To use in SGE:
Make a bash script, put the source /local/cluster/antismash/activate.sh line
before your antismash commands, and submit the script using SGE_Batch or SGE_Array.
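The SGE instructions above can be sketched as a small job script. This is a minimal sketch rather than an official cluster template: the file name run_antismash.sh, the input file and the output directory are placeholders, NSLOTS is the slot count that SGE normally exports to a job, and the SGE_Batch flags in the final comment are an assumption that should be checked against the wrapper's own help text.

```shell
# Write a job script that activates antiSMASH before calling it.
# File and directory names here are placeholders.
cat > run_antismash.sh <<'EOF'
#!/bin/bash
# Load the antiSMASH environment (path taken from the activation step).
source /local/cluster/antismash/activate.sh

# NSLOTS is set by SGE to the number of granted slots; fall back to 4.
antismash --cpus "${NSLOTS:-4}" \
    --output-dir antismash_out \
    streptomyces_coelicolor.gbk
EOF
chmod +x run_antismash.sh

# Submit with the cluster's wrapper, for example (flags are an assumption):
#   SGE_Batch -c './run_antismash.sh' -r sge_antismash -P 4
```

Setting --cpus explicitly matters here because the help text for this installation lists a default of 64 CPUs, which may be more than the job was granted.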
Location and version:
|
$ which antismash
/local/cluster/antismash/bin/antismash
$ antismash --version
antiSMASH 6.0.1
|
Help message:
|
$ antismash --help-showall
########### antiSMASH 6.0.1 #############
usage: antismash [--taxon {bacteria,fungi}] [--output-dir OUTPUT_DIR]
[--output-basename OUTPUT_BASENAME] [--reuse-results PATH] [--limit LIMIT]
[--minlength MINLENGTH] [--start START] [--end END] [--databases PATH]
[--write-config-file PATH] [--without-fimo]
[--executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,...] [--allow-long-headers]
[-v] [-d] [--logfile PATH] [--list-plugins] [--check-prereqs]
[--limit-to-record RECORD_ID] [-V] [--profiling] [--skip-sanitisation]
[--skip-zip-file] [--minimal] [--enable-genefunctions] [--enable-tta]
[--enable-lanthipeptides] [--enable-thiopeptides] [--enable-nrps-pks]
[--enable-sactipeptides] [--enable-lassopeptides] [--enable-t2pks] [--enable-html]
[--genefinding-tool {glimmerhmm,prodigal,prodigal-m,none,error}]
[--genefinding-gff3 GFF3_FILE] [--hmmdetection-strictness {strict,relaxed,loose}]
[--fullhmmer] [--fullhmmer-pfamdb-version FULLHMMER_PFAMDB_VERSION] [--cassis]
[--clusterhmmer] [--clusterhmmer-pfamdb-version CLUSTERHMMER_PFAMDB_VERSION]
[--sideload JSON] [--sideload-simple ACCESSION:START-END] [--tigrfam]
[--smcog-trees] [--tta-threshold TTA_THRESHOLD] [--cb-general] [--cb-subclusters]
[--cb-knownclusters] [--cb-nclusters count] [--cb-min-homology-scale LIMIT] [--asf]
[--pfam2go] [--rre] [--rre-cutoff RRE_CUTOFF] [--rre-minlength RRE_MIN_LENGTH]
[--cc-mibig] [--cc-custom-dbs FILE1,FILE2,...] [--html-title HTML_TITLE]
[--html-description HTML_DESCRIPTION] [--html-start-compact] [-h] [--help-showall]
[-c CPUS]
[SEQUENCE [SEQUENCE ...]]
arguments:
SEQUENCE GenBank/EMBL/FASTA file(s) containing DNA.
--------
Options
--------
-h, --help Show this help text.
--help-showall Show full lists of arguments on this help text.
-c CPUS, --cpus CPUS How many CPUs to use in parallel. (default: 64)
Basic analysis options:
--taxon {bacteria,fungi}
Taxonomic classification of input sequence. (default: bacteria)
Additional analysis:
--fullhmmer Run a whole-genome HMMer analysis.
--cassis Motif based prediction of SM gene cluster regions.
--clusterhmmer Run a cluster-limited HMMer analysis.
--tigrfam Annotate clusters using TIGRFam profiles.
--smcog-trees Generate phylogenetic trees of sec. met. cluster orthologous groups.
--tta-threshold TTA_THRESHOLD
Lowest GC content to annotate TTA codons at (default: 0.65).
--cb-general Compare identified clusters against a database of antiSMASH-predicted
clusters.
--cb-subclusters Compare identified clusters against known subclusters responsible for
synthesising precursors.
--cb-knownclusters Compare identified clusters against known gene clusters from the MIBiG
database.
--asf Run active site finder analysis.
--pfam2go Run Pfam to Gene Ontology mapping module.
--rre Run RREFinder precision mode on all RiPP gene clusters.
--cc-mibig Run a comparison against the MIBiG dataset
Output options:
--output-dir OUTPUT_DIR
Directory to write results to.
--output-basename OUTPUT_BASENAME
Base filename to use for output files within the output directory.
--html-title HTML_TITLE
Custom title for the HTML output page (default is input filename).
--html-description HTML_DESCRIPTION
Custom description to add to the output.
--html-start-compact Use compact view by default for overview page.
Advanced options:
--reuse-results PATH Use the previous results from the specified json datafile
--limit LIMIT Only process the first <limit> records (default: -1). -1 to disable
--minlength MINLENGTH
Only process sequences larger than <minlength> (default: 1000).
--start START Start analysis at nucleotide specified.
--end END End analysis at nucleotide specified
--databases PATH Root directory of the databases (default:
/local/cluster/antismash-6.0.1/lib/python3.8/site-
packages/antismash/databases).
--write-config-file PATH
Write a config file to the supplied path
--without-fimo Run without FIMO (lowers accuracy of RiPP precursor predictions)
--executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,...
A comma separated list of executable name->path pairs to override any on the
system path.E.g. diamond=/alternate/path/to/diamond,hmmpfam2=hmm2pfam
--allow-long-headers Prevents long headers from being renamed
--hmmdetection-strictness {strict,relaxed,loose}
Defines which level of strictness to use for HMM-based cluster detection,
(default: relaxed).
--sideload JSON Sideload annotations from the JSON file in the given paths. Multiple files
can be provided, separated by a comma.
--sideload-simple ACCESSION:START-END
Sideload a single subregion in record ACCESSION from START to END. Positions
are expected to be 0-indexed, with START inclusive and END exclusive.
Debugging & Logging options:
-v, --verbose Print verbose status information to stderr.
-d, --debug Print debugging information to stderr.
--logfile PATH Also write logging output to a file.
--list-plugins List all available sec. met. detection modules.
--check-prereqs Just check if all prerequisites are met.
--limit-to-record RECORD_ID
Limit analysis to the record with ID record_id
-V, --version Display the version number and exit.
--profiling Generate a profiling report, disables multiprocess python.
--skip-sanitisation Skip input record sanitisation. Use with care.
--skip-zip-file Do not create a zip of the output
Debugging options for cluster-specific analyses:
--minimal Only run core detection modules, no analysis modules unless explicitly
enabled
--enable-genefunctions
Enable Gene function annotations (default: enabled, unless --minimal is
specified)
--enable-tta Enable TTA detection (default: enabled, unless --minimal is specified)
--enable-lanthipeptides
Enable Lanthipeptides (default: enabled, unless --minimal is specified)
--enable-thiopeptides
Enable Thiopeptides (default: enabled, unless --minimal is specified)
--enable-nrps-pks Enable NRPS/PKS analysis (default: enabled, unless --minimal is specified)
--enable-sactipeptides
Enable sactipeptide detection (default: enabled, unless --minimal is
specified)
--enable-lassopeptides
Enable lassopeptide precursor prediction (default: enabled, unless --minimal
is specified)
--enable-t2pks Enable type II PKS analysis (default: enabled, unless --minimal is specified)
--enable-html Enable HTML output (default: enabled, unless --minimal is specified)
Gene finding options (ignored when ORFs are annotated):
--genefinding-tool {glimmerhmm,prodigal,prodigal-m,none,error}
Specify algorithm used for gene finding: GlimmerHMM, Prodigal, Prodigal
Metagenomic/Anonymous mode, or none. The 'error' option will raise an error
if genefinding is attempted. The 'none' option will not run genefinding.
(default: error).
--genefinding-gff3 GFF3_FILE
Specify GFF3 file to extract features from.
Full HMMer options:
--fullhmmer-pfamdb-version FULLHMMER_PFAMDB_VERSION
PFAM database version number (e.g. 27.0) (default: latest).
Cluster HMMer options:
--clusterhmmer-pfamdb-version CLUSTERHMMER_PFAMDB_VERSION
PFAM database version number (e.g. 27.0) (default: latest).
TIGRFam options:
NRPS/PKS options:
ClusterBlast options:
--cb-nclusters count Number of clusters from ClusterBlast to display, cannot be greater than 50.
(default: 10)
--cb-min-homology-scale LIMIT
A minimum scaling factor for the query BGC in ClusterBlast results. Valid
range: 0.0 - 1.0. Warning: some homologous genes may no longer be visible!
(default: 0.0)
RREfinder options:
--rre-cutoff RRE_CUTOFF
Bitscore cutoff for RRE pHMM detection (default: 25.0).
--rre-minlength RRE_MIN_LENGTH
Minimum amino acid length of RRE domains (default: 50).
ClusterCompare options:
--cc-custom-dbs FILE1,FILE2,...
A comma separated list of database config files to run with
|
Usage:
Fast run
Running antismash without parameters will run the core detection modules and
all fast cluster-specific analysis steps. More time-consuming options such as
the ClusterBlast analyses, cluster-based PFAM annotations, smCoG tree
generation, etc. will not be run. On a quad-core machine, running the
Streptomyces coelicolor genome with these options will take about two minutes.
This is how the antiSMASH web service runs fast mode jobs from
https://fast.antismash.secondarymetabolites.org/
|
antismash streptomyces_coelicolor.gbk
|
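On a shared cluster it can help to set the CPU count and output directory explicitly, and to choose a gene finder when the input is unannotated FASTA, since the help text lists --genefinding-tool error as the default. The sketch below only prints the command line it would run. build_cmd is a hypothetical helper, the file names are placeholders, and the choice of prodigal is an assumption that suits bacterial input (the help text also lists glimmerhmm and prodigal-m).

```shell
# Print (do not run) an antiSMASH command line for one input file.
# build_cmd is a hypothetical helper, not part of antiSMASH.
build_cmd() {
    input=$1
    cpus=${NSLOTS:-4}            # SGE slot count if set, otherwise 4
    base=$(basename "$input")
    outdir="${base%.*}_antismash"
    extra=""
    case "$input" in
        # FASTA input carries no gene annotations, so pick a gene finder;
        # with the default (--genefinding-tool error) the run raises an error.
        *.fa|*.fasta|*.fna) extra="--genefinding-tool prodigal " ;;
    esac
    printf 'antismash --cpus %s --output-dir %s %s%s\n' \
        "$cpus" "$outdir" "$extra" "$input"
}

build_cmd streptomyces_coelicolor.gbk
build_cmd contigs.fasta
```

Printing the command first makes it easy to inspect before pasting it into a job script.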
Minimal run
Running antismash with the --minimal parameter runs only the core
detection modules and none of the cluster-specific analysis steps. On a quad-core
machine, running the Streptomyces coelicolor genome in minimal mode will take
about one minute. In general, we recommend running without the --minimal
option, as a default fast run will generate much more useful results for only
one additional minute of runtime.
|
antismash --minimal streptomyces_coelicolor.gbk
|
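According to the help text, --minimal runs no analysis modules "unless explicitly enabled", so it can be combined with the --enable-* flags to switch individual modules back on. The following is a sketch of that combination. minimal_cmd is a hypothetical helper that only prints the resulting command line, and the module names passed to it must match the --enable-* flags listed in the help message.

```shell
# Build a --minimal command line with selected modules re-enabled.
# minimal_cmd is a hypothetical helper; it prints the command, it does not run it.
minimal_cmd() {
    # $1: input file; remaining arguments: module names to re-enable
    input=$1; shift
    flags="--minimal"
    for module in "$@"; do
        flags="$flags --enable-$module"
    done
    printf 'antismash %s %s\n' "$flags" "$input"
}

# Core detection plus NRPS/PKS analysis and the HTML report:
minimal_cmd streptomyces_coelicolor.gbk nrps-pks html
# prints: antismash --minimal --enable-nrps-pks --enable-html streptomyces_coelicolor.gbk
```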
Full-featured run
On a quad-core machine, running the Streptomyces coelicolor genome with the
ClusterBlast, active site finder, Pfam to GO and smCoG tree options enabled
will take a bit over 20 minutes.
|
antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees streptomyces_coelicolor.gbk
|
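For several genomes, the full-featured command can be written once per input into a commands file. The sketch below assumes that SGE_Array accepts a file with one command per line, which should be checked against the wrapper's own help text. write_commands is a hypothetical helper, the directory layout (results/NAME) is a placeholder, and how the antiSMASH environment is activated for each task depends on the local setup and is not shown.

```shell
# Write one full-featured antiSMASH command per GenBank file in a directory.
# write_commands is a hypothetical helper; paths are placeholders.
write_commands() {
    genome_dir=$1
    commands_file=$2
    cpus=${3:-4}                              # CPUs per task, default 4
    : > "$commands_file"                      # start with an empty file
    for gbk in "$genome_dir"/*.gbk; do
        [ -e "$gbk" ] || continue             # skip if the glob matched nothing
        name=$(basename "$gbk" .gbk)
        printf 'antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees --cpus %s --output-dir %s %s\n' \
            "$cpus" "results/$name" "$gbk" >> "$commands_file"
    done
}

# Example call (directory and file names are placeholders):
#   write_commands genomes commands.txt 4
```

Giving each genome its own output directory keeps the results of parallel tasks from overwriting one another.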
software ref: https://docs.antismash.secondarymetabolites.org
software ref: https://antismash.secondarymetabolites.org
research ref: https://doi.org/10.1093/nar/gkab335