antiSMASH 6.0.1

2021-11-09

antiSMASH

antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes. It integrates and cross-links with a large number of in silico secondary metabolite analysis tools that have been published earlier.

To activate:

bash
source /local/cluster/antismash/activate.sh

To use in SGE:

Create a bash script that runs source /local/cluster/antismash/activate.sh before any antismash commands, then submit the script with SGE_Batch or SGE_Array.
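
A minimal sketch of such a job script follows. The script name run_antismash.sh, the output directory, the input file, and the CPU count are placeholders, not site requirements. The help output lists a --cpus default of 64 on this install, so it is probably worth setting --cpus to the number of slots you request from SGE.

```shell
# Write a small job script; run_antismash.sh is a hypothetical name.
cat > run_antismash.sh <<'EOF'
#!/usr/bin/env bash

# Put antiSMASH on the PATH for this job.
source /local/cluster/antismash/activate.sh

# Placeholder input and CPU count: match --cpus to the slots you request.
antismash --cpus 4 --output-dir antismash_out streptomyces_coelicolor.gbk
EOF

# Make the script executable so it can be submitted or run directly.
chmod +x run_antismash.sh
```

Submit run_antismash.sh with SGE_Batch as usual; check that tool's own help for the options that set the run name and slot count.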

Location and version:

$ which antismash
/local/cluster/antismash/bin/antismash
$ antismash --version
antiSMASH 6.0.1

Help message:

$ antismash --help-showall

########### antiSMASH 6.0.1 #############

usage: antismash [--taxon {bacteria,fungi}] [--output-dir OUTPUT_DIR]
                 [--output-basename OUTPUT_BASENAME] [--reuse-results PATH] [--limit LIMIT]
                 [--minlength MINLENGTH] [--start START] [--end END] [--databases PATH]
                 [--write-config-file PATH] [--without-fimo]
                 [--executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,...] [--allow-long-headers]
                 [-v] [-d] [--logfile PATH] [--list-plugins] [--check-prereqs]
                 [--limit-to-record RECORD_ID] [-V] [--profiling] [--skip-sanitisation]
                 [--skip-zip-file] [--minimal] [--enable-genefunctions] [--enable-tta]
                 [--enable-lanthipeptides] [--enable-thiopeptides] [--enable-nrps-pks]
                 [--enable-sactipeptides] [--enable-lassopeptides] [--enable-t2pks] [--enable-html]
                 [--genefinding-tool {glimmerhmm,prodigal,prodigal-m,none,error}]
                 [--genefinding-gff3 GFF3_FILE] [--hmmdetection-strictness {strict,relaxed,loose}]
                 [--fullhmmer] [--fullhmmer-pfamdb-version FULLHMMER_PFAMDB_VERSION] [--cassis]
                 [--clusterhmmer] [--clusterhmmer-pfamdb-version CLUSTERHMMER_PFAMDB_VERSION]
                 [--sideload JSON] [--sideload-simple ACCESSION:START-END] [--tigrfam]
                 [--smcog-trees] [--tta-threshold TTA_THRESHOLD] [--cb-general] [--cb-subclusters]
                 [--cb-knownclusters] [--cb-nclusters count] [--cb-min-homology-scale LIMIT] [--asf]
                 [--pfam2go] [--rre] [--rre-cutoff RRE_CUTOFF] [--rre-minlength RRE_MIN_LENGTH]
                 [--cc-mibig] [--cc-custom-dbs FILE1,FILE2,...] [--html-title HTML_TITLE]
                 [--html-description HTML_DESCRIPTION] [--html-start-compact] [-h] [--help-showall]
                 [-c CPUS]
                 [SEQUENCE [SEQUENCE ...]]


arguments:
  SEQUENCE  GenBank/EMBL/FASTA file(s) containing DNA.

--------
Options
--------
-h, --help              Show this help text.
--help-showall          Show full lists of arguments on this help text.
-c CPUS, --cpus CPUS    How many CPUs to use in parallel. (default: 64)

Basic analysis options:

  --taxon {bacteria,fungi}
                        Taxonomic classification of input sequence. (default: bacteria)

Additional analysis:

  --fullhmmer           Run a whole-genome HMMer analysis.
  --cassis              Motif based prediction of SM gene cluster regions.
  --clusterhmmer        Run a cluster-limited HMMer analysis.
  --tigrfam             Annotate clusters using TIGRFam profiles.
  --smcog-trees         Generate phylogenetic trees of sec. met. cluster orthologous groups.
  --tta-threshold TTA_THRESHOLD
                        Lowest GC content to annotate TTA codons at (default: 0.65).
  --cb-general          Compare identified clusters against a database of antiSMASH-predicted
                        clusters.
  --cb-subclusters      Compare identified clusters against known subclusters responsible for
                        synthesising precursors.
  --cb-knownclusters    Compare identified clusters against known gene clusters from the MIBiG
                        database.
  --asf                 Run active site finder analysis.
  --pfam2go             Run Pfam to Gene Ontology mapping module.
  --rre                 Run RREFinder precision mode on all RiPP gene clusters.
  --cc-mibig            Run a comparison against the MIBiG dataset

Output options:

  --output-dir OUTPUT_DIR
                        Directory to write results to.
  --output-basename OUTPUT_BASENAME
                        Base filename to use for output files within the output directory.
  --html-title HTML_TITLE
                        Custom title for the HTML output page (default is input filename).
  --html-description HTML_DESCRIPTION
                        Custom description to add to the output.
  --html-start-compact  Use compact view by default for overview page.

Advanced options:

  --reuse-results PATH  Use the previous results from the specified json datafile
  --limit LIMIT         Only process the first <limit> records (default: -1). -1 to disable
  --minlength MINLENGTH
                        Only process sequences larger than <minlength> (default: 1000).
  --start START         Start analysis at nucleotide specified.
  --end END             End analysis at nucleotide specified
  --databases PATH      Root directory of the databases (default:
                        /local/cluster/antismash-6.0.1/lib/python3.8/site-
                        packages/antismash/databases).
  --write-config-file PATH
                        Write a config file to the supplied path
  --without-fimo        Run without FIMO (lowers accuracy of RiPP precursor predictions)
  --executable-paths EXECUTABLE=PATH,EXECUTABLE2=PATH2,...
                        A comma separated list of executable name->path pairs to override any on the
                        system path.E.g. diamond=/alternate/path/to/diamond,hmmpfam2=hmm2pfam
  --allow-long-headers  Prevents long headers from being renamed
  --hmmdetection-strictness {strict,relaxed,loose}
                        Defines which level of strictness to use for HMM-based cluster detection,
                        (default: relaxed).
  --sideload JSON       Sideload annotations from the JSON file in the given paths. Multiple files
                        can be provided, separated by a comma.
  --sideload-simple ACCESSION:START-END
                        Sideload a single subregion in record ACCESSION from START to END. Positions
                        are expected to be 0-indexed, with START inclusive and END exclusive.

Debugging & Logging options:

  -v, --verbose         Print verbose status information to stderr.
  -d, --debug           Print debugging information to stderr.
  --logfile PATH        Also write logging output to a file.
  --list-plugins        List all available sec. met. detection modules.
  --check-prereqs       Just check if all prerequisites are met.
  --limit-to-record RECORD_ID
                        Limit analysis to the record with ID record_id
  -V, --version         Display the version number and exit.
  --profiling           Generate a profiling report, disables multiprocess python.
  --skip-sanitisation   Skip input record sanitisation. Use with care.
  --skip-zip-file       Do not create a zip of the output

Debugging options for cluster-specific analyses:

  --minimal             Only run core detection modules, no analysis modules unless explicitly
                        enabled
  --enable-genefunctions
                        Enable Gene function annotations (default: enabled, unless --minimal is
                        specified)
  --enable-tta          Enable TTA detection (default: enabled, unless --minimal is specified)
  --enable-lanthipeptides
                        Enable Lanthipeptides (default: enabled, unless --minimal is specified)
  --enable-thiopeptides
                        Enable Thiopeptides (default: enabled, unless --minimal is specified)
  --enable-nrps-pks     Enable NRPS/PKS analysis (default: enabled, unless --minimal is specified)
  --enable-sactipeptides
                        Enable sactipeptide detection (default: enabled, unless --minimal is
                        specified)
  --enable-lassopeptides
                        Enable lassopeptide precursor prediction (default: enabled, unless --minimal
                        is specified)
  --enable-t2pks        Enable type II PKS analysis (default: enabled, unless --minimal is specified)
  --enable-html         Enable HTML output (default: enabled, unless --minimal is specified)

Gene finding options (ignored when ORFs are annotated):

  --genefinding-tool {glimmerhmm,prodigal,prodigal-m,none,error}
                        Specify algorithm used for gene finding: GlimmerHMM, Prodigal, Prodigal
                        Metagenomic/Anonymous mode, or none. The 'error' option will raise an error
                        if genefinding is attempted. The 'none' option will not run genefinding.
                        (default: error).
  --genefinding-gff3 GFF3_FILE
                        Specify GFF3 file to extract features from.

Full HMMer options:

  --fullhmmer-pfamdb-version FULLHMMER_PFAMDB_VERSION
                        PFAM database version number (e.g. 27.0) (default: latest).

Cluster HMMer options:

  --clusterhmmer-pfamdb-version CLUSTERHMMER_PFAMDB_VERSION
                        PFAM database version number (e.g. 27.0) (default: latest).

TIGRFam options:

NRPS/PKS options:

ClusterBlast options:

  --cb-nclusters count  Number of clusters from ClusterBlast to display, cannot be greater than 50.
                        (default: 10)
  --cb-min-homology-scale LIMIT
                        A minimum scaling factor for the query BGC in ClusterBlast results. Valid
                        range: 0.0 - 1.0. Warning: some homologous genes may no longer be visible!
                        (default: 0.0)

RREfinder options:

  --rre-cutoff RRE_CUTOFF
                        Bitscore cutoff for RRE pHMM detection (default: 25.0).
  --rre-minlength RRE_MIN_LENGTH
                        Minimum amino acid length of RRE domains (default: 50).

ClusterCompare options:

  --cc-custom-dbs FILE1,FILE2,...
                        A comma separated list of database config files to run with
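
One detail in the help output above is easy to miss: --genefinding-tool defaults to error, so input without annotated genes (typically plain FASTA) needs a gene finder named explicitly. The sketch below only prints the command it would use. The helper name build_antismash_cmd, the extension list, the choice of prodigal, and the fallback of 4 CPUs are all illustrative assumptions.

```shell
# Print an antismash command line suited to one input file.
# build_antismash_cmd is a hypothetical helper, not part of antiSMASH.
# Note: the printed line is for display; it does not quote paths with spaces.
build_antismash_cmd() {
    input=$1
    cpus=${2:-4}      # illustrative fallback; the tool's own default here is 64
    extra=""
    case "$input" in
        # FASTA carries no gene annotations, so name a gene finder explicitly.
        *.fa|*.fasta|*.fna) extra=" --genefinding-tool prodigal" ;;
    esac
    printf 'antismash --cpus %s%s %s\n' "$cpus" "$extra" "$input"
}

build_antismash_cmd genome.fasta
build_antismash_cmd streptomyces_coelicolor.gbk 8
```

For fungal genomes, --taxon fungi together with --genefinding-tool glimmerhmm is the likely equivalent; both values appear in the option lists above.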

Usage:

Fast run

Running antismash without parameters will run the core detection modules and all fast cluster-specific analysis steps. More time-consuming options such as the ClusterBlast analyses, cluster-based PFAM annotations, smCoG tree generation, etc. will not be run. On a quad-core machine, running the Streptomyces coelicolor genome with these options will take about two minutes.

This is how the antiSMASH web service runs fast mode jobs from https://fast.antismash.secondarymetabolites.org/

antismash streptomyces_coelicolor.gbk

Minimal run

Running antismash with the --minimal parameter runs only the core detection modules and none of the cluster-specific analysis steps. On a quad-core machine, running the Streptomyces coelicolor genome in minimal mode will take about one minute. In general, we recommend running without the --minimal option, as a default fast run will generate much more useful results for only one additional minute of runtime.

antismash --minimal streptomyces_coelicolor.gbk
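
According to the help text, --minimal switches off every module that is not explicitly enabled, and that includes HTML output. If you want a minimal run that still produces the HTML page, one option is to add --enable-html back. The sketch below runs the command only when antismash is on the PATH and otherwise prints it; the input file name is the same placeholder used above.

```shell
# Minimal run with HTML output re-enabled via --enable-html.
cmd="antismash --minimal --enable-html streptomyces_coelicolor.gbk"

if command -v antismash >/dev/null 2>&1; then
    # antiSMASH is activated: run the command.
    $cmd
else
    # Not activated yet: show what would be run.
    echo "antismash not found on PATH; would run: $cmd"
fi
```

Other modules can be re-enabled the same way with the matching --enable-... flag from the help output.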

Full-featured run

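A full-featured run adds the ClusterBlast comparisons, the active site finder, Pfam to Gene Ontology mapping, and smCoG tree generation to the default analysis. On a quad-core machine, running all of these options for the Streptomyces coelicolor genome will take a bit over 20 minutes.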

antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees streptomyces_coelicolor.gbk
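
For several genomes, the full-featured command above can be written once per input into a commands file, which is the kind of input an array submission usually takes. This is a sketch under assumptions: the helper name write_antismash_commands, the genomes directory, the antismash_out output layout, and the CPU count are hypothetical, and each line assumes the job shell is bash so that source works.

```shell
# Write one full-featured antismash command per .gbk file in a directory.
# write_antismash_commands is a hypothetical helper, not a cluster tool.
write_antismash_commands() {
    indir=$1
    outfile=$2
    : > "$outfile"                      # start with an empty commands file
    for gbk in "$indir"/*.gbk; do
        [ -e "$gbk" ] || continue       # skip when the glob matched nothing
        name=$(basename "$gbk" .gbk)
        printf '%s\n' "source /local/cluster/antismash/activate.sh && antismash --cpus 4 --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees --output-dir antismash_out/$name $gbk" >> "$outfile"
    done
}

# Example call; genomes/ is a placeholder directory of GenBank files.
write_antismash_commands genomes commands.txt
```

Check the help for SGE_Array on how it expects the commands file to be passed before submitting.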

software ref: https://docs.antismash.secondarymetabolites.org
software ref: https://antismash.secondarymetabolites.org
research ref: https://doi.org/10.1093/nar/gkab335