IQ TREE 2.1.3

2021-07-30 2341 words 11 minutes

IQ-TREE - Efficient software for phylogenomic inference

Location and version:

1
2
3
4
5
6


$ which iqtree2
/local/cluster/bin/iqtree2
$ iqtree2 --version
IQ-TREE multicore version 2.1.3 COVID-edition for Linux 64-bit built Apr 21 2021
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung,
Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

help message:

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281


$ iqtree2 --help
IQ-TREE multicore version 2.1.3 COVID-edition for Linux 64-bit built Apr 21 2021
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung,
Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams.

Usage: iqtree [-s ALIGNMENT] [-p PARTITION] [-m MODEL] [-t TREE] ...

GENERAL OPTIONS:
  -h, --help           Print (more) help usages
  -s FILE[,...,FILE]   PHYLIP/FASTA/NEXUS/CLUSTAL/MSF alignment file(s)
  -s DIR               Directory of alignment files
  --seqtype STRING     BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect)
  -t FILE|PARS|RAND    Starting tree (default: 99 parsimony and BIONJ)
  -o TAX[,...,TAX]     Outgroup taxon (list) for writing .treefile
  --prefix STRING      Prefix for all output files (default: aln/partition)
  --seed NUM           Random seed number, normally used for debugging purpose
  --safe               Safe likelihood kernel to avoid numerical underflow
  --mem NUM[G|M|%]     Maximal RAM usage in GB | MB | %
  --runs NUM           Number of indepedent runs (default: 1)
  -v, --verbose        Verbose mode, printing more messages to screen
  -V, --version        Display version number
  --quiet              Quiet mode, suppress printing to screen (stdout)
  -fconst f1,...,fN    Add constant patterns into alignment (N=no. states)
  --epsilon NUM        Likelihood epsilon for parameter estimate (default 0.01)
  -T NUM|AUTO          No. cores/threads or AUTO-detect (default: 1)
  --threads-max NUM    Max number of threads for -T AUTO (default: all cores)

CHECKPOINT:
  --redo               Redo both ModelFinder and tree search
  --redo-tree          Restore ModelFinder and only redo tree search
  --undo               Revoke finished run, used when changing some options
  --cptime NUM         Minimum checkpoint interval (default: 60 sec and adapt)

PARTITION MODEL:
  -p FILE|DIR          NEXUS/RAxML partition file or directory with alignments
                       Edge-linked proportional partition model
  -q FILE|DIR          Like -p but edge-linked equal partition model
  -Q FILE|DIR          Like -p but edge-unlinked partition model
  -S FILE|DIR          Like -p but separate tree inference
  --subsample NUM      Randomly sub-sample partitions (negative for complement)
  --subsample-seed NUM Random number seed for --subsample

LIKELIHOOD/QUARTET MAPPING:
  --lmap NUM           Number of quartets for likelihood mapping analysis
  --lmclust FILE       NEXUS file containing clusters for likelihood mapping
  --quartetlh          Print quartet log-likelihoods to .quartetlh file

TREE SEARCH ALGORITHM:
  --ninit NUM          Number of initial parsimony trees (default: 100)
  --ntop NUM           Number of top initial trees (default: 20)
  --nbest NUM          Number of best trees retained during search (defaut: 5)
  -n NUM               Fix number of iterations to stop (default: OFF)
  --nstop NUM          Number of unsuccessful iterations to stop (default: 100)
  --perturb NUM        Perturbation strength for randomized NNI (default: 0.5)
  --radius NUM         Radius for parsimony SPR search (default: 6)
  --allnni             Perform more thorough NNI search (default: OFF)
  -g FILE              (Multifurcating) topological constraint tree file
  --fast               Fast search to resemble FastTree
  --polytomy           Collapse near-zero branches into polytomy
  --tree-fix           Fix -t tree (no tree search performed)
  --treels             Write locally optimal trees into .treels file
  --show-lh            Compute tree likelihood without optimisation
  --terrace            Check if the tree lies on a phylogenetic terrace

ULTRAFAST BOOTSTRAP/JACKKNIFE:
  -B, --ufboot NUM     Replicates for ultrafast bootstrap (>=1000)
  -J, --ufjack NUM     Replicates for ultrafast jackknife (>=1000)
  --jack-prop NUM      Subsampling proportion for jackknife (default: 0.5)
  --sampling STRING    GENE|GENESITE resampling for partitions (default: SITE)
  --boot-trees         Write bootstrap trees to .ufboot file (default: none)
  --wbtl               Like --boot-trees but also writing branch lengths
  --nmax NUM           Maximum number of iterations (default: 1000)
  --nstep NUM          Iterations for UFBoot stopping rule (default: 100)
  --bcor NUM           Minimum correlation coefficient (default: 0.99)
  --beps NUM           RELL epsilon to break tie (default: 0.5)
  --bnni               Optimize UFBoot trees by NNI on bootstrap alignment

NON-PARAMETRIC BOOTSTRAP/JACKKNIFE:
  -b, --boot NUM       Replicates for bootstrap + ML tree + consensus tree
  -j, --jack NUM       Replicates for jackknife + ML tree + consensus tree
  --jack-prop NUM      Subsampling proportion for jackknife (default: 0.5)
  --bcon NUM           Replicates for bootstrap + consensus tree
  --bonly NUM          Replicates for bootstrap only
  --tbe                Transfer bootstrap expectation

SINGLE BRANCH TEST:
  --alrt NUM           Replicates for SH approximate likelihood ratio test
  --alrt 0             Parametric aLRT test (Anisimova and Gascuel 2006)
  --abayes             approximate Bayes test (Anisimova et al. 2011)
  --lbp NUM            Replicates for fast local bootstrap probabilities

MODEL-FINDER:
  -m TESTONLY          Standard model selection (like jModelTest, ProtTest)
  -m TEST              Standard model selection followed by tree inference
  -m MF                Extended model selection with FreeRate heterogeneity
  -m MFP               Extended model selection followed by tree inference
  -m ...+LM            Additionally test Lie Markov models
  -m ...+LMRY          Additionally test Lie Markov models with RY symmetry
  -m ...+LMWS          Additionally test Lie Markov models with WS symmetry
  -m ...+LMMK          Additionally test Lie Markov models with MK symmetry
  -m ...+LMSS          Additionally test strand-symmetric models
  --mset STRING        Restrict search to models supported by other programs
                       (raxml, phyml or mrbayes)
  --mset STR,...       Comma-separated model list (e.g. -mset WAG,LG,JTT)
  --msub STRING        Amino-acid model source
                       (nuclear, mitochondrial, chloroplast or viral)
  --mfreq STR,...      List of state frequencies
  --mrate STR,...      List of rate heterogeneity among sites
                       (e.g. -mrate E,I,G,I+G,R is used for -m MF)
  --cmin NUM           Min categories for FreeRate model [+R] (default: 2)
  --cmax NUM           Max categories for FreeRate model [+R] (default: 10)
  --merit AIC|AICc|BIC  Akaike|Bayesian information criterion (default: BIC)
  --mtree              Perform full tree search for every model
  --madd STR,...       List of mixture models to consider
  --mdef FILE          Model definition NEXUS file (see Manual)
  --modelomatic        Find best codon/protein/DNA models (Whelan et al. 2015)

PARTITION-FINDER:
  --merge              Merge partitions to increase model fit
  --merge greedy|rcluster|rclusterf
                       Set merging algorithm (default: rclusterf)
  --merge-model 1|all  Use only 1 or all models for merging (default: 1)
  --merge-model STR,...
                       Comma-separated model list for merging
  --merge-rate 1|all   Use only 1 or all rate heterogeneity (default: 1)
  --merge-rate STR,...
                       Comma-separated rate list for merging
  --rcluster NUM       Percentage of partition pairs for rcluster algorithm
  --rclusterf NUM      Percentage of partition pairs for rclusterf algorithm
  --rcluster-max NUM   Max number of partition pairs (default: 10*partitions)

SUBSTITUTION MODEL:
  -m STRING            Model name string (e.g. GTR+F+I+G)
                 DNA:  HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef,
                       TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model
                       specification (e.g., 010010 = HKY)
             Protein:  LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM,
                       JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb,
                       HIVw, JTTDCMut, FLU, Blosum62, GTR20, mtMet, mtVer, mtInv, FLAVI,
			Q.LG, Q.pfam, Q.pfam_gb, Q.bird, Q.mammal, Q.insect, Q.plant, Q.yeast
     Protein mixture:  C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X
              Binary:  JC2 (default), GTR2
     Empirical codon:  KOSI07, SCHN05
   Mechanistic codon:  GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K,
                       MG1KTS, MG1KTV, MG2K
Semi-empirical codon:  XX_YY where XX is empirical and YY is mechanistic model
      Morphology/SNP:  MK (default), ORDERED, GTR
      Lie Markov DNA:  1.1, 2.2b, 3.3a, 3.3b, 3.3c, 3.4, 4.4a, 4.4b, 4.5a,
                       4.5b, 5.6a, 5.6b, 5.7a, 5.7b, 5.7c, 5.11a, 5.11b, 5.11c,
                       5.16, 6.6, 6.7a, 6.7b, 6.8a, 6.8b, 6.17a, 6.17b, 8.8,
                       8.10a, 8.10b, 8.16, 8.17, 8.18, 9.20a, 9.20b, 10.12,
                       10.34, 12.12 (optionally prefixed by RY, WS or MK)
      Non-reversible:  STRSYM (strand symmetric model, equiv. WS6.6),
                       NONREV, UNREST (unrestricted model, equiv. 12.12)
           Otherwise:  Name of file containing user-model parameters

STATE FREQUENCY:
  -m ...+F             Empirically counted frequencies from alignment
  -m ...+FO            Optimized frequencies by maximum-likelihood
  -m ...+FQ            Equal frequencies
  -m ...+FRY           For DNA, freq(A+G)=1/2=freq(C+T)
  -m ...+FWS           For DNA, freq(A+T)=1/2=freq(C+G)
  -m ...+FMK           For DNA, freq(A+C)=1/2=freq(G+T)
  -m ...+Fabcd         4-digit constraint on ACGT frequency
                       (e.g. +F1221 means f_A=f_T, f_C=f_G)
  -m ...+FU            Amino-acid frequencies given protein matrix
  -m ...+F1x4          Equal NT frequencies over three codon positions
  -m ...+F3x4          Unequal NT frequencies over three codon positions

RATE HETEROGENEITY AMONG SITES:
  -m ...+I             A proportion of invariable sites
  -m ...+G[n]          Discrete Gamma model with n categories (default n=4)
  -m ...*G[n]          Discrete Gamma model with unlinked model parameters
  -m ...+I+G[n]        Invariable sites plus Gamma model with n categories
  -m ...+R[n]          FreeRate model with n categories (default n=4)
  -m ...*R[n]          FreeRate model with unlinked model parameters
  -m ...+I+R[n]        Invariable sites plus FreeRate model with n categories
  -m ...+Hn            Heterotachy model with n classes
  -m ...*Hn            Heterotachy model with n classes and unlinked parameters
  --alpha-min NUM      Min Gamma shape parameter for site rates (default: 0.02)
  --gamma-median       Median approximation for +G site rates (default: mean)
  --rate               Write empirical Bayesian site rates to .rate file
  --mlrate             Write maximum likelihood site rates to .mlrate file

POLYMORPHISM AWARE MODELS (PoMo):
  -s FILE              Input counts file (see manual)
  -m ...+P             DNA substitution model (see above) used with PoMo
  -m ...+N<POPSIZE>    Virtual population size (default: 9)
  -m ...+WB|WH|S]      Weighted binomial sampling
  -m ...+WH            Weighted hypergeometric sampling
  -m ...+S             Sampled sampling
  -m ...+G[n]          Discrete Gamma rate with n categories (default n=4)

COMPLEX MODELS:
  -m "MIX{m1,...,mK}"  Mixture model with K components
  -m "FMIX{f1,...fK}"  Frequency mixture model with K components
  --mix-opt            Optimize mixture weights (default: detect)
  -m ...+ASC           Ascertainment bias correction
  --tree-freq FILE     Input tree to infer site frequency model
  --site-freq FILE     Input site frequency model file
  --freq-max           Posterior maximum instead of mean approximation

TREE TOPOLOGY TEST:
  --trees FILE         Set of trees to evaluate log-likelihoods
  --test NUM           Replicates for topology test
  --test-weight        Perform weighted KH and SH tests
  --test-au            Approximately unbiased (AU) test (Shimodaira 2002)
  --sitelh             Write site log-likelihoods to .sitelh file

ANCESTRAL STATE RECONSTRUCTION:
  --ancestral          Ancestral state reconstruction by empirical Bayes
  --asr-min NUM        Min probability of ancestral state (default: equil freq)

TEST OF SYMMETRY:
  --symtest               Perform three tests of symmetry
  --symtest-only          Do --symtest then exist
  --symtest-remove-bad    Do --symtest and remove bad partitions
  --symtest-remove-good   Do --symtest and remove good partitions
  --symtest-type MAR|INT  Use MARginal/INTernal test when removing partitions
  --symtest-pval NUMER    P-value cutoff (default: 0.05)
  --symtest-keep-zero     Keep NAs in the tests

CONCORDANCE FACTOR ANALYSIS:
  -t FILE              Reference tree to assign concordance factor
  --gcf FILE           Set of source trees for gene concordance factor (gCF)
  --df-tree            Write discordant trees associated with gDF1
  --scf NUM            Number of quartets for site concordance factor (sCF)
  -s FILE              Sequence alignment for --scf
  -p FILE|DIR          Partition file or directory for --scf
  --cf-verbose         Write CF per tree/locus to cf.stat_tree/_loci
  --cf-quartet         Write sCF for all resampled quartets to .cf.quartet

TIME TREE RECONSTRUCTION:
  --date FILE          File containing dates of tips or ancestral nodes
  --date TAXNAME       Extract dates from taxon names after last '|'
  --date-tip STRING    Tip dates as a real number or YYYY-MM-DD
  --date-root STRING   Root date as a real number or YYYY-MM-DD
  --date-ci NUM        Number of replicates to compute confidence interval
  --clock-sd NUM       Std-dev for lognormal relaxed clock (default: 0.2)
  --date-no-outgroup   Exclude outgroup from time tree
  --date-outlier NUM   Z-score cutoff to remove outlier tips/nodes (e.g. 3)
  --date-options ".."  Extra options passing directly to LSD2
  --dating STRING      Dating method: LSD for least square dating (default)


CONSENSUS RECONSTRUCTION:
  -t FILE              Set of input trees for consensus reconstruction
  --sup-min NUM        Min split support, 0.5 for majority-rule consensus
                       (default: 0, extended consensus)
  --burnin NUM         Burnin number of trees to ignore
  --con-tree           Compute consensus tree to .contree file
  --con-net            Computing consensus network to .nex file
  --support FILE       Assign support values into this tree from -t trees
  --suptag STRING      Node name (or ALL) to assign tree IDs where node occurs

TREE DISTANCE BY ROBINSON-FOULDS (RF) METRIC:
  --tree-dist-all      Compute all-to-all RF distances for -t trees
  --tree-dist FILE     Compute RF distances between -t trees and this set
  --tree-dist2 FILE    Like -rf but trees can have unequal taxon sets

GENERATING RANDOM TREES:
  -r NUM               No. taxa for Yule-Harding random tree
  --rand UNI|CAT|BAL   UNIform | CATerpillar | BALanced random tree
  --rlen NUM NUM NUM   min, mean, and max random branch lengths

MISCELLANEOUS:
  --keep-ident         Keep identical sequences (default: remove & finally add)
  -blfix               Fix branch lengths of user tree passed via -te
  -blscale             Scale branch lengths of user tree passed via -t
  -blmin               Min branch length for optimization (default 0.000001)
  -blmax               Max branch length for optimization (default 100)
  -wslr                Write site log-likelihoods per rate category
  -wslm                Write site log-likelihoods per mixture class
  -wslmr               Write site log-likelihoods per mixture+rate class
  -wspr                Write site probabilities per rate category
  -wspm                Write site probabilities per mixture class
  -wspmr               Write site probabilities per mixture+rate class
  --partlh             Write partition log-likelihoods to .partlh file
  --no-outfiles        Suppress printing output files
  --eigenlib           Use Eigen3 library
  -alninfo             Print alignment sites statistics to .alninfo

Generally, you’ll want to align your proteins using MAFFT before moving on to tree generation using this program

Using -B 1000 and -alrt 1000 will provide two measures of support values; using respective cutoffs of >95% and >80% will give you the best supported branches.

When doing a partitioned analysis, use the -p over -q or -Q options.

For model search, use -m MFP and -msub nuclear for the vast majority of proteins is acceptable. If you are using viral or mitochondrial proteins, then there are other options for you.

NEW IN 2.1.3: Rooted phylogenies

software ref: http://www.iqtree.org/
research ref: https://doi.org/10.1093/molbev/msaa015
research ref: Depending on your options you’ll need to cite other papers