# iq-tree2 2.2.0 {{< admonition success "Installed" true >}} This software should be available with no extra configuration. {{< /admonition >}} ## iqtree2-2.2.0 Thanks to the recent advent of next-generation sequencing techniques, the amount of phylogenomic/transcriptomic data have been rapidly accumulated. This extremely facilitates resolving many "deep phylogenetic" questions in the tree of life. At the same time it poses major computational challenges to analyze such big data, where most phylogenetic software cannot handle. Moreover, there is a need to develop more complex probabilistic models to adequately capture realistic aspects of genomic sequence evolution. This trends motivated us to develop the IQ-TREE software with a strong emphasis on phylogenomic inference. Our goals are: * __Accuracy__: Proposing novel computational methods that perform better than existing approaches. * __Speed__: Allowing fast analysis on big data sets and utilizing high performance computing platforms. * __Flexibility__: Facilitating the inclusion of new (phylogenomic) models and sequence data types. * __Versatility__: Implementing a broad range of commonly-used maximum likelihood analyses. IQ-TREE has been developed since 2011 and freely available at as open-source software under the [GNU-GPL license version 2](http://www.gnu.org/licenses/licenses.en.html). It is actively maintained by the core development team (see below) and a number of collabrators. The name IQ-TREE comes from the fact that it is the successor of [**IQ**PNNI](http://www.cibiv.at/software/iqpnni/) and [**TREE**-PUZZLE](http://www.tree-puzzle.de/) software. ### Key features * __Efficient search algorithm__: Fast and effective stochastic algorithm to reconstruct phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihood while requiring similar amount of computing time ([Nguyen et al., 2015]). * __Ultrafast bootstrap__: An ultrafast bootstrap approximation (UFBoot) to assess branch supports. UFBoot is 10 to 40 times faster than RAxML rapid bootstrap and obtains less biased support values ([Minh et al., 2013]; [Hoang et al., 2018]). * __Ultrafast model selection__: An ultrafast and automatic model selection (ModelFinder) which is 10 to 100 times faster than jModelTest and ProtTest. ModelFinder also finds best-fit partitioning scheme like PartitionFinder. * __Big Data Analysis__: Supporting huge datasets with thousands of sequences or millions of alignment sites via checkpointing, safe numerical and low memory mode. Multicore CPUs and parallel MPI system are utilized to speedup analysis. * __Phylogenetic testing__: Several fast branch tests like SH-aLRT and aBayes test ([Anisimova et al., 2011]) and tree topology tests like the approximately unbiased (AU) test ([Shimodaira, 2002]). ------------------------------------------------------------------------------- ## Location and version ```console $ which iqtree2 /local/cluster/bin/iqtree2 $ iqtree2 --version IQ-TREE multicore version 2.2.0 COVID-edition for Linux 64-bit built Jun 1 2022 Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan. ``` ## help message ```console $ iqtree2 --help IQ-TREE multicore version 2.2.0 COVID-edition for Linux 64-bit built Jun 1 2022 Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan. Usage: iqtree [-s ALIGNMENT] [-p PARTITION] [-m MODEL] [-t TREE] ... GENERAL OPTIONS: -h, --help Print (more) help usages -s FILE[,...,FILE] PHYLIP/FASTA/NEXUS/CLUSTAL/MSF alignment file(s) -s DIR Directory of alignment files --seqtype STRING BIN, DNA, AA, NT2AA, CODON, MORPH (default: auto-detect) -t FILE|PARS|RAND Starting tree (default: 99 parsimony and BIONJ) -o TAX[,...,TAX] Outgroup taxon (list) for writing .treefile --prefix STRING Prefix for all output files (default: aln/partition) --seed NUM Random seed number, normally used for debugging purpose --safe Safe likelihood kernel to avoid numerical underflow --mem NUM[G|M|%] Maximal RAM usage in GB | MB | % --runs NUM Number of indepedent runs (default: 1) -v, --verbose Verbose mode, printing more messages to screen -V, --version Display version number --quiet Quiet mode, suppress printing to screen (stdout) -fconst f1,...,fN Add constant patterns into alignment (N=no. states) --epsilon NUM Likelihood epsilon for parameter estimate (default 0.01) -T NUM|AUTO No. cores/threads or AUTO-detect (default: 1) --threads-max NUM Max number of threads for -T AUTO (default: all cores) --export-alisim-cmd Export a command-line from the inferred tree and model params to simulate new MSAs with AliSim CHECKPOINT: --redo Redo both ModelFinder and tree search --redo-tree Restore ModelFinder and only redo tree search --undo Revoke finished run, used when changing some options --cptime NUM Minimum checkpoint interval (default: 60 sec and adapt) PARTITION MODEL: -p FILE|DIR NEXUS/RAxML partition file or directory with alignments Edge-linked proportional partition model -q FILE|DIR Like -p but edge-linked equal partition model -Q FILE|DIR Like -p but edge-unlinked partition model -S FILE|DIR Like -p but separate tree inference --subsample NUM Randomly sub-sample partitions (negative for complement) --subsample-seed NUM Random number seed for --subsample LIKELIHOOD/QUARTET MAPPING: --lmap NUM Number of quartets for likelihood mapping analysis --lmclust FILE NEXUS file containing clusters for likelihood mapping --quartetlh Print quartet log-likelihoods to .quartetlh file TREE SEARCH ALGORITHM: --ninit NUM Number of initial parsimony trees (default: 100) --ntop NUM Number of top initial trees (default: 20) --nbest NUM Number of best trees retained during search (defaut: 5) -n NUM Fix number of iterations to stop (default: OFF) --nstop NUM Number of unsuccessful iterations to stop (default: 100) --perturb NUM Perturbation strength for randomized NNI (default: 0.5) --radius NUM Radius for parsimony SPR search (default: 6) --allnni Perform more thorough NNI search (default: OFF) -g FILE (Multifurcating) topological constraint tree file --fast Fast search to resemble FastTree --polytomy Collapse near-zero branches into polytomy --tree-fix Fix -t tree (no tree search performed) --treels Write locally optimal trees into .treels file --show-lh Compute tree likelihood without optimisation --terrace Check if the tree lies on a phylogenetic terrace ULTRAFAST BOOTSTRAP/JACKKNIFE: -B, --ufboot NUM Replicates for ultrafast bootstrap (>=1000) -J, --ufjack NUM Replicates for ultrafast jackknife (>=1000) --jack-prop NUM Subsampling proportion for jackknife (default: 0.5) --sampling STRING GENE|GENESITE resampling for partitions (default: SITE) --boot-trees Write bootstrap trees to .ufboot file (default: none) --wbtl Like --boot-trees but also writing branch lengths --nmax NUM Maximum number of iterations (default: 1000) --nstep NUM Iterations for UFBoot stopping rule (default: 100) --bcor NUM Minimum correlation coefficient (default: 0.99) --beps NUM RELL epsilon to break tie (default: 0.5) --bnni Optimize UFBoot trees by NNI on bootstrap alignment NON-PARAMETRIC BOOTSTRAP/JACKKNIFE: -b, --boot NUM Replicates for bootstrap + ML tree + consensus tree -j, --jack NUM Replicates for jackknife + ML tree + consensus tree --jack-prop NUM Subsampling proportion for jackknife (default: 0.5) --bcon NUM Replicates for bootstrap + consensus tree --bonly NUM Replicates for bootstrap only --tbe Transfer bootstrap expectation SINGLE BRANCH TEST: --alrt NUM Replicates for SH approximate likelihood ratio test --alrt 0 Parametric aLRT test (Anisimova and Gascuel 2006) --abayes approximate Bayes test (Anisimova et al. 2011) --lbp NUM Replicates for fast local bootstrap probabilities MODEL-FINDER: -m TESTONLY Standard model selection (like jModelTest, ProtTest) -m TEST Standard model selection followed by tree inference -m MF Extended model selection with FreeRate heterogeneity -m MFP Extended model selection followed by tree inference -m ...+LM Additionally test Lie Markov models -m ...+LMRY Additionally test Lie Markov models with RY symmetry -m ...+LMWS Additionally test Lie Markov models with WS symmetry -m ...+LMMK Additionally test Lie Markov models with MK symmetry -m ...+LMSS Additionally test strand-symmetric models --mset STRING Restrict search to models supported by other programs (raxml, phyml, mrbayes, beast1 or beast2) --mset STR,... Comma-separated model list (e.g. -mset WAG,LG,JTT) --msub STRING Amino-acid model source (nuclear, mitochondrial, chloroplast or viral) --mfreq STR,... List of state frequencies --mrate STR,... List of rate heterogeneity among sites (e.g. -mrate E,I,G,I+G,R is used for -m MF) --cmin NUM Min categories for FreeRate model [+R] (default: 2) --cmax NUM Max categories for FreeRate model [+R] (default: 10) --merit AIC|AICc|BIC Akaike|Bayesian information criterion (default: BIC) --mtree Perform full tree search for every model --madd STR,... List of mixture models to consider --mdef FILE Model definition NEXUS file (see Manual) --modelomatic Find best codon/protein/DNA models (Whelan et al. 2015) PARTITION-FINDER: --merge Merge partitions to increase model fit --merge greedy|rcluster|rclusterf Set merging algorithm (default: rclusterf) --merge-model 1|all Use only 1 or all models for merging (default: 1) --merge-model STR,... Comma-separated model list for merging --merge-rate 1|all Use only 1 or all rate heterogeneity (default: 1) --merge-rate STR,... Comma-separated rate list for merging --rcluster NUM Percentage of partition pairs for rcluster algorithm --rclusterf NUM Percentage of partition pairs for rclusterf algorithm --rcluster-max NUM Max number of partition pairs (default: 10*partitions) SUBSTITUTION MODEL: -m STRING Model name string (e.g. GTR+F+I+G) DNA: HKY (default), JC, F81, K2P, K3P, K81uf, TN/TrN, TNef, TIM, TIMef, TVM, TVMef, SYM, GTR, or 6-digit model specification (e.g., 010010 = HKY) Protein: LG (default), Poisson, cpREV, mtREV, Dayhoff, mtMAM, JTT, WAG, mtART, mtZOA, VT, rtREV, DCMut, PMB, HIVb, HIVw, JTTDCMut, FLU, Blosum62, GTR20, mtMet, mtVer, mtInv, FLAVI, Q.LG, Q.pfam, Q.pfam_gb, Q.bird, Q.mammal, Q.insect, Q.plant, Q.yeast Protein mixture: C10,...,C60, EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X Binary: JC2 (default), GTR2 Empirical codon: KOSI07, SCHN05 Mechanistic codon: GY (default), MG, MGK, GY0K, GY1KTS, GY1KTV, GY2K, MG1KTS, MG1KTV, MG2K Semi-empirical codon: XX_YY where XX is empirical and YY is mechanistic model Morphology/SNP: MK (default), ORDERED, GTR Lie Markov DNA: 1.1, 2.2b, 3.3a, 3.3b, 3.3c, 3.4, 4.4a, 4.4b, 4.5a, 4.5b, 5.6a, 5.6b, 5.7a, 5.7b, 5.7c, 5.11a, 5.11b, 5.11c, 5.16, 6.6, 6.7a, 6.7b, 6.8a, 6.8b, 6.17a, 6.17b, 8.8, 8.10a, 8.10b, 8.16, 8.17, 8.18, 9.20a, 9.20b, 10.12, 10.34, 12.12 (optionally prefixed by RY, WS or MK) Non-reversible: STRSYM (strand symmetric model, equiv. WS6.6), NONREV, UNREST (unrestricted model, equiv. 12.12) Otherwise: Name of file containing user-model parameters STATE FREQUENCY: -m ...+F Empirically counted frequencies from alignment -m ...+FO Optimized frequencies by maximum-likelihood -m ...+FQ Equal frequencies -m ...+FRY For DNA, freq(A+G)=1/2=freq(C+T) -m ...+FWS For DNA, freq(A+T)=1/2=freq(C+G) -m ...+FMK For DNA, freq(A+C)=1/2=freq(G+T) -m ...+Fabcd 4-digit constraint on ACGT frequency (e.g. +F1221 means f_A=f_T, f_C=f_G) -m ...+FU Amino-acid frequencies given protein matrix -m ...+F1x4 Equal NT frequencies over three codon positions -m ...+F3x4 Unequal NT frequencies over three codon positions RATE HETEROGENEITY AMONG SITES: -m ...+I A proportion of invariable sites -m ...+G[n] Discrete Gamma model with n categories (default n=4) -m ...*G[n] Discrete Gamma model with unlinked model parameters -m ...+I+G[n] Invariable sites plus Gamma model with n categories -m ...+R[n] FreeRate model with n categories (default n=4) -m ...*R[n] FreeRate model with unlinked model parameters -m ...+I+R[n] Invariable sites plus FreeRate model with n categories -m ...+Hn Heterotachy model with n classes -m ...*Hn Heterotachy model with n classes and unlinked parameters --alpha-min NUM Min Gamma shape parameter for site rates (default: 0.02) --gamma-median Median approximation for +G site rates (default: mean) --rate Write empirical Bayesian site rates to .rate file --mlrate Write maximum likelihood site rates to .mlrate file POLYMORPHISM AWARE MODELS (PoMo): -s FILE Input counts file (see manual) -m ...+P DNA substitution model (see above) used with PoMo -m ...+N Virtual population size (default: 9) -m ...+WB|WH|S] Weighted binomial sampling -m ...+WH Weighted hypergeometric sampling -m ...+S Sampled sampling -m ...+G[n] Discrete Gamma rate with n categories (default n=4) COMPLEX MODELS: -m "MIX{m1,...,mK}" Mixture model with K components -m "FMIX{f1,...fK}" Frequency mixture model with K components --mix-opt Optimize mixture weights (default: detect) -m ...+ASC Ascertainment bias correction --tree-freq FILE Input tree to infer site frequency model --site-freq FILE Input site frequency model file --freq-max Posterior maximum instead of mean approximation TREE TOPOLOGY TEST: --trees FILE Set of trees to evaluate log-likelihoods --test NUM Replicates for topology test --test-weight Perform weighted KH and SH tests --test-au Approximately unbiased (AU) test (Shimodaira 2002) --sitelh Write site log-likelihoods to .sitelh file ANCESTRAL STATE RECONSTRUCTION: --ancestral Ancestral state reconstruction by empirical Bayes --asr-min NUM Min probability of ancestral state (default: equil freq) TEST OF SYMMETRY: --symtest Perform three tests of symmetry --symtest-only Do --symtest then exist --symtest-remove-bad Do --symtest and remove bad partitions --symtest-remove-good Do --symtest and remove good partitions --symtest-type MAR|INT Use MARginal/INTernal test when removing partitions --symtest-pval NUMER P-value cutoff (default: 0.05) --symtest-keep-zero Keep NAs in the tests CONCORDANCE FACTOR ANALYSIS: -t FILE Reference tree to assign concordance factor --gcf FILE Set of source trees for gene concordance factor (gCF) --df-tree Write discordant trees associated with gDF1 --scf NUM Number of quartets for site concordance factor (sCF) -s FILE Sequence alignment for --scf -p FILE|DIR Partition file or directory for --scf --cf-verbose Write CF per tree/locus to cf.stat_tree/_loci --cf-quartet Write sCF for all resampled quartets to .cf.quartet ALISIM: ALIGNMENT SIMULATOR Usage: iqtree --alisim [-m MODEL] [-t TREE] ... --alisim OUTPUT_ALIGNMENT Activate AliSim and specify the output alignment filename -t TREE_FILE Set the input tree file name --length LENGTH Set the length of the root sequence --num-alignments NUMBER Set the number of output datasets --seqtype STRING BIN, DNA, AA, CODON, MORPH{NUM_STATES} (default: auto-detect) For morphological data, 0, Set the insertion and deletion rate of the indel model, relative to the substitution rate --indel-size , Set the insertion and deletion size distributions --sub-level-mixture Enable the feature to simulate substitution-level mixture model --no-unaligned Disable outputing a file of unaligned sequences when using indel models --root-seq FILE,SEQ_NAME Specify the root sequence from an alignment -s FILE Specify the input sequence alignment --no-copy-gaps Disable copying gaps from input alignment (default: false) --site-freq