# MMseqs2 75af0


# MMseqs2: ultra fast and sensitive sequence search and clustering suite
MMseqs2 (Many-against-Many sequence searching) is a software suite to search
and cluster huge protein and nucleotide sequence sets. MMseqs2 is open-source,
GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta
version, via Cygwin) Windows. The software is designed to run on multiple
cores and servers and exhibits very good scalability. MMseqs2 can run 10,000
times faster than BLAST. At 100 times the speed of BLAST it achieves almost
the same sensitivity. It can perform profile searches with the same
sensitivity as PSI-BLAST at over 400 times its speed.

## Documentation
The MMseqs2 user guide is available in our [GitHub
Wiki](https://github.com/soedinglab/mmseqs2/wiki) or as a [PDF
file](https://mmseqs.com/latest/userguide.pdf) (thanks to
[pandoc](https://github.com/jgm/pandoc)!). The wiki also contains
[tutorials](https://github.com/soedinglab/MMseqs2/wiki/Tutorials) that show
how to use MMseqs2 with real data. For questions, please open an issue on
[GitHub](https://github.com/soedinglab/MMseqs2/issues) or ask in our
[chat](https://chat.mmseqs.com).

Stay up to date on MMseqs2/Linclust by following Martin on
[Twitter](https://twitter.com/thesteinegger).

Location and version:

```console
[Linux@chrom1 bin]$ which mmseqs
/local/cluster/bin/mmseqs
[Linux@chrom1 bin]$ mmseqs version
75af0c82edf34587548bacc865cfa1d2261a9696
```
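
The same check can be scripted, for example at the top of a job script. The snippet below is a minimal sketch that only assumes `mmseqs` may or may not be on your `PATH`; the variable names `mmseqs_path` and `mmseqs_status` are illustrative and not part of MMseqs2.

```shell
#!/usr/bin/env bash
# Report where mmseqs is installed and which build it is, if it is on PATH.
if mmseqs_path="$(command -v mmseqs)"; then
  mmseqs_status="found"
  echo "mmseqs found at: ${mmseqs_path}"
  # 'mmseqs version' prints the commit hash of the installed build.
  echo "version: $(mmseqs version)"
else
  mmseqs_status="missing"
  echo "mmseqs not found on PATH" >&2
fi
```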

Help message:

```console
$ mmseqs
MMseqs2 (Many against Many sequence searching) is an open-source software suite for very fast,
parallelized protein sequence searches and clustering of huge protein sequence data sets.

Please cite: M. Steinegger and J. Soding. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, doi:10.1038/nbt.3988 (2017).

MMseqs2 Version: 75af0c82edf34587548bacc865cfa1d2261a9696
© Martin Steinegger (martin.steinegger@snu.ac.kr)

usage: mmseqs <command> [<args>]

Easy workflows for plain text input/output
  easy-search       	Sensitive homology search
  easy-cluster      	Slower, sensitive clustering
  easy-linclust     	Fast linear time cluster, less sensitive clustering
  easy-taxonomy     	Taxonomic classification
  easy-rbh          	Find reciprocal best hit

Main workflows for database input/output
  search            	Sensitive homology search
  map               	Map nearly identical sequences
  rbh               	Reciprocal best hit search
  linclust          	Fast, less sensitive clustering
  cluster           	Slower, sensitive clustering
  clusterupdate     	Update previous clustering with new sequences
  taxonomy          	Taxonomic classification

Input database creation
  databases         	List and download databases
  createdb          	Convert FASTA/Q file(s) to a sequence DB
  createindex       	Store precomputed index on disk to reduce search overhead
  convertmsa        	Convert Stockholm/PFAM MSA file to a MSA DB
  msa2profile       	Convert a MSA DB to a profile DB

Format conversion for downstream processing
  convertalis       	Convert alignment DB to BLAST-tab, SAM or custom format
  createtsv         	Convert result DB to tab-separated flat file
  convert2fasta     	Convert sequence DB to FASTA format
  taxonomyreport    	Create a taxonomy report in Kraken or Krona format

An extended list of all modules can be obtained by calling 'mmseqs -h'.

Bash completion for modules and parameters can be installed by adding "source MMSEQS_HOME/util/bash-completion.sh" to your "$HOME/.bash_profile".
```
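
To illustrate the difference between the "easy" workflows and the "main" database workflows listed in the help message, the following is a minimal sketch of a search done both ways. The file names (`query.fasta`, `target.fasta`, `result.m8`, `result_db.m8`), the database names (`queryDB`, `targetDB`, `resultDB`), the `tmp` directory, and the `run` helper with its `DRY_RUN` switch are all placeholders chosen for this example. If `mmseqs` or the input files are missing, the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical input files and scratch directory; replace with your own.
QUERY="query.fasta"
TARGET="target.fasta"
TMP="tmp"

# Execute for real only when mmseqs and both inputs are available.
DRY_RUN=1
if command -v mmseqs >/dev/null 2>&1 && [ -f "$QUERY" ] && [ -f "$TARGET" ]; then
  DRY_RUN=0
fi

# Illustrative helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "dry run: $*"
  else
    "$@"
  fi
}

# Easy workflow: FASTA in, tab-separated hits out, in one command.
run mmseqs easy-search "$QUERY" "$TARGET" result.m8 "$TMP"

# Main workflow: build sequence DBs, search, then convert the result DB
# to a BLAST-tab style text file.
run mmseqs createdb "$QUERY" queryDB
run mmseqs createdb "$TARGET" targetDB
run mmseqs search queryDB targetDB resultDB "$TMP"
run mmseqs convertalis queryDB targetDB resultDB result_db.m8
```

The database workflow takes more steps, but the sequence databases can be reused across searches, which is where `createindex` from the list above can help reduce overhead.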

## Publications

[Steinegger M and Soeding J. MMseqs2 enables sensitive protein sequence
searching for the analysis of massive data sets. Nature Biotechnology, doi:
10.1038/nbt.3988 (2017)](https://www.nature.com/articles/nbt.3988).

[Steinegger M and Soeding J. Clustering huge protein sequence sets in linear
time. Nature Communications, doi: 10.1038/s41467-018-04964-5
(2018)](https://www.nature.com/articles/s41467-018-04964-5).

[Mirdita M, Steinegger M and Soeding J. MMseqs2 desktop and local web server
app for fast, interactive sequence searches. Bioinformatics, doi:
10.1093/bioinformatics/bty1057
(2019)](https://academic.oup.com/bioinformatics/article/35/16/2856/5280135).

[Mirdita M, Steinegger M, Breitwieser F, Soeding J and Levy Karin E. Fast and
sensitive taxonomic assignment to metagenomic contigs. Bioinformatics, doi:
10.1093/bioinformatics/btab184
(2021)](https://doi.org/10.1093/bioinformatics/btab184).

software ref: <https://github.com/soedinglab/mmseqs2>  
software ref: <https://github.com/soedinglab/MMseqs2/wiki>  
research ref: See above