# MMseqs2 75af0

# MMseqs2: ultra fast and sensitive sequence search and clustering suite

MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, macOS, and (as a beta version, via Cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times its speed it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed.

## Documentation

The MMseqs2 user guide is available in our [GitHub Wiki](https://github.com/soedinglab/mmseqs2/wiki) or as a [PDF file](https://mmseqs.com/latest/userguide.pdf) (Thanks to [pandoc](https://github.com/jgm/pandoc)!). The wiki also contains [tutorials](https://github.com/soedinglab/MMseqs2/wiki/Tutorials) to learn how to use MMseqs2 with real data. For questions please open an issue on [GitHub](https://github.com/soedinglab/MMseqs2/issues) or ask in our [chat](https://chat.mmseqs.com). Keep posted about MMseqs2/Linclust updates by following Martin on [Twitter](https://twitter.com/thesteinegger).

Location and version:

```console
[Linux@chrom1 bin]$ which mmseqs
/local/cluster/bin/mmseqs
[Linux@chrom1 bin]$ mmseqs version
75af0c82edf34587548bacc865cfa1d2261a9696
```

Help message:

```console
$ mmseqs
MMseqs2 (Many against Many sequence searching) is an open-source software suite for very fast,
parallelized protein sequence searches and clustering of huge protein sequence data sets.

Please cite: M. Steinegger and J. Soding. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, doi:10.1038/nbt.3988 (2017).

MMseqs2 Version: 75af0c82edf34587548bacc865cfa1d2261a9696
© Martin Steinegger (martin.steinegger@snu.ac.kr)

usage: mmseqs <command> [<args>]

Easy workflows for plain text input/output
  easy-search       Sensitive homology search
  easy-cluster      Slower, sensitive clustering
  easy-linclust     Fast linear time cluster, less sensitive clustering
  easy-taxonomy     Taxonomic classification
  easy-rbh          Find reciprocal best hit

Main workflows for database input/output
  search            Sensitive homology search
  map               Map nearly identical sequences
  rbh               Reciprocal best hit search
  linclust          Fast, less sensitive clustering
  cluster           Slower, sensitive clustering
  clusterupdate     Update previous clustering with new sequences
  taxonomy          Taxonomic classification

Input database creation
  databases         List and download databases
  createdb          Convert FASTA/Q file(s) to a sequence DB
  createindex       Store precomputed index on disk to reduce search overhead
  convertmsa        Convert Stockholm/PFAM MSA file to a MSA DB
  msa2profile       Convert a MSA DB to a profile DB

Format conversion for downstream processing
  convertalis       Convert alignment DB to BLAST-tab, SAM or custom format
  createtsv         Convert result DB to tab-separated flat file
  convert2fasta     Convert sequence DB to FASTA format
  taxonomyreport    Create a taxonomy report in Kraken or Krona format

An extended list of all modules can be obtained by calling 'mmseqs -h'.

Bash completion for modules and parameters can be installed by adding "source MMSEQS_HOME/util/bash-completion.sh" to your "$HOME/.bash_profile".
```

## Publications

[Steinegger M and Soeding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, doi: 10.1038/nbt.3988 (2017)](https://www.nature.com/articles/nbt.3988).

[Steinegger M and Soeding J. Clustering huge protein sequence sets in linear time. Nature Communications, doi: 10.1038/s41467-018-04964-5 (2018)](https://www.nature.com/articles/s41467-018-04964-5).

[Mirdita M, Steinegger M and Soeding J. MMseqs2 desktop and local web server app for fast, interactive sequence searches. Bioinformatics, doi: 10.1093/bioinformatics/bty1057 (2019)](https://academic.oup.com/bioinformatics/article/35/16/2856/5280135).

[Mirdita M, Steinegger M, Breitwieser F, Soeding J, Levy Karin E. Fast and sensitive taxonomic assignment to metagenomic contigs. Bioinformatics, doi: 10.1093/bioinformatics/btab184 (2021)](https://doi.org/10.1093/bioinformatics/btab184).

research ref: See above
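
As a rough illustration of how the `easy-search` workflow listed in the help message is typically invoked, the sketch below wraps it in a small Bash helper. The function names (`easy_search_cmd`, `run_easy_search`) and the file names (`query.fasta`, `target.fasta`, `result.m8`, `tmp`) are hypothetical placeholders, not part of MMseqs2. The positional order (query, target, result file, temporary directory) follows the MMseqs2 user guide. The helper only runs the search when `mmseqs` is on the `PATH` and both input files exist; otherwise it prints the command it would have run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around `mmseqs easy-search`; all names are placeholders.

# Print the easy-search command line for four positional arguments:
# query FASTA, target FASTA, result file, temporary directory.
easy_search_cmd() {
    printf 'mmseqs easy-search %s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Run the search if mmseqs is available and both inputs exist;
# otherwise fall back to showing the command (a dry run).
run_easy_search() {
    local query="$1" target="$2" result="$3" tmpdir="$4"
    if command -v mmseqs >/dev/null 2>&1 && [ -f "$query" ] && [ -f "$target" ]; then
        mmseqs easy-search "$query" "$target" "$result" "$tmpdir"
    else
        easy_search_cmd "$query" "$target" "$result" "$tmpdir"
    fi
}

run_easy_search query.fasta target.fasta result.m8 tmp
```

On the cluster described above, the same call could be adapted by replacing the placeholder file names with real FASTA files and pointing the temporary directory at scratch space.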