Centrifuge - Classifier for metagenomic sequences
Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.7 GB for all complete bacterial and viral genomes plus the human genome) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes.
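To make the BWT/FM-index idea concrete, here is a minimal toy sketch of exact-match counting by backward search. It is not Centrifuge's actual implementation (its index is compressed and tuned for classification); the function names `bwt_from_text`, `build_fm_index` and `count_matches` are hypothetical, and the naive suffix sort only suits very small inputs.

```python
from collections import Counter


def bwt_from_text(text):
    """Build the BWT with a naive suffix sort (toy inputs only)."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character is the one preceding each sorted suffix.
    return "".join(text[i - 1] for i in suffix_array)


def build_fm_index(bwt):
    """Return (C, occ) tables used by backward search."""
    counts = Counter(bwt)
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ


def count_matches(pattern, C, occ, n):
    """Count exact occurrences of pattern, scanning it right to left."""
    lo, hi = 0, n  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


bwt = bwt_from_text("ACGTACGA")
C, occ = build_fm_index(bwt)
print(count_matches("ACG", C, occ, len(bwt)))
```

Each pattern character narrows the range of matching rows, so the lookup cost depends on the pattern length rather than on the size of the indexed text.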
Merqury - Evaluate genome assemblies with k-mers and more
Genome assembly projects often have Illumina whole-genome sequencing reads available for the assembled individual. The k-mer spectrum of this read set can be used to evaluate assembly quality independently, without the need for a high-quality reference. Merqury provides a set of tools for this purpose.
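As a rough illustration of the idea, the sketch below counts canonical k-mers in a read set, derives the k-mer spectrum (how many distinct k-mers occur at each multiplicity), and lists assembly k-mers that never appear in the reads. It does not reproduce Merqury's own code or command line; the function names and the tiny sequences are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequences, k):
    """Count canonical k-mers, skipping any window with non-ACGT characters."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def assembly_only_kmers(assembly_counts, read_counts):
    """K-mers present in the assembly but absent from the reads (likely errors)."""
    return sorted(k for k in assembly_counts if k not in read_counts)


# Hypothetical toy data, far smaller than any real read set.
reads = ["ACGTAC", "ACGTAC"]
assembly = ["ACGTAG"]
read_counts = count_kmers(reads, k=3)
print(spectrum(read_counts))
print(assembly_only_kmers(count_kmers(assembly, k=3), read_counts))
```

K-mers found only in the assembly are candidates for assembly errors, which is why no reference genome is needed for this kind of check.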
Location and version:
$ which merqury.sh
/local/cluster/bin/merqury.sh
help message:
MITObim - mitochondrial baiting and iterative mapping
The pipeline was originally developed for Illumina data, but thanks to the versatility of the MIRA assembler, MITObim in principle also supports data from the Ion Torrent, 454 and PacBio sequencing platforms.
Location and version:
$ which MITObim.pl
/local/cluster/bin/MITObim.pl
$ MITObim.pl --version
MITObim - mitochondrial baiting and iterative mapping
version 1.9.1
help message:
$ MITObim.
Miniasm
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically produced by minimap) as input and outputs an assembly graph in the GFA format. Unlike mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences, so the per-base error rate is similar to that of the raw input reads.
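Because the output is a GFA assembly graph rather than FASTA, a common follow-up step is to pull the unitig sequences out of the segment (`S`) records. The sketch below assumes tab-separated GFA lines in which the second field of an `S` record is the segment name and the third is its sequence; the helper names and the example records are hypothetical.

```python
def gfa_segments(lines):
    """Yield (name, sequence) for each GFA segment record that stores a sequence."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # "S" records describe segments; "*" means the sequence is not stored.
        if fields[0] == "S" and len(fields) >= 3 and fields[2] != "*":
            yield fields[1], fields[2]


def to_fasta(segments, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped sequence lines."""
    out = []
    for name, seq in segments:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


# Hypothetical GFA content; a real file would be read with open(path).
example = [
    "H\tVN:Z:1.0\n",
    "S\tutg1\tACGTACGT\n",
    "L\tutg1\t+\tutg2\t-\t0M\n",
    "S\tutg2\t*\n",
]
print(to_fasta(gfa_segments(example), width=4), end="")
```

Since there is no consensus step, sequences extracted this way carry roughly the error rate of the raw reads and usually need polishing before downstream use.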
Location and version:
NextDenovo
NextDenovo is a string graph-based de novo assembler for long reads (CLR, HiFi and ONT). It uses a “correct-then-assemble” strategy similar to Canu (with no correction step for PacBio HiFi reads), but requires significantly fewer computing resources and less storage. After assembly, the per-base accuracy is about 98-99.8%; to further improve single-base accuracy, use NextPolish.
We benchmarked NextDenovo against other assemblers using Oxford Nanopore long reads from human and Drosophila melanogaster, and PacBio continuous long reads (CLR) from Arabidopsis thaliana.
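For readers more used to Phred-scaled quality values, the per-base accuracy range quoted above can be converted with the standard relation QV = -10 log10(1 - accuracy). The helper below is a small illustrative sketch, not part of NextDenovo; its name is hypothetical.

```python
import math


def accuracy_to_qv(accuracy):
    """Convert a per-base accuracy in [0, 1] to a Phred-scaled quality value."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    error = 1.0 - accuracy
    if error == 0.0:
        return float("inf")  # a perfect sequence has no finite QV
    return -10.0 * math.log10(error)


for acc in (0.98, 0.998):
    print(f"{acc:.3f} -> QV {accuracy_to_qv(acc):.1f}")
```

On this scale the quoted range corresponds to roughly QV 17 to QV 27, which is why a polishing step is recommended after assembly.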
SequenceBouncer: A method to remove outlier entries from a multiple sequence alignment
Location and version:
$ which SequenceBouncer.py
/local/cluster/bin/SequenceBouncer.py
$ SequenceBouncer.py --help
SequenceBouncer: A method to remove outlier entries from a multiple sequence alignment
Cory Dunn
University of Helsinki
cory.dunn@helsinki.fi
Version: 1.19
Please cite DOI: 10.1101/2020.11.24.395459
help message:
$ SequenceBouncer.