smalt 0.7.6

2021-10-04

smalt

SMALT aligns DNA sequencing reads with a reference genome.

Reads from a wide range of sequencing platforms can be processed, for example Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger. Paired reads are supported. There is no support for SOLiD reads.

A mode for the detection of split (chimeric) reads is provided. Multi-threaded program execution is supported.

About SMALT

SMALT employs a hash index of short words, each up to 20 nucleotides long, sampled at equidistant steps along the reference genome. For each sequencing read, potentially matching segments of the reference genome are identified from seed matches in the index and then aligned with the read using dynamic programming.
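To make the sampling concrete, here is a minimal sketch that lists the words taken at equidistant steps from a toy sequence. It is an illustration only, not SMALT's implementation: the function name `sample_words` is hypothetical, and it assumes sampling starts at the first base and stops once a full word no longer fits.

```shell
# Hypothetical helper: list the words of length k that start at every
# s-th position of a sequence, one "<0-based position> <word>" per line.
# usage: sample_words <sequence> <k> <s>
sample_words() {
    sequence=$1; k=$2; s=$3
    printf '%s\n' "$sequence" | awk -v k="$k" -v s="$s" '{
        # awk strings are 1-based; stop once a full word no longer fits
        for (i = 1; i + k - 1 <= length($0); i += s)
            print i - 1, substr($0, i, k)
    }'
}

sample_words ACGTACGTAC 4 3
# prints:
# 0 ACGT
# 3 TACG
# 6 GTAC
```

In a real index each word would be stored together with its reference positions, so that words found in a read can be looked up as seeds.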

The best gapped alignments of each read are reported including a score for the reliability of the best mapping. The user can adjust the trade-off between sensitivity and speed by tuning the length and spacing of the hashed words.

Running SMALT

Mapping with SMALT involves two steps. First, a hash index is generated for the genomic reference sequences. Then the sequencing reads are mapped onto the reference using the index.

smalt index -k 14 -s 8 hs38_k14s8 GRCh38.fasta

builds a hash index for the human genome in the FASTA file GRCh38.fasta. Words 14 base pairs long are sampled at every 8th position in the genome. Two files, hs38_k14s8.smi and hs38_k14s8.sma, are written to disk.
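The effect of the word length (-k) and step (-s) can be estimated with simple arithmetic. The sketch below uses two hypothetical helper functions and assumes that sampling starts at the first base of a sequence and that every word of a read is looked up in the index. Under those assumptions a sequence of length n yields floor((n - k) / s) + 1 sampled words, and any exact match of at least k + s - 1 bases is certain to contain one of them.

```shell
# Hypothetical helper: number of words sampled from one sequence.
# usage: sampled_words <sequence_length> <k> <s>
sampled_words() {
    len=$1; k=$2; s=$3
    # A sequence shorter than the word length contributes no words
    if [ "$len" -lt "$k" ]; then echo 0; return; fi
    echo $(( (len - k) / s + 1 ))
}

# Hypothetical helper: shortest exact match that must cover a sampled word.
# usage: min_guaranteed_match <k> <s>
min_guaranteed_match() {
    echo $(( $1 + $2 - 1 ))
}

sampled_words 1000000 14 8     # prints 124999
min_guaranteed_match 14 8      # prints 21
```

A smaller step stores more words and shortens the guaranteed match length, which raises sensitivity at the cost of a larger index.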

smalt map -o mapped.sam hs38_k14s8 mates_1.fastq mates_2.fastq

loads the hash index created in the previous step into memory and maps the paired-end reads in the files mates_1.fastq and mates_2.fastq. The output is written to the file mapped.sam in SAM format.
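The second read file is optional, so the same command form covers single-end reads. The sketch below assembles such command lines without running them. The function name `smalt_map_cmd` and the file name reads.fastq are hypothetical, and the thread option -n is taken from the SMALT manual; it is worth confirming against the output of `smalt map -H` on this installation.

```shell
# Hypothetical helper: print (do not run) a `smalt map` command line.
# usage: smalt_map_cmd <threads> <index_name> <reads_1> [<reads_2>]
smalt_map_cmd() {
    threads=$1; index=$2
    shift 2
    # The remaining arguments are one file (single-end) or two (paired-end)
    echo "smalt map -n $threads -o mapped.sam $index $*"
}

smalt_map_cmd 4 hs38_k14s8 reads.fastq
smalt_map_cmd 4 hs38_k14s8 mates_1.fastq mates_2.fastq
```

Piping the printed line to `sh` would run it; printing first makes it easy to check the arguments before starting a long mapping job.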

Link to manual

Location and version:

$ which smalt
/local/cluster/smalt/bin/smalt
$ smalt version

              SMALT - Sequence Mapping and Alignment Tool
Version: 0.7.6
Date:    21-03-2014
Author:  Hannes Ponstingl (hp3@sanger.ac.uk)

Copyright (C) 2010 - 2014 Genome Research Ltd.

Help message:

$ smalt help

              SMALT - Sequence Mapping and Alignment Tool

SYNOPSIS:
    smalt <task> [TASK_OPTIONS] [<index_name> <file_name_A> [<file_name_B>]]

Available tasks:
    smalt check   - checks FASTA/FASTQ input
    smalt help    - prints a brief summary of this software
    smalt index   - builds an index of k-mer words for the reference
    smalt map     - maps single or paired reads onto the reference
    smalt sample  - sample insert sizes for paired reads
    smalt version - prints version information

Help on individual tasks:
    smalt <task> -H

DESCRIPTION:
  Smalt is a pairwise sequence alignment program designed for the mapping of
  DNA sequencing reads onto genomic reference sequences.
  Running the software involves two steps. First, an index of short words
  has to be built for the set of genomic reference sequences (issue
  'smalt index -H' for help). Then the sequencing reads are mapped onto the
  reference ('smalt map -H' for help).

software ref: https://www.sanger.ac.uk/tool/smalt-0/
software ref: http://sourceforge.net/projects/smalt/
research ref: