Hapo-G 1.3

Hapo-G - Haplotype-Aware Polishing of Genomes

Hapo-G (pronounced like apogee) is a tool that aims to improve the quality of genome assemblies by polishing the consensus with accurate reads.

Activating the conda environment

bash
source /local/cluster/hapog/activate.sh

Add the source command above to your shell scripts to use Hapo-G in SGE submissions.
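As a rough illustration, an SGE job script might look like the sketch below. The job name, the parallel environment name (thread), the slot count, and all input and output file names are assumptions chosen for the example; only the activate.sh path and the hapog.py options come from this page. If hapog.py is not on the PATH, the script only prints the command it would have run.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N hapog_polish
#$ -pe thread 8
# NOTE: the job name and the "thread" parallel environment are assumptions;
# parallel environment names are site-specific, so check your cluster's setup.

# Activate the Hapo-G conda environment (path from this page).
ACTIVATE=/local/cluster/hapog/activate.sh
if [ -f "$ACTIVATE" ]; then
    source "$ACTIVATE"
fi

# SGE sets NSLOTS to the number of granted slots; fall back to 8 elsewhere.
THREADS="${NSLOTS:-8}"

# Hypothetical input and output names; replace with your own files.
GENOME=assembly.fasta
PE1=reads_R1.fastq.gz
PE2=reads_R2.fastq.gz
OUTDIR=hapog_out

# Collect the hapog.py arguments as positional parameters.
set -- --genome "$GENOME" --pe1 "$PE1" --pe2 "$PE2" \
       --output "$OUTDIR" --threads "$THREADS"

if command -v hapog.py >/dev/null 2>&1; then
    hapog.py "$@"
else
    echo "hapog.py not found; would run: hapog.py $*"
fi
```

Saved under a name such as hapog_job.sh (hypothetical), the script would be submitted with qsub hapog_job.sh.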

Location:

$ which hapog.py
/local/cluster/hapog/bin/hapog.py

Help message:

$ hapog.py -h
usage: hapog [-h] --genome INPUT_GENOME [--pe1 PE1] [--pe2 PE2] [--single LONG_READS] [-b BAM_FILE] [-u]
             [--output OUTPUT_DIR] [--threads THREADS] [--bin HAPOG_BIN]

Hapo-G uses alignments produced by BWA (or any other aligner that produces SAM files) to polish the consensus of agenome assembly.

options:
  -h, --help            show this help message and exit

Mandatory arguments:
  --genome INPUT_GENOME, -g INPUT_GENOME
                        Input genome file to map reads to
  --pe1 PE1             Fastq.gz paired-end file (pair 1, can be given multiple times)
  --pe2 PE2             Fastq.gz paired-end file (pair 2, can be given multiple times)
  --single LONG_READS   Use long reads instead of short reads (can only be given one time, please concatenate all read files into one)

Optional arguments:
  -b BAM_FILE           Skip mapping step and provide a sorted bam file
  -u                    Include unpolished sequences in final output
  --output OUTPUT_DIR, -o OUTPUT_DIR
                        Output directory name
  --threads THREADS, -t THREADS
                        Number of threads (used in BWA, Samtools and Hapo-G)
  --bin HAPOG_BIN       Use a different Hapo-G binary (for debug purposes)
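
The options above combine into three typical modes: paired-end short reads, a single long-read file, or a pre-sorted BAM file. The sketch below is not part of the help output; it prints one example command per mode. The run helper, the thread count, and every file and directory name are hypothetical, and only the flags are taken from the help message. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# run: hypothetical helper that prints a command when DRY_RUN=1 (the default)
# and executes it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1) Paired-end short reads (--pe1/--pe2 may each be given multiple times).
run hapog.py -g assembly.fasta \
    --pe1 reads_R1.fastq.gz --pe2 reads_R2.fastq.gz \
    -o polished_short -t 16

# 2) Long reads: --single accepts one file only, so concatenate all
#    long-read files into one beforehand.
run hapog.py -g assembly.fasta --single all_long_reads.fastq \
    -o polished_long -t 16

# 3) Skip the mapping step with an existing sorted BAM file; -u also keeps
#    unpolished sequences in the final output.
run hapog.py -g assembly.fasta -b mapped.sorted.bam -u \
    -o polished_bam -t 16
```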

software ref: https://github.com/institut-de-genomique/HAPO-G
research ref: https://doi.org/10.1093/nargab/lqab034