Hapo-G - Haplotype-Aware Polishing of Genomes
Hapo-G (pronounced like apogee) is a tool that aims to improve the quality of
genome assemblies by polishing the consensus with accurate reads.
Activating the conda environment
    bash
    source /local/cluster/hapog/activate.sh
Add the above source command to your shell scripts to use this program in SGE submissions.
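As a rough illustration, an SGE job script could source the activation script and then confirm that the wrapper is reachable before doing any real work. This is only a sketch: the job name is hypothetical, no parallel environment is requested because those names are site-specific, and the existence check around the activation script is an addition so the script reports a clear message when run on a host without the install.

```shell
#!/bin/bash
# Sketch of an SGE job script (the job name below is hypothetical).
#$ -N hapog_polish
#$ -cwd
#$ -S /bin/bash

# Activation script path as given on this page. The existence check is an
# addition so the sketch reports a clear message on other machines.
ACTIVATE=/local/cluster/hapog/activate.sh
if [ -f "$ACTIVATE" ]; then
    source "$ACTIVATE"
    ENV_STATUS="activated"
else
    ENV_STATUS="missing"
    echo "warning: $ACTIVATE not found; is this the right host?" >&2
fi

# SGE sets NSLOTS when a parallel environment is requested; fall back to 1.
THREADS="${NSLOTS:-1}"

# Confirm the wrapper is on PATH before starting real work.
if command -v hapog.py >/dev/null 2>&1; then
    echo "hapog.py found at $(command -v hapog.py), using $THREADS thread(s)"
else
    echo "hapog.py not on PATH (environment: $ENV_STATUS)" >&2
fi
```

The actual hapog.py call would follow the PATH check, passing `$THREADS` to `--threads`.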
Location:
    $ which hapog.py
    /local/cluster/hapog/bin/hapog.py
Help message:
    $ hapog.py -h
    usage: hapog [-h] --genome INPUT_GENOME [--pe1 PE1] [--pe2 PE2] [--single LONG_READS] [-b BAM_FILE] [-u]
                 [--output OUTPUT_DIR] [--threads THREADS] [--bin HAPOG_BIN]

    Hapo-G uses alignments produced by BWA (or any other aligner that produces SAM files) to polish the consensus of agenome assembly.

    options:
      -h, --help            show this help message and exit

    Mandatory arguments:
      --genome INPUT_GENOME, -g INPUT_GENOME
                            Input genome file to map reads to
      --pe1 PE1             Fastq.gz paired-end file (pair 1, can be given multiple times)
      --pe2 PE2             Fastq.gz paired-end file (pair 2, can be given multiple times)
      --single LONG_READS   Use long reads instead of short reads (can only be given one time, please concatenate all read files into one)

    Optional arguments:
      -b BAM_FILE           Skip mapping step and provide a sorted bam file
      -u                    Include unpolished sequences in final output
      --output OUTPUT_DIR, -o OUTPUT_DIR
                            Output directory name
      --threads THREADS, -t THREADS
                            Number of threads (used in BWA, Samtools and Hapo-G)
      --bin HAPOG_BIN       Use a different Hapo-G binary (for debug purposes)
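To make the options above more concrete, the following sketch assembles three typical invocations: short paired-end reads, long reads, and a pre-computed sorted BAM. It relies only on flags listed in the help message. All file and directory names are hypothetical placeholders, and the `RUN` dry-run switch is an illustrative convention, not part of Hapo-G. By default each command is printed rather than executed.

```shell
# Hypothetical input name; replace with your own assembly.
GENOME=assembly.fasta
# NSLOTS is set by SGE inside a parallel job; 4 is an arbitrary fallback.
THREADS="${NSLOTS:-4}"

# Dry run by default: RUN=echo prints each command instead of executing it.
# Set RUN to an empty value (e.g. "RUN= bash this_script.sh") to run hapog.py.
RUN="${RUN-echo}"

# Short reads: --pe1/--pe2 can be repeated for several libraries.
polish_short() {
    $RUN hapog.py --genome "$GENOME" \
        --pe1 reads_R1.fastq.gz --pe2 reads_R2.fastq.gz \
        --output hapog_short --threads "$THREADS"
}

# Long reads: --single takes exactly one (concatenated) file.
polish_long() {
    $RUN hapog.py --genome "$GENOME" \
        --single long_reads.fastq.gz \
        --output hapog_long --threads "$THREADS"
}

# Existing sorted BAM: -b skips mapping, -u keeps unpolished sequences.
polish_bam() {
    $RUN hapog.py --genome "$GENOME" -b mapped.sorted.bam -u \
        --output hapog_bam --threads "$THREADS"
}

polish_short
polish_long
polish_bam
```

In practice only one of the three functions would be called per job, each writing to its own output directory.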
software ref: https://github.com/institut-de-genomique/HAPO-G
research ref: https://doi.org/10.1093/nargab/lqab034