Conda
See the ‘activating the conda environment’ section below to access this
software.
LiftoffTools is a toolkit to compare genes lifted between genome assemblies.
It is designed to compare genes lifted over using Liftoff, although it is also
compatible with other lift-over tools such as UCSC liftOver, as long as the
feature IDs are the same. LiftoffTools provides three modules. The first
(variants) identifies variants in protein-coding genes and their effects on
the gene. The second (synteny) compares gene synteny. The third (clusters)
groups genes into clusters of paralogs to evaluate gene copy number gain and
loss. The input for all modules is the reference genome assembly (FASTA),
target genome assembly (FASTA), reference annotation (GFF/GTF), and target
annotation (GFF/GTF).
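As a minimal sketch of how these four inputs map onto the command line, the snippet below assembles a call from the flags listed in the help message further down (-r, -t, -rg, -tg, -dir), with the module name last as in the usage line. The file names, the output directory, and the build_liftofftools_cmd helper are hypothetical placeholders, and the command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Hypothetical input files; substitute your own paths.
ref_fa="reference.fa"
tgt_fa="target.fa"
ref_gff="reference.gff3"
tgt_gff="target_liftoff.gff3"
out_dir="liftofftools_output"

# Build the command line for one module (clusters, variants, synteny, or all).
build_liftofftools_cmd() {
    local module="$1"
    printf 'liftofftools -r %s -t %s -rg %s -tg %s -dir %s %s\n' \
        "$ref_fa" "$tgt_fa" "$ref_gff" "$tgt_gff" "$out_dir" "$module"
}

# Print the command for review; paste it into a shell or job script to run it.
build_liftofftools_cmd all
```

Printing first makes it easy to check the paths before committing cluster time to a run.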
Activating the conda environment
Check out a node with qrsh and run:

    bash
    source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh

And then run your commands as usual. To use over SGE, include the source line
above in a shell script prior to your commands, e.g.

    $ cat run_liftofftools.sh
    #!/usr/bin/env bash
    source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh
    liftofftools ...

And then run SGE_Batch -c 'bash ./run_liftofftools.sh' -r sge.liftofftools ...
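A slightly more defensive version of such a job script might confirm that activation worked before launching the analysis. This is a sketch only: the input file names are placeholders, and the require_cmd helper is introduced here purely for illustration.

```shell
#!/usr/bin/env bash
# Activation script path taken from the instructions above.
ACTIVATE=/local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh

# Hypothetical helper: print an error and return non-zero if a command is missing.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: $1 not found on PATH" >&2
        return 1
    fi
}

# Source the environment only if the activation script is readable on this node.
if [ -r "$ACTIVATE" ]; then
    source "$ACTIVATE"
fi

# Run the analysis only when liftofftools is actually available.
if require_cmd liftofftools; then
    # Placeholder inputs; replace with your own files.
    liftofftools -r reference.fa -t target.fa \
        -rg reference.gff3 -tg target.gff3 -dir output all
fi
```

If the job lands on a node where the environment is unavailable, the error message appears in the job's stderr log instead of a less obvious "command not found" failure.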
Making activation easier
If you already have conda set up, e.g. using the instructions in this post,
then you can run the following (only necessary once):

    bash
    conda config --append envs_dirs /local/cluster/conda-envs/envs
    conda config --append pkgs_dirs /local/cluster/conda-envs/pkgs

and then you can activate the env using conda activate liftofftools-0.4.4.
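Because --append adds an entry each time it is run, one option is to check the current configuration first. The sketch below assumes that conda config --show KEY prints the key followed by one "- path" item per line; config_lists_dir and append_once are hypothetical helper names, not part of conda.

```shell
#!/usr/bin/env bash
# Shared locations from the commands above.
ENVS_DIR=/local/cluster/conda-envs/envs
PKGS_DIR=/local/cluster/conda-envs/pkgs

# Hypothetical helper: succeed if DIR appears as a list item in the text
# produced by `conda config --show KEY` (assumed format: "  - /some/path").
config_lists_dir() {
    local shown="$1" dir="$2"
    printf '%s\n' "$shown" \
        | sed 's/^[[:space:]]*-[[:space:]]*//' \
        | grep -qxF -- "$dir"
}

# Append a directory to a conda config key only if it is not listed yet.
append_once() {
    local key="$1" dir="$2"
    if config_lists_dir "$(conda config --show "$key")" "$dir"; then
        echo "$key already contains $dir"
    else
        conda config --append "$key" "$dir"
    fi
}

# Only touch the configuration when conda is available in this shell.
if command -v conda >/dev/null 2>&1; then
    append_once envs_dirs "$ENVS_DIR"
    append_once pkgs_dirs "$PKGS_DIR"
fi
```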
Location and version

    $ bash
    $ source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh
    $ which liftofftools
    /local/cluster/conda-envs/envs/liftofftools-0.4.4/bin/liftofftools

Help message
(liftofftools-0.4.4) [cgrbinst@chrom0 downloads]$ which liftofftools
/local/cluster/conda-envs/envs/liftofftools-0.4.4/bin/liftofftools
(liftofftools-0.4.4) [cgrbinst@chrom0 downloads]$ liftofftools -V
v0.4.3
(liftofftools-0.4.4) [cgrbinst@chrom0 downloads]$ liftofftools -h
usage: liftofftools [-h] -r R -t T -rg GFF/GTF or DB -tg GFF/GTF or DB [-c]
[-f F] [-infer-genes] [-dir DIR] [-force]
[-mmseqs_path MMSEQS_PATH] [-mmseqs_params =STR]
[-edit-distance] [-r-sort R_SORT] [-t-sort T_SORT] [-V]
{clusters,variants,synteny,all}
Compare gene annotations across genome assemblies
Subcommands:
{clusters,variants,synteny,all}
options:
-h, --help show this help message and exit
-r R reference fasta
-t T target fasta
-rg GFF/GTF or DB reference annotation file to lift over in GFF or GTF
format or gffutils database created in previous liftoff
or liftofftools run
-tg GFF/GTF or DB target annotation file to lift over in GFF or GTF format
or gffutils databased created in previous liftoff or
liftofftools run
-c analyze protein coding gene clusters only
-f F text file with additional feature types besides genes to
analyze
-infer-genes
-dir DIR output directory
-force force overwrite of output/intermediate files in -dir
-V, --version show program version
clusters arguments:
-mmseqs_path MMSEQS_PATH
mmseqs path if not in working directory or PATH
-mmseqs_params =STR space delimited list of additional mmseqs parameters.
Default="--min-seq-id 0.9 -c 0.9"
synteny arguments:
-edit-distance calculate edit distance between reference gene order and
target gene order
-r-sort R_SORT txt file with the order of the reference chromosomes to
be plotted on the x-axis
-t-sort T_SORT txt file with the order of the target chromosomes to be
plotted on the y-axis
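The -r-sort and -t-sort options each take a text file giving chromosome order. One way to get a starting file is to pull the sequence names from the FASTA headers in their existing order and then reorder them by hand. This sketch assumes the file lists one sequence name per line, which the help text does not spell out, so check the LiftoffTools documentation before relying on it. The function name and file names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the sequence names of a FASTA file, one per
# line, in file order (header text after the first whitespace is dropped).
fasta_seq_names() {
    grep '^>' "$1" | sed 's/^>//; s/[[:space:]].*$//'
}

# Example usage with placeholder file names:
#   fasta_seq_names reference.fa > r_sort.txt
#   fasta_seq_names target.fa > t_sort.txt
#   liftofftools -r reference.fa -t target.fa -rg reference.gff3 \
#       -tg target.gff3 -r-sort r_sort.txt -t-sort t_sort.txt synteny
```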
software ref: https://github.com/agshumate/LiftoffTools
research ref: <>