# Liftofftools 0.4.4

{{< admonition tip "Conda" true >}}
See the 'activating the conda environment' section below to access this software.
{{< /admonition >}}

## liftofftools-0.4.4

LiftoffTools is a toolkit to compare genes lifted between genome assemblies. Specifically, it is designed to compare genes lifted over using [Liftoff](https://github.com/agshumate/Liftoff), although it is also compatible with other lift-over tools such as UCSC liftOver as long as the feature IDs are the same.

LiftoffTools provides 3 different modules. The first identifies variants in protein-coding genes and their effects on the gene. The second compares the gene synteny, and the third clusters genes into groups of paralogs to evaluate gene copy number gain and loss. The input for all modules is the reference genome assembly (FASTA), target genome assembly (FASTA), reference annotation (GFF/GTF), and target annotation (GFF/GTF).

-------------------------------------------------------------------------------

## Activating the conda environment

Check out a node with `qrsh` and run:

```console
bash
source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh
```

And then run your commands as usual.

To use over SGE, include the source line above in a shell script prior to your commands, e.g.

```bash
$ cat run_liftofftools.sh
#!/usr/bin/env bash
source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh
liftofftools ...
```

And then run `SGE_Batch -c 'bash ./run_liftofftools.sh' -r sge.liftofftools ...`.

### Making activation easier

If you have had conda set up for a while, e.g. using the instructions in [this post](../../tips/posts/using-the-system-miniconda-3-install/), then you can run (only necessary once):

```console
bash
conda config --append envs_dirs /local/cluster/conda-envs/envs
conda config --append pkgs_dirs /local/cluster/conda-envs/pkgs
```

and then you can activate the env using `conda activate liftofftools-0.4.4`.
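Putting the activation step and a complete command together, a wrapper script for SGE might look like the sketch below. It only writes `run_liftofftools.sh` and syntax-checks it, so it can be run anywhere. The input file names (`reference.fa`, `target.fa`, `reference.gff3`, `target.gff3`) and the output directory `liftofftools_output` are hypothetical placeholders, and placing the subcommand last is an assumption based on the order shown in the usage line of the help message.

```bash
#!/usr/bin/env bash
# Sketch: write a wrapper script for SGE and syntax-check it without running it.
# All input file names and the output directory are hypothetical placeholders.
set -euo pipefail

cat > run_liftofftools.sh <<'EOF'
#!/usr/bin/env bash
source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh

# Required inputs: reference/target FASTA (-r, -t) and annotations (-rg, -tg).
# The subcommand (here "all") is placed last, following the usage line.
liftofftools \
    -r reference.fa \
    -t target.fa \
    -rg reference.gff3 \
    -tg target.gff3 \
    -dir liftofftools_output \
    all
EOF

# bash -n parses the script and reports syntax errors without executing it.
bash -n run_liftofftools.sh
echo "wrote run_liftofftools.sh"
```

Once the placeholders are replaced with real paths, the generated script can be submitted with the `SGE_Batch` command shown earlier in this section.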
## Location and version

```console
$ bash
$ source /local/cluster/conda-envs/envs/liftofftools-0.4.4/activate.sh
$ which liftofftools
/local/cluster/conda-envs/envs/liftofftools-0.4.4/bin/liftofftools
```

## help message

```console
$ liftofftools -h
usage: liftofftools [-h] -r R -t T -rg GFF/GTF or DB -tg GFF/GTF or DB [-c]
                    [-f F] [-infer-genes] [-dir DIR] [-force]
                    [-mmseqs_path MMSEQS_PATH] [-mmseqs_params =STR]
                    [-edit-distance] [-r-sort R_SORT] [-t-sort T_SORT] [-V]
                    {clusters,variants,synteny,all}

Compare gene annotations across genome assemblies

Subcommands:
  {clusters,variants,synteny,all}

options:
  -h, --help            show this help message and exit
  -r R                  reference fasta
  -t T                  target fasta
  -rg GFF/GTF or DB     reference annotation file to lift over in GFF or GTF
                        format or gffutils database created in previous
                        liftoff or liftofftools run
  -tg GFF/GTF or DB     target annotation file to lift over in GFF or GTF
                        format or gffutils databased created in previous
                        liftoff or liftofftools run
  -c                    analyze protein coding gene clusters only
  -f F                  text file with additional feature types besides genes
                        to analyze
  -infer-genes
  -dir DIR              output directory
  -force                force overwrite of output/intermediate files in -dir
  -V, --version         show program version

clusters arguments:
  -mmseqs_path MMSEQS_PATH
                        mmseqs path if not in working directory or PATH
  -mmseqs_params =STR   space delimited list of additional mmseqs parameters.
                        Default="--min-seq-id 0.9 -c 0.9"

synteny arguments:
  -edit-distance        calculate edit distance between reference gene order
                        and target gene order
  -r-sort R_SORT        txt file with the order of the reference chromosomes
                        to be plotted on the x-axis
  -t-sort T_SORT        txt file with the order of the target chromosomes to
                        be plotted on the y-axis
(liftofftools-0.4.4) [cgrbinst@chrom0 downloads]$ liftofftools -V
v0.4.3
```

software ref:
research ref: <>