# busco 5.2.2

## BUSCOv5 - Benchmarking sets of Universal Single-Copy Orthologs.

For full documentation please consult the user guide: https://busco.ezlab.org/busco_userguide.html

Main changes in v5:

- Metaeuk is used as default gene predictor for eukaryote pipeline. Augustus is maintained and can be used optionally instead of Metaeuk.
- Introduction of batch mode: input argument can be a folder containing input files
- The folder structure has changed, so if doing a manual installation, make sure to completely remove any previous versions of BUSCO before installing v5.

To activate:

```console
bash
source /local/cluster/busco/activate.sh
```

To use busco in SGE_Batch, include the `source /local/cluster/busco/activate.sh` line in your shell scripts before the other commands.

Location and version:

```console
$ which busco
/local/cluster/busco/bin/busco
$ busco --version
BUSCO 5.2.2
```

help message:

```console
$ busco --help
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Welcome to BUSCO 5.2.2: the Benchmarking Universal Single-Copy Ortholog assessment tool.
For more detailed usage information, please review the README file provided with this distribution and the BUSCO user guide.
Visit this page https://gitlab.com/ezlab/busco#how-to-cite-busco to see how to cite BUSCO

optional arguments:
  -i SEQUENCE_FILE, --in SEQUENCE_FILE
                        Input sequence file in FASTA format. Can be an assembled genome or transcriptome (DNA), or protein sequences from an annotated gene set. Also possible to use a path to a directory containing multiple input files.
  -o OUTPUT, --out OUTPUT
                        Give your analysis run a recognisable short name. Output folders and files will be labelled with this name. WARNING: do not provide a path
  -m MODE, --mode MODE  Specify which BUSCO analysis mode to run.
                        There are three valid modes:
                        - geno or genome, for genome assemblies (DNA)
                        - tran or transcriptome, for transcriptome assemblies (DNA)
                        - prot or proteins, for annotated gene sets (protein)
  -l LINEAGE, --lineage_dataset LINEAGE
                        Specify the name of the BUSCO lineage to be used.
  --augustus            Use augustus gene predictor for eukaryote runs
  --augustus_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Augustus. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --augustus_species AUGUSTUS_SPECIES
                        Specify a species for Augustus training.
  --auto-lineage        Run auto-lineage to find optimum lineage path
  --auto-lineage-euk    Run auto-placement just on eukaryote tree to find optimum lineage path
  --auto-lineage-prok   Run auto-lineage just on non-eukaryote trees to find optimum lineage path
  -c N, --cpu N         Specify the number (N=integer) of threads/cores to use.
  --config CONFIG_FILE  Provide a config file
  --datasets_version DATASETS_VERSION
                        Specify the version of BUSCO datasets, e.g. odb10
  --download [dataset [dataset ...]]
                        Download dataset. Possible values are a specific dataset name, "all", "prokaryota", "eukaryota", or "virus". If used together with other command line arguments, make sure to place this last.
  --download_base_url DOWNLOAD_BASE_URL
                        Set the url to the remote BUSCO dataset location
  --download_path DOWNLOAD_PATH
                        Specify local filepath for storing BUSCO dataset downloads
  -e N, --evalue N      E-value cutoff for BLAST searches. Allowed formats, 0.001 or 1e-03 (Default: 1e-03)
  -f, --force           Force rewriting of existing files. Must be used when output files with the provided name already exist.
  -h, --help            Show this help message and exit
  --limit N             How many candidate regions (contig or transcript) to consider per BUSCO (default: 3)
  --list-datasets       Print the list of available BUSCO datasets
  --long                Optimization Augustus self-training mode (Default: Off); adds considerably to the run time, but can improve results for some non-model organisms
  --metaeuk_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the first run. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --metaeuk_rerun_parameters "--PARAM1=VALUE1,--PARAM2=VALUE2"
                        Pass additional arguments to Metaeuk for the second run. All arguments should be contained within a single pair of quotation marks, separated by commas.
  --offline             To indicate that BUSCO cannot attempt to download files
  --out_path OUTPUT_PATH
                        Optional location for results folder, excluding results folder name. Default is current working directory.
  -q, --quiet           Disable the info logs, displays only errors
  -r, --restart         Continue a run that had already partially completed.
  --tar                 Compress some subdirectories with many files to save space
  --update-data         Download and replace with last versions all lineages datasets and files necessary to their automated selection
  -v, --version         Show this version and exit
```

software ref:

software ref:

research ref:
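The help text states two constraints that are easy to trip over in a batch script: `-o` takes a short name rather than a path, and `-m` accepts only the three listed modes. A minimal sketch of a shell helper that checks both before calling `busco` might look like the following. The function name `run_busco`, the `DRY_RUN` variable, and the example file and run names are hypothetical and not part of BUSCO or this installation; `bacteria_odb10` is used only as an example lineage name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): checks two constraints from the
# help text, then runs busco. With DRY_RUN=1 it only prints the command.
run_busco() {
  local input="$1" lineage="$2" out="$3" mode="$4" cpus="${5:-1}"

  # -o must be a short run name; the help text warns against giving a path.
  case "$out" in
    */*) echo "error: -o expects a name, not a path: $out" >&2; return 1 ;;
  esac

  # -m accepts three modes, each with a short and a long spelling.
  case "$mode" in
    geno|genome|tran|transcriptome|prot|proteins) ;;
    *) echo "error: unknown mode: $mode" >&2; return 1 ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Printed for inspection only; arguments are not shell-quoted here.
    echo "busco -i $input -l $lineage -o $out -m $mode -c $cpus"
  else
    busco -i "$input" -l "$lineage" -o "$out" -m "$mode" -c "$cpus"
  fi
}

# In a job script, after the function definition (file names are examples):
#   source /local/cluster/busco/activate.sh
#   run_busco assembly.fasta bacteria_odb10 my_run genome 4
```

Setting `DRY_RUN=1` before the call prints the command instead of running it, which can be a quick way to check the arguments before submitting the job.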