# Guidance on using the CQLS queuing system

## The CQLS queuing system

Queuing systems allow multiple users to make use of multiple compute nodes across an infrastructure. At the CQLS, we log in to the `shell.cqls.oregonstate.edu` machine, currently called `vaughan`. From `vaughan`, users can submit batch or array jobs (one or many tasks, respectively) to the queuing system such that they run autonomously and continue running even after users log off.

## Reading pre-requisite

Please see this [previous post](../the-cgrb-infrastructure-and-you) for information regarding the CQLS infrastructure along with some of the commands that are part of the SGE queuing system.

## How does one use the queuing system?

The CQLS is currently using the Son of Grid Engine (SGE) queuing system. To facilitate the use of the queuing system, the CQLS wrote and maintains two helper scripts for job submission: `SGE_Batch` and `SGE_Array`. `SGE_Batch` is designed for submitting jobs with individual tasks (one can submit array jobs using `SGE_Batch`, but we recommend using `SGE_Array` for that), while `SGE_Array` is designed for submitting array jobs comprising multiple tasks. These wrapper scripts write a shell script to the filesystem and submit a job to the queuing system for you according to the options you requested. In the background, the scripts call the `qsub` command for you with the generated shell script.

{{< image src="images/SGE_Avail.png" alt="Explanation of SGE_Avail command" title="CQLS Queuing system 1" >}}

To see what resources are available to you, use the `SGE_Avail` script. To monitor currently running jobs, use `qstat`, and to kill jobs, use `qdel`. See the post linked above for more information regarding these underlying commands.

To check out a node interactively, use the `qrsh` command. You can specify `-q QUEUE`, where `QUEUE` is the name of the queue you'd like to check out a node from. You can check out multiple processors on a machine using the `qrsh -pe thread N` option, where `N` is the number of processors you need. To land on a specific node in a queue, use the `-q QUEUE@node` syntax, where `QUEUE` is e.g. `bpp` and `node` is e.g. `symbiosis`. This syntax can be used with `qrsh` as well as with the `SGE_Array` and `SGE_Batch` helper scripts. Please make sure to exit the node when you are finished so that the queue can reclaim the processors for others to use.

## Practical application

We've established what the queuing system is and which commands one can use to interact with it. Let's put these scripts to use in some hypothetical scenarios that may relate to your own research.

### Common considerations

#### Naming your jobs (run-name)

You must specify a run-name (`-r` flag) for each job that you submit. I suggest using a syntax whereby each job submitted to SGE has a run-name prefixed with `sge.`; while using the infrastructure for 10+ years, I have had to find different ways to deal with having these run directories in my various workflows. What I found is that having the directories always prefixed with `sge.` allows me to identify them even in a directory full of other input and output files. Additionally, if one uses sample names as part of the run-names, one might find errors whenever a sample name starts with a number, because SGE disallows job names that begin with a digit. Prefixing all run-names with `sge.` alleviates this issue as well. I also generally suggest including the name of the program being run in the run-name. Therefore, a standard syntax that one might adopt is `sge.ProgramName_SampleName`.
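For example, a hypothetical submission following this convention (the program, input file, and queue name here are placeholders, not a prescribed workflow) might look like:

```console
$ SGE_Batch -c 'fastqc EQ1999.fastq.gz' -q QUEUE -r sge.fastqc_EQ1999
```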
This convention also makes it easier to identify which jobs are running when one runs `qstat`, although the names sometimes get cut off in that output. You can run the `qstat-long` command to get the full names of jobs. When you submit a job, it can be useful to run `watch qstat` to monitor progress until your job enters the `r` (i.e. `r`unning) state.

#### Matching command settings and SGE settings

SGE does not and cannot know what settings you applied to your commands in terms of CPU and/or memory utilization. Therefore, whenever you are submitting a command to the queue, it is a good idea to set the number of CPUs or threads for your program (often using a `--threads` or `-t` option; this is program-dependent) and the `-P` option of `SGE_Batch` or `SGE_Array` to matching values. Setting a higher value of `-P` will not make your program automatically use more threads. Conversely, setting the number of threads of your program to a higher value without the corresponding change to `-P` will cause the server to be overloaded.

#### Standard output and standard error

Your standard output and error will end up in the run-name directory that you specify at job creation. The STDOUT will be in the files with the `.o$JOBID` suffix and the STDERR will be in the files with the `.e$JOBID` suffix. Generally, it's more useful to direct your program's output to a file, either using STDOUT redirection (i.e. `command > output.txt`) or using a program-specific flag (e.g. `command -o output.txt`), instead of pulling the output from the `.o$JOBID` files.

#### Use the local /data drives on compute nodes

In order to speed up computation and reduce network congestion, it can be useful to specify the node-local `/data` drive as input or output for your commands. If you are running a command from a `/nfs` or `/home` directory, then both the reads and writes to disk will happen over the network. With processes that are especially heavy on input/output (I/O), this can cause slowdowns and disk waits that reduce the efficiency of both your own processing and that of others using the same compute nodes. One common high-I/O process is genome assembly from short reads, for example with the [SPAdes genome assembler](http://cab.spbu.ru/software/spades/). The authors of this software suggest using local hard drives, like `/data`, for processing with SPAdes. At minimum, it will be useful to put the output directory on the `/data` drive. For the most speed-up, you can also copy the input files to the `/data` drive prior to assembly job submission.

**NOTE:** You will have to remember which node you copied the reads to and wrote the output on so that you can copy the outputs off afterward.

Some programs also allow you to specify a temporary directory to write intermediates to, and do not always use the `$TMPDIR` environment variable as the default, so it can be useful to set the temp dir to `/data` manually when that option is available.
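Putting the last two considerations together, here is a minimal sketch of an assembly script that stages its I/O on `/data`; the read file names, sample name, and queue below are hypothetical:

```bash
#!/usr/bin/env bash
# run_spades.sh: stage reads on the node-local /data drive, assemble there,
# then copy the results back (remember which node this ran on!)
cp reads_R1.fq.gz reads_R2.fq.gz /data/
spades.py -1 /data/reads_R1.fq.gz -2 /data/reads_R2.fq.gz \
    -t 8 --tmp-dir /data/spades_tmp -o /data/asm_EQ1999
cp -r /data/asm_EQ1999 .
# clean up the local drive so others can use the space
rm -rf /data/reads_R?.fq.gz /data/asm_EQ1999 /data/spades_tmp
```

This could then be submitted with a matching processor request, e.g. `SGE_Batch -c 'bash ./run_spades.sh' -P 8 -q QUEUE -r sge.spades_EQ1999`, so that the `-t 8` given to SPAdes lines up with the eight slots reserved by `-P 8`.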
#### SGE job states listed using qstat

The common SGE job states are:

| **Category** | **State**                                       | **SGE Letter Code** |
| :----------- | :---------------------------------------------- | :------------------ |
| Pending      | pending, queue wait                              | qw                  |
| Pending      | pending, holding (for other job) queue wait      | hqw                 |
| Running      | running                                          | r                   |
| Running      | transferring                                     | t                   |
| Suspended    | suspended                                        | s                   |
| Error        | all pending states with error                    | Eqw, Ehqw           |
| Deleted      | all running and suspended states with deletion   | dr, dt              |

The most commonly seen states are `qw` (queue wait) and `r` (running). At times, you might find an `E` state (along with other codes) that indicates some issue with job submission. You can run `qstat -j $JOBID` for that job to figure out why it has an error (`grep` for `error` and you will find the appropriate line). Sometimes you can use the `clear_Eqw_job.sh` script to clear the errors and the jobs will continue as submitted, but often you have to kill and resubmit the jobs. If you find jobs stuck in the `d` state, the node you are trying to delete the job on is likely waiting for disk I/O and is unresponsive. The node may need to be restarted.

#### SGE_Avail queue states

| **State**         | **SGE Letter Code** | **Note**                                                    |
| :---------------- | :------------------ | :---------------------------------------------------------- |
| **a**larm         | a                   | Too many resources being used (load average too high)       |
| **u**nreachable   | u                   | Machine is offline or off network                            |
| **E**rror         | E                   | Most often a transfer and/or network error                   |
| normal            | normal              | All is well                                                  |

If a queue is in an error state, you can run `qstat -explain E` to find more information about the error. If you see the `au` or `a` state, you can run `qstat -explain a` (or `qstat -explain aE` for both). The `-explain` flag can be combined with the `-q` flag as well to get information about specific queues. `qstat -explain aE | grep -B 1 -e alarm -e ERROR` shows all `E` and `a` state messages.

### Individual job submission

{{< image src="images/SGE_Batch.png" alt="Explanation of SGE_Batch command" title="CQLS Queuing system 2" >}}

There are several different ways to successfully submit a job through `SGE_Batch`. The two most common methods are:

1. Include the bare command in quotes as the `-c` option to `SGE_Batch` (as above)
2. Write the commands in a bash script (e.g. `run.sh`) and submit the bash script as the `-c` option (e.g. `SGE_Batch -c 'bash ./run.sh' ...`)

For one-off commands, the utility of the bare `-c` command is obvious because of its ease of use and quick access. For jobs that are part of a larger analysis, it may be useful to put the commands in a bash script, thus saving a record of what you ran and making it easier to re-run the commands in the future. In addition, if you are on a node interactively using `qrsh`, you can run the commands in the bash script without having to retype them or scroll through your command history.

An additional benefit of using a shell script to store your commands is that it is easier to run commands serially, e.g.

```bash
#!/usr/bin/env bash
program1 -options > outputA
program2 -input outputA -output outputB
cat outputB | sed 's/foo/bar/' | program3 -options
...
```

when compared to running several `SGE_Batch -c 'command'...` invocations, each after the previous job finishes.
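Submitting such a script is then a one-liner; the queue and run-name below are placeholders:

```console
$ SGE_Batch -c 'bash ./run.sh' -q QUEUE -r sge.run_serial
```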
Finally, if you need to use a piece of software that is in a conda environment, you can include the commands to activate the environment in the bash script.

[Reference updates page](../../../updates/bakta-1.5.0)

```bash
#!/usr/bin/env bash
source /local/cluster/bakta/activate.sh
bakta input.fasta ...
```

In this way, you can record which conda environment needs to be activated for the program to work properly.

### Multiple job (task) submission

{{< image src="images/SGE_Array.png" alt="Explanation of SGE_Array command" title="CQLS Queuing system 3" >}}

As with many workflows, you may have multiple files to process through the same program. This will be the case with any project where you have multiple samples. Instead of submitting many (perhaps thousands of) `SGE_Batch` commands, it can be useful to organize your job submissions using `SGE_Array` instead. `SGE_Array` allows you to submit a single job that includes multiple tasks (in SGE terminology) that will be grouped together. This can be especially useful when doing some sort of large comparative genomics project with thousands of individual tasks - you can group the tasks in a single job and avoid taking over the entire infrastructure by using the concurrency flag (`-b`; 50 by default), which limits how many tasks run at once. As a general rule, if you are using a lot of resources per task, you can limit your concurrency to a smaller number so others can use the infrastructure. Conversely, if your tasks finish quickly and are unlikely to stay in the queue for a long period (say, > 24 h), then you can increase the concurrency flag to speed up the processing.

Let's assume you have some predicted proteomes (NCBI .faa format) and you want to characterize their predicted functions using InterProScan. Here is one way you could run them all through using `SGE_Array`. Here is our directory structure:

```console
$ lsd --tree
 .
├──  faa
│  ├──  EQ1999.faa
│  ├──  EQ2001.faa
│  ├──  RJ10.faa
│  ├──  RJ19.faa
│  ├──  TZD22.faa
│  └──  TZD59.faa
└──  run.sh
```

Here are the contents of `run.sh`:

```bash
#!/usr/bin/env bash
iprscan=/local/cluster/interproscan/interproscan/interproscan.sh
threads=8
appl='-appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE'
options="-T /data -pa -iprlookup -goterms -cpu $threads"
# print one command per input file; keep the faa/ prefix so the
# paths resolve from the submission directory
for faa in faa/*.faa; do
    echo $iprscan $appl $options -i $faa
done
```

You'll note that this just runs `echo` to print the commands to the terminal, which is a good thing because it allows you to check your commands before you submit them. For example, while writing this tutorial, I neglected to include the `-i $faa` at the end of the command, such that I did not provide any input files to the command. Let's give the script a try:

```console
$ bash ./run.sh
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/EQ1999.faa
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/EQ2001.faa
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/RJ10.faa
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/RJ19.faa
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/TZD22.faa
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/TZD59.faa
```

So far so good.
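A quick sanity check before submitting is to confirm that the number of generated commands matches the number of input files (shown here for this hypothetical run):

```console
$ bash ./run.sh | wc -l
6
$ ls -1 faa | wc -l
6
```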
Now we can submit the jobs using `SGE_Array`, which reads the commands from STDIN by default.

```console
$ bash ./run.sh | SGE_Array -q fast -P 8
Successfully submitted job 9999617.1-6:1, logging job number, timestamp, and rundir to .sge_array_jobnums
$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 1
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 2
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 3
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 4
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 5
9999617 0.51686 j2022-11-0 davised      r     11/08/2022 00:38:43 fast@chrom1.cgrb.oregonstate.l     8 6
```

One nice feature of `SGE_Array` is that you will get an automatic run-name if you do not provide one with the `-r` flag. The downside is that you will have to examine the `command.N.txt` files inside the directory to figure out what you ran. Let's take a look at the SGE directory:

```console
$ lsd --tree j2022-11-08_00-38-39_interproscansh_etal
 j2022-11-08_00-38-39_interproscansh_etal
├──  command.1.txt
├──  command.2.txt
├──  command.3.txt
├──  command.4.txt
├──  command.5.txt
├──  command.6.txt
├──  commands.txt
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.1
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.2
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.3
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.4
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.5
├──  j2022-11-08_00-38-39_interproscansh_etal.e9999617.6
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.1
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.2
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.3
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.4
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.5
├──  j2022-11-08_00-38-39_interproscansh_etal.o9999617.6
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.1
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.2
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.3
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.4
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.5
├──  j2022-11-08_00-38-39_interproscansh_etal.pe9999617.6
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.1
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.2
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.3
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.4
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.5
├──  j2022-11-08_00-38-39_interproscansh_etal.po9999617.6
└──  j2022-11-08_00-38-39_interproscansh_etal.sh
```

Each of the `command.N.txt` files contains one of the commands taken as input on the command line when you submitted the `SGE_Array` job (all of which are found in the `commands.txt` file). The `jobname.oJOBID.N` and `jobname.eJOBID.N` files correspond to the stdout and stderr of each task, respectively.
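For example, to check exactly what task 3 of the job above ran (output reconstructed from the commands generated earlier):

```console
$ cat j2022-11-08_00-38-39_interproscansh_etal/command.3.txt
/local/cluster/interproscan/interproscan/interproscan.sh -appl TIGRFAM,Pfam,TMHMM,SignalP_GRAM_NEGATIVE -T /data -pa -iprlookup -goterms -cpu 8 -i faa/RJ10.faa
```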
## Advanced topics

While you can get by with the above commands and protocols, there are other ways to use the queuing system as well.

### bash scripting for automation

One method for automating the job submission process takes two distinct steps:

1. Writing a shell script that includes the program, options, environment variables, and any conda environment activation
2. Writing a shell script to pass all input files through the first script using SGE

e.g.

```bash
$ cat run_bakta.sh
#!/usr/bin/env bash
source /local/cluster/bakta/activate.sh
fasta=$1
threads=16
# strip the path and .fasta suffix to name the output directory
prefix=$(basename $fasta .fasta)
bakta --output $prefix --threads $threads $fasta
```

```bash
$ cat submit_bakta.sh
#!/usr/bin/env bash
for fasta in fna/*.fasta; do
    echo bash ./run_bakta.sh $fasta
done
```

Then, to submit the commands to SGE: `bash ./submit_bakta.sh | SGE_Array -P 16 -q ...`

or, to preserve a record of the commands:

```console
bash ./submit_bakta.sh > cmds.txt
SGE_Array -c cmds.txt -P 16 -q ...
```

or, if you don't need to run multiple jobs at once and can run them serially:

```console
bash ./submit_bakta.sh > cmds.txt
SGE_Batch -c 'bash ./cmds.txt' -P 16 -q ...
```

or:

```console
bash ./submit_bakta.sh > cmds.txt
SGE_Array -c cmds.txt -P 16 -b 1 -q ...
```

As you can see, this two-step approach is flexible.

### snakemake pipelines

Snakemake is a workflow management system that can be configured to submit jobs through SGE. The snakemake system is too large a topic to cover here; however, here is some information that might point you in the right direction.

The default cluster config:

```console
$ cat cluster.yaml
__default__:
    threads: 1
    queue: 'fast'
    error: 'log/cluster'
    output: 'log/cluster'
```

A job-specific config:

```console
$ head config.yaml
threads: 8
queue: 'bpp'
```

The submission script:

```console
$ cat submit.sh
#!/usr/bin/env bash
smk=$1
if [ -n "$smk" ]; then
    smk="-s $smk"
fi
snakemake $smk \
    --cluster 'qsub -q {cluster.queue} -pe thread {threads} -o {cluster.output} -e {cluster.error} -cwd -S /bin/bash -V' \
    --jobs 12 --cluster-config cluster.yaml -p --latency-wait 30
```

### Using the just command runner

I have recently taken to using the [just](https://github.com/casey/just) command runner for my analysis (and other) jobs. One difficulty I've experienced over the years of bioinformatics work is keeping track of the individual commands that I have run. For me, making each and every command a shell script to keep a record of what I've done has not been possible, or even desirable. One of the main tasks we have to deal with is converting files between formats, and while there are tools that can assist with this, we often end up using a pipeline of `grep`, `awk`, `sed`, `cut`, and `paste` to join our commands together. While using `just` doesn't completely eliminate the barriers to writing the commands in a file, it certainly has the benefit of not cluttering the working directory with shell scripts. All commands can be contained in a file called `justfile` and then invoked using `just $command`, much as one might expect when selecting options from any program. Further, arguments can be provided (and defaults set) so that you can re-use your `justfile`s if desired. While there is some language-specific syntax to be learned (mainly, commands are run from the directory the `justfile` is in rather than the `pwd`), you can use `just` similarly to your standard `bash` scripts, except that you get a command-line parser built in for you.
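As a minimal sketch of the idea (the recipe name and argument here are hypothetical):

```console
$ cat justfile
# say hello; NAME defaults to 'world'
hello NAME='world':
    @echo "Hello, {{NAME}}!"

$ just hello
Hello, world!
$ just hello CQLS
Hello, CQLS!
```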
#### Real world example

Rather than re-writing the documentation that you can find by following the link above, I'll walk you through a scenario from today. Here is my directory structure (some files removed for readability):

```console
$ lsd --tree --depth 2
 .
├──  fna
│  ├──  Porticoccaceae_bacterium_EAC642.fna ⇒ ../../analysis/filtered/fna/Porticoccaceae_bacterium_EAC642.fna
│  ├──  Porticoccaceae_bacterium_HJW.50.fna ⇒ ../../analysis/filtered/fna/Porticoccaceae_bacterium_HJW.50.fna
│  ├──  Porticoccaceae_bacterium_IN41.fna ⇒ ../../analysis/filtered/fna/Porticoccaceae_bacterium_IN41.fna
│  ├──  Porticoccaceae_bacterium_MED13.fna ⇒ ../../analysis/filtered/fna/Porticoccaceae_bacterium_MED13.fna
│  ..............
├──  fnas.txt
├──  justfile
├──  SAR92_dist.txt
├──  SAR92_dist_fixed.txt
├──  SAR92_dist_rename.txt
├──  SAR92_sizes.txt
└──  sge.run-dashing-dist
   ├──  sge.run-dashing-dist.e9998991
   ├──  sge.run-dashing-dist.o9998991
   ├──  sge.run-dashing-dist.pe9998991
   ├──  sge.run-dashing-dist.po9998991
   └──  sge.run-dashing-dist_1_sge.sh
```

I needed to run a genome hashing algorithm so I could compare some complete and incomplete genomes and later cluster them based on the hash distance. Here is the justfile I wrote:

```console
$ cat justfile
#!/usr/bin/env just

THREADS := '16'
K := '31'
OUTDIST := 'SAR92_dist.txt'
OUTSIZE := 'SAR92_sizes.txt'
OUTDISTR := 'SAR92_dist_rename.txt'
OUTDISTF := 'SAR92_dist_fixed.txt'

# set export
# TMPDIR := "/data"
# TMPDIR := "/tmp"

_default:
    @just -l -f {{justfile()}}

run-dist:
    #!/usr/bin/env bash
    set -euxo pipefail
    fnas=fnas.txt
    outdist={{OUTDIST}}
    outsize={{OUTSIZE}}
    threads={{THREADS}}
    k={{K}}
    dashing dist -k$k -p$threads -O$outdist -o$outsize -Q $fnas -F $fnas -T -M -J

convert:
    #!/usr/bin/env bash
    outdist={{OUTDIST}}
    outdistr={{OUTDISTR}}
    outdistf={{OUTDISTF}}
    sed 's!fna/!!' $outdist | sed 's!.fna!!' > $outdistr
    paste <( echo Genome ) <( cut -f 1 $outdistr | xargs | sed 's/ /\t/g' ) > $outdistf
    cat $outdistr >> $outdistf
```

You'll see some parameters at the top, followed by three recipes: the default, which just lists the available recipes, and then `run-dist` and `convert`. This is what you see when you run `just` on the command line:

```console
$ just
Available recipes:
    convert
    run-dist
```

The default recipe is hidden because it is prefixed with an underscore in the `justfile`. I was then able to submit the `run-dist` recipe in this way: `SGE_Batch -c 'just run-dist' -q bact -P 16 -r sge.run-dashing-dist`

Since I was unfamiliar with the outputs, I used `qrsh` to check out a node and navigate to the directory where the outputs were. I realized I needed to modify the output files, and instead of running the commands individually, I wrote the `convert` recipe above. Then, I could just run it using `just convert` from the command line. Since I had a node checked out, I did not have to worry about taking up too many resources (although the `convert` command could easily run on `vaughan` as well). Additionally, I was able to re-run the `run-dist` command with several `-k` settings and examine the outputs just by changing the options in the `justfile`.

One could extend this idea using the [bash scripting for automation section](#bash-scripting-for-automation) above, and have a `run-command` recipe that contains the commands necessary for running the program, and a `submit-command` recipe that echoes the submission command so it's easy to run through SGE; I've done this in the past and it has been helpful. You can run recipes from within other recipes, making this setup even easier.
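A hypothetical sketch of that run/submit pattern (the program, input file, and queue name are placeholders):

```console
$ cat justfile
THREADS := '16'

# do the actual work (run this on a compute node)
run-analysis:
    program --threads {{THREADS}} input.txt > output.txt

# print the SGE submission command so it can be piped to a shell
submit-analysis:
    @echo "SGE_Batch -c 'just run-analysis' -P {{THREADS}} -q QUEUE -r sge.run-analysis"

$ just submit-analysis | bash
```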