Job submission overview
You can use SLURM in different ways to submit jobs:
- Interactive mode: see here for an example.
- GUI applications: see here for an example.
- sbatch scripts: see below for an example.
SBATCH scripting guide
See the following example with annotations.
#!/bin/bash
#SBATCH -N 1 # number of requested nodes. Set to 1 unless needed.
#SBATCH -n 1 # number of tasks to run. Set to 1 unless needed. See granular resource allocation below for an example.
#SBATCH -c 1 # number of CPUs requested per task
#SBATCH --mem=10g # amount of memory requested (g=gigabytes)
#SBATCH -p qTRD # partition to run job on. See "cluster and queue information" page for more information.
#SBATCH -t 1440 # time in minutes. After set time the job will be cancelled. See "cluster and queue information" page for limits.
#SBATCH -J <job name> # name shown for the job in the queue
#SBATCH -e error%A.err # errors will be written to this file. If saving this file in a separate folder, make sure the folder exists, or the job will fail
#SBATCH -o out%A.out # output will be written to this file. If saving this file in a separate folder, make sure the folder exists, or the job will fail
#SBATCH -A <slurm_account_code> # user group. See "requesting an account" page for list of groups
#SBATCH --mail-type=ALL # types of emails to send out. See SLURM documentation for more possible values
#SBATCH --mail-user=<email address> # set this email address to receive updates about the job
#SBATCH --oversubscribe # see SLURM documentation for explanation
# It is good practice to add a short delay at the beginning and end of the job. This helps keep the SLURM controller stable when a large number of jobs fail simultaneously.
sleep 10s
# For debugging: print the node name to the error log so that, if the job fails, you know where to look for a possible cause
echo $HOSTNAME >&2
# run the actual job
module load matlab/R2022a
matlab -batch 'simple_example'
# delay at the end (good practice)
sleep 10s
Save the above script in your /data/user*/<your name> directory as JobSubmit.sh and modify it as needed. Then submit the job using the following command:
$ sbatch JobSubmit.sh
Then use the commands in the Controlling jobs section below to monitor the job.
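If a later step needs the job ID (for example, to pass it to squeue -j or scancel), it can be captured when the job is submitted. The following is a minimal sketch rather than part of the cluster's official tooling: the function name submit_job and the default folder name logs are hypothetical. It relies on the sbatch option --parsable, which prints only the job ID, optionally followed by ";" and the cluster name. It also creates the log folder first, since the job fails if the folder given in -e or -o does not exist.

```shell
#!/bin/bash
# Sketch: submit a script and print only the numeric job ID.
# submit_job and the "logs" folder name are hypothetical; adapt as needed.
submit_job() {
    local script="$1"
    local log_dir="${2:-logs}"

    # Create the folder referenced by -e / -o before submitting
    mkdir -p "$log_dir" || return 1

    # --parsable prints "jobid" or "jobid;cluster" instead of a sentence
    local reply
    reply=$(sbatch --parsable "$script") || return 1

    # Keep everything before the first ';'
    echo "${reply%%;*}"
}
```

On the login node this could be used as job_id=$(submit_job JobSubmit.sh), followed by squeue -j "$job_id".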
Controlling jobs
The following commands can be run on the login node:
# Check status of all jobs
$ squeue
# Check status of jobs by user
$ squeue -u <campusID>
# Continuously check status of jobs
$ watch -n 10 squeue -u <campusID>
# Check job status by ID
$ squeue -j <jobID>
# Cancel job by ID
$ scancel <jobID>
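When a script has to wait until a job has left the queue, the squeue call above can be wrapped in a polling loop. This is a minimal sketch under stated assumptions: wait_for_job is a hypothetical helper name, and the 30-second default interval is an arbitrary choice. It uses squeue -h (no header), so empty output means the job is no longer pending or running.

```shell
#!/bin/bash
# Sketch: block until the given job is no longer listed by squeue.
# wait_for_job is a hypothetical helper, not a SLURM command.
wait_for_job() {
    local job_id="$1"
    local interval="${2:-30}"   # seconds between checks (assumed default)

    # -h drops the header line; errors for unknown job IDs are discarded
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
}
```

Keep the interval generous: every check is a request to the SLURM controller, so very short intervals across many users add load.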
Multiple jobs with job arrays
You can submit a number of identical jobs (e.g. fMRI preprocessing) using the SLURM job array feature (https://slurm.schedmd.com/job_array.html). The maximum number of jobs that can be submitted is 5000. Please see this example if you need to submit more than that.
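One common way to stay under such a limit is to let each array task process a chunk of several inputs instead of one. The sketch below illustrates that idea only and is not necessarily the approach used in the linked example. The function name chunk_range and its arguments are hypothetical; the task ID would come from $SLURM_ARRAY_TASK_ID (described under "The array ID index"), assuming the array starts at 1.

```shell
#!/bin/bash
# Sketch: map a 1-based array task ID to the range of items it should process.
# chunk_range is a hypothetical helper; prints "first last" (both 1-based).
chunk_range() {
    local task_id="$1" chunk_size="$2" total="$3"
    local first=$(( (task_id - 1) * chunk_size + 1 ))
    local last=$(( task_id * chunk_size ))

    # Nothing left for this task
    if [ "$first" -gt "$total" ]; then
        return 1
    fi
    # The final chunk may be shorter than chunk_size
    if [ "$last" -gt "$total" ]; then
        last="$total"
    fi
    echo "$first $last"
}

# Example: 10 items in chunks of 3 need 4 tasks (--array=1-4)
chunk_range 4 3 10   # prints "10 10"
```

The number of array tasks needed is the item count divided by the chunk size, rounded up, e.g. $(( (total + chunk_size - 1) / chunk_size )).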
Submitting job arrays
# submit 100 jobs, run 10 at a time. Please use a sensible limit and leave resources for the others.
$ sbatch --array=1-100%10 JobSubmit.sh
# submit 100 jobs without limitation
$ sbatch --array=1-100 JobSubmit.sh
# run particular tasks
$ sbatch --array=2,4,8,16,32,65 JobSubmit.sh
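The array range usually has to match the number of inputs. As a minimal sketch, the helper below counts the *.txt files in a folder and prints a matching range with a concurrency limit. The function name array_spec and the folder name in the usage note are hypothetical, and the sketch assumes one array task per input file.

```shell
#!/bin/bash
# Sketch: print an --array value such as "1-3%10" for the *.txt files in a folder.
# array_spec is a hypothetical helper, not part of SLURM.
array_spec() {
    local dir="$1" limit="$2" n

    # Count the input files directly inside the folder
    n=$(find "$dir" -maxdepth 1 -name '*.txt' | wc -l)
    n=$((n))   # normalise whitespace that some wc versions emit

    if [ "$n" -eq 0 ]; then
        echo "no .txt files found in $dir" >&2
        return 1
    fi
    echo "1-${n}%${limit}"
}
```

It could then be used as sbatch --array="$(array_spec ./inputs 10)" JobSubmit.sh. The result still has to respect the limit on the number of submitted jobs mentioned above.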
Modifying job arrays
Running the following command updates the task limit (e.g. %10 in the above section) of a running array.
$ scontrol update ArrayTaskThrottle=<count> JobId=<jobID>
Input and output files
You can add or edit the following lines in the job submission script so that %A and %a are replaced by the job ID and the array task ID, respectively.
#SBATCH --output=out%A_%a.out
#SBATCH --error=error%A_%a.err
The array ID index
SLURM provides the environment variable $SLURM_ARRAY_TASK_ID, which you can reference inside your script to control input and output. The following are some examples (adapted from here):
# read a particular input file in a folder containing *.txt files
$ file=$(ls *.txt | sed -n "${SLURM_ARRAY_TASK_ID}p")
$ myscript -i "$file"
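# read a particular line from an input file containing a list of IDs
# (bash arrays are zero-based, so task 1 maps to index 0; this assumes the array starts at 1)
ID_LIST=($(<input.csv))
ID=${ID_LIST[$((SLURM_ARRAY_TASK_ID - 1))]}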
# use array ID in a python script
> import os
> task_id = os.getenv('SLURM_ARRAY_TASK_ID')
# use array ID in a Matlab script
> task_id = getenv('SLURM_ARRAY_TASK_ID')
# use array ID in an R script
> task_id <- Sys.getenv("SLURM_ARRAY_TASK_ID")
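The lookups above assume that the task ID always points at an existing input. As a minimal sketch of a more defensive version, the helper below returns the line of a list file that corresponds to a 1-based task ID and fails with a message otherwise. The function name input_for_task and the file name subjects.txt in the usage note are hypothetical.

```shell
#!/bin/bash
# Sketch: print line N of a list file, where N is the 1-based array task ID.
# input_for_task is a hypothetical helper; it returns non-zero on bad input.
input_for_task() {
    local list_file="$1" task_id="$2" line

    # Reject empty or non-numeric task IDs (e.g. when run outside an array job)
    case "$task_id" in
        ''|*[!0-9]*)
            echo "task ID is not a number: '$task_id'" >&2
            return 1
            ;;
    esac

    line=$(sed -n "${task_id}p" "$list_file")
    if [ -z "$line" ]; then
        echo "no line $task_id in $list_file" >&2
        return 1
    fi
    echo "$line"
}
```

Inside the job script this could be called as subject=$(input_for_task subjects.txt "$SLURM_ARRAY_TASK_ID") || exit 1, so that a task without a matching input stops early and leaves a clear message in its error file.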
Sample SBATCH scripts
Please see Example SLURM scripts