Advanced Research Computing: Using batch jobs


Slurm and job scripts

Clusters are designed to run programs in batch, and the way you specify where, how, and what to run is through a job script, also sometimes called a submit script or a Slurm script.

A job script has two parts: a preamble, which tells the scheduler and the job manager about the job, and the commands, which are what the job will actually do.

The preamble consists of directives to the scheduler. Each directive appears on a line that begins with #SBATCH and is followed by an option specification.

Commands are what you would type at a prompt to perform the computation.

The preamble

Here is an example of a simple preamble for a job. Each line will be explained below the example.

#!/bin/bash

#---------  Begin preamble
#SBATCH --job-name=test_job

#SBATCH --time=1:00:00

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=7g

#SBATCH --account=example
#SBATCH --partition=standard

#SBATCH --mail-type=NONE
#---------  End preamble

The first option sets the job name, which will be displayed if you ask the system about your job.

The second option says you would like the job to be able to run for up to an hour. It can end sooner, and on an ARC cluster your account will only be charged for the actual running time, but if the job is still running after an hour, the job manager will stop it.
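As a rough illustration of how a limit written as hours:minutes:seconds is read, the sketch below converts the value from the example preamble into seconds. The function name limit_to_seconds is hypothetical, and it handles only this one format; Slurm accepts other forms as well (for example days-hours:minutes:seconds), which this sketch does not cover.

```shell
#!/bin/bash
# Hypothetical helper: convert an H:MM:SS time limit into seconds.
# Only the hours:minutes:seconds form is handled here.
limit_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that fields such as "08" are not read as octal.
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# The limit from the example preamble; prints 3600.
limit_to_seconds "1:00:00"
```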

The next four options specify the resources that you want your job to have when it runs. The first of these, --nodes, is the number of physical machines on which to run. The second, --ntasks-per-node, is how many tasks (separate processes) can run on each node. The third, --cpus-per-task, says how many cores each task will be able to use. The fourth, --mem, says how much memory per node you want your job to have available.
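The first three resource options multiply together: the number of cores a job requests is nodes times tasks per node times cores per task. The following is a minimal sketch of that arithmetic; the function name total_cores is hypothetical and is not part of Slurm.

```shell
#!/bin/bash
# Hypothetical helper: total cores implied by the three resource options.
# Arguments: nodes, tasks per node, CPUs per task.
total_cores() {
    echo $(( $1 * $2 * $3 ))
}

# Values from the example preamble: 1 node, 1 task per node, 1 CPU per task.
total_cores 1 1 1    # prints 1

# A larger request: 2 nodes, 4 tasks per node, 3 CPUs per task.
total_cores 2 4 3    # prints 24
```

Because --mem is a per-node amount, the larger request above combined with --mem=7g would ask for 7 GB on each of the two nodes.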

The --account option says which account will pay for this job.

The --partition option says what class of machine it will run on. On some clusters, the partition is by machine type, for example, standard nodes versus nodes with extra memory. On other clusters, it may refer to the machines owned by a particular investigator or group.

The last option, --mail-type, indicates when you would like the system to send you an e-mail about your job. If you are running many jobs, NONE is often the best choice, since you are likely to be checking on their progress anyway. If only one or two jobs are being run at once, you might wish to get mail when a job completes (successfully or not). See the user guide for the cluster you are using for information about the available options.
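For the case of one or two jobs, the directives might look like the sketch below. END and FAIL are standard Slurm mail types and --mail-user is a standard Slurm option, but the address shown is only a placeholder, and the user guide for your cluster remains the authority on which values are supported.

```shell
#!/bin/bash
# Send mail when the job ends or fails (address is a placeholder).
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=uniqname@example.edu
```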

Commands

Any command that you can type at a prompt and that does not require further input from you can be put into the command section. One commonly included command is module, which is used to make software available to your job. So, for example, if your job needs to run a program that was compiled with GCC version 8.2.0, your job script might contain:

module load gcc/8.2.0

We recommend that you put the my_job_header command in your job script after the preamble and any module commands but before any other commands. It prints useful information about your job into your output file, which can be helpful if a problem is encountered or if you return to the output long after the job ran and want to know how it was run.
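Putting the pieces together, a complete job script could look like the sketch below. The snippet writes the script to a file named test_job.sh so that it can be inspected without a cluster. The program name my_program is hypothetical, and the account and partition values are the placeholders from the example preamble, so they would need to be replaced with real ones before submitting.

```shell
#!/bin/bash
# Write an example job script to test_job.sh in the current directory.
cat > test_job.sh <<'EOF'
#!/bin/bash

#---------  Begin preamble
#SBATCH --job-name=test_job
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=7g
#SBATCH --account=example
#SBATCH --partition=standard
#SBATCH --mail-type=NONE
#---------  End preamble

# Make software available to the job
module load gcc/8.2.0

# Record information about the job in the output file
my_job_header

# The computation itself (my_program is a hypothetical executable)
./my_program
EOF

# On a cluster login node, the script would then be submitted with:
# sbatch test_job.sh
```

The ordering follows the recommendation above: preamble first, then module commands, then my_job_header, then the commands that do the work.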