Using batch jobs
Slurm and job scripts
Clusters are designed to run programs in batch, and the way you specify where, how, and what to run is through a job script, also sometimes called a submit script or a Slurm script.
A job script has two parts: a preamble, which tells the scheduler and the job manager about the job, and the commands, which are what the job will actually do.
The preamble consists of directives to the scheduler. Each directive appears on a line that begins with #SBATCH and is followed by an option specification.
Commands are what you would type at a prompt to perform the computation.
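As a minimal sketch of the two parts, the script below pairs a short preamble with two commands. The job name and time limit are placeholder values chosen for illustration, not recommendations. Because each #SBATCH line begins with #, the shell treats it as a comment and only the scheduler reads it. Slurm stops looking for directives at the first command, so the preamble has to come first.

```shell
#!/bin/bash
#--------- Preamble: read by the scheduler, ignored by the shell as comments
#SBATCH --job-name=two_parts
#SBATCH --time=00:05:00

#--------- Commands: run by the shell once the job starts
node_name=$(uname -n)   # name of the machine the job is running on
echo "Running on ${node_name}"
```

If this were saved as two_parts.sh (a hypothetical file name), it would be submitted with sbatch two_parts.sh. On a cluster that requires an account or partition, those directives would also be needed.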
The preamble
Here is an example of a simple preamble for a job. Each line will be explained below the example.
#!/bin/bash
#--------- Begin preamble
#SBATCH --job-name=test_job
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=7g
#SBATCH --account=example
#SBATCH --partition=standard
#SBATCH --mail-type=NONE
#--------- End preamble
The first option sets the job name, which will be displayed if you ask the system about your job.
The second option says you would like the job to be able to run for up to an hour. It can end sooner, and on an ARC cluster your account will be charged only for the actual running time, but if the job is still running after an hour, the job manager will stop it.
The next four options specify the resources that you want your job to have when it runs. The first of these, --nodes, refers to the number of physical machines on which to run. The second, --ntasks-per-node, refers to how many tasks (typically separate processes) can run on each node. The third, --cpus-per-task, says how many cores each task will be able to use. The fourth, --mem, says how much memory per node you want your job to have available.
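The node, task, and core options multiply together: the total number of cores requested is nodes × tasks per node × cores per task. Because --mem is per node, the total memory is the per-node figure times the number of nodes. The sketch below works through that arithmetic with illustrative values (2 nodes, 4 tasks per node, 2 cores per task, 7 GB per node) that are assumptions, not taken from the example preamble.

```shell
#!/bin/bash
# Illustrative values only; they mirror the --nodes, --ntasks-per-node,
# --cpus-per-task, and --mem options but are not from the example preamble.
nodes=2
ntasks_per_node=4
cpus_per_task=2
mem_per_node_gb=7

total_tasks=$(( nodes * ntasks_per_node ))       # tasks across all nodes
total_cores=$(( total_tasks * cpus_per_task ))   # cores across all tasks
total_mem_gb=$(( nodes * mem_per_node_gb ))      # memory across all nodes

echo "tasks=${total_tasks} cores=${total_cores} memory=${total_mem_gb}G"
```

With the example preamble's own values (1 node, 1 task per node, 1 core per task, 7g), the same arithmetic gives one task, one core, and 7 GB.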
The --account option says which account will pay for this job.
The --partition option says what class of machine it will run on. On some clusters, the partition is by machine type, for example, standard nodes versus nodes with extra memory. On other clusters, it may refer to the machines owned by a particular investigator or group.
The last option, --mail-type, indicates when you would like the system to send you an e-mail about your job. If you are running many jobs, NONE is often the best choice, since you are likely to be checking on progress. If only one or two jobs are being run at once, you might wish to get mail when a job completes (successfully or not). See the user guide for the cluster you are using for information about the options.
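For the case of only one or two jobs, a preamble fragment along these lines would ask for mail when the job ends or fails. END and FAIL are standard Slurm values for --mail-type. The address is a placeholder, and whether --mail-user is needed at all depends on how the cluster is configured.

```shell
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=your_name@example.edu
```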
Commands
Any command that you can type at a prompt and that does not require further input from you can be put into the command section. One common command that is included is module, which is used to make software available to your job. So, for example, if your job needs to run a program that was compiled with GCC version 8.2.0, your job script might contain
module load gcc/8.2.0
We recommend that you put the my_job_header command in your job script after the preamble and any module commands but before any other commands. It will print into your output file useful information about your job, which can be helpful if a problem is encountered or if you refer to the output long after the job ran and want to know what it ran with.
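Putting the recommended order together, a job script might be laid out as in the sketch below: preamble, then module commands, then my_job_header, then the computation. The preamble repeats the example above, and the final echo is a placeholder for a real program. The checks for module and my_job_header are assumptions added so the sketch also runs on a machine that lacks them; on the cluster they could be called directly.

```shell
#!/bin/bash
#--------- Begin preamble
#SBATCH --job-name=test_job
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=7g
#SBATCH --account=example
#SBATCH --partition=standard
#SBATCH --mail-type=NONE
#--------- End preamble

# 1. Module commands (guarded so the sketch runs where no module system exists)
if type module >/dev/null 2>&1; then
    module load gcc/8.2.0
fi

# 2. Job header, after the modules and before any other commands
if type my_job_header >/dev/null 2>&1; then
    my_job_header
fi

# 3. The computation itself; this echo is a placeholder for a real program
result="computation would run here"
echo "$result"
```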