
Advanced Research Computing

Using job arrays


What is a job array?

A job array is a way to submit many jobs with a single job script and a single sbatch command. Once the array has been submitted, one or more array jobs are created, each of which has its own ID number. In the simplest case, the IDs start at 0 or 1 and increase by 1 until all the IDs have been used. An array can also be given a list of specific numbers to use, or a combination of ranges and individual numbers.

When are arrays useful?

Arrays are useful in a few contexts.

  • If the program you are running starts with a completely random value and you need to run it N times, arrays are perfect and very easy. This applies to many Monte Carlo simulation programs.

  • If you have a list of cases to run, as you might with a parameter sweep, and the parameter values can be expressed as a function of the array index, each array job can compute its own parameters from its ID.

  • If you have a list of input files, then either the filenames can contain the array ID number and be read directly, or the filenames (or some portion of each name) can be listed in a file, one per line, and the array ID used to pick which line to read to get the filename.
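As a minimal sketch of the parameter sweep case, assume a hypothetical 3 x 4 grid of temperatures and pressures submitted with --array=0-11. The function name sweep_params, the parameter names, and their values are illustrative only; none of them comes from Slurm.

```shell
#!/bin/bash
#SBATCH --array=0-11

# Hypothetical sweep: 3 temperatures x 4 pressures = 12 cases.
n_pressures=4

# Map one array index (0-11) to a "temperature pressure" pair.
sweep_params() {
    idx=$1
    temp=$(( 280 + 20 * (idx / n_pressures) ))   # 280, 300, 320
    pressure=$(( 1 + idx % n_pressures ))        # 1, 2, 3, 4
    echo "$temp $pressure"
}

# Assumption: fall back to 0 so the script can be dry-run outside Slurm.
ID=${SLURM_ARRAY_TASK_ID:-0}
params=$(sweep_params "$ID")
echo "array task ${ID}: temperature and pressure = ${params}"
```

Integer division selects the temperature and the remainder selects the pressure, so every index from 0 to 11 maps to a distinct pair.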

Using arrays

Setting the array IDs

Only one line is needed to make a normal job script into an array job script.

#SBATCH --array=<array specifier>

The array specifier can be a range (1-10), a list (1,3,8), or a combination of the two (1-5,8,10). Do not put spaces after the commas, as that will generate an error.
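For context, here is a sketch of what a complete array job script might look like. The job name, time limit, and output filename are placeholder choices, not site requirements. In the output pattern, %x expands to the job name, %A to the array's master job ID, and %a to the array ID.

```shell
#!/bin/bash
#SBATCH --job-name=array-demo
#SBATCH --time=00:05:00
#SBATCH --array=1-5,8,10
#SBATCH --output=%x-%A_%a.out

# Slurm sets SLURM_ARRAY_TASK_ID in each array job. The fallback to 1
# is an assumption so the script can be dry-run outside Slurm.
ID=${SLURM_ARRAY_TASK_ID:-1}

message="Running array task ${ID}"
echo "$message"
```

With this specifier, seven array jobs are created, with IDs 1, 2, 3, 4, 5, 8, and 10, and each writes to its own output file.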

Using the array IDs

The SLURM_ARRAY_TASK_ID variable will contain the value of the current job’s array ID. This can be used from inside the job script. Here is an example that uses the array ID as part of a filename given to a script to count words.

ID=$SLURM_ARRAY_TASK_ID

wordcounter text_${ID}.txt

If the array ID were 42, then the command for that job would be

wordcounter text_42.txt
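If the input files use zero-padded numbers instead, printf can build the matching name. This is only a sketch: the three-digit text_NNN.txt scheme and the helper name input_name are assumptions made for illustration.

```shell
# Build the input filename for one array ID, zero-padded to three digits.
# The text_NNN.txt naming scheme is an assumption for illustration.
input_name() {
    printf 'text_%03d.txt\n' "$1"
}

# Assumption: fall back to 42 so the snippet can be tried outside Slurm.
ID=${SLURM_ARRAY_TASK_ID:-42}
echo "would run: wordcounter $(input_name "$ID")"
```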

Another way to use array IDs is to create a file that contains lines of information for your program. One of the simplest examples is a file that lists subject IDs. For example, suppose we have this list in a file called subjects.lst.

sub-ctrl-18452
sub-ctrl-19613
sub-ctrl-23752
sub-exp-92316
sub-exp-74317
sub-exp-84252

We have six subjects, so we could use a job array to process each one in a different array job. The job script needs to be able to extract one line from subjects.lst to process. If we create a six-element array with

#SBATCH --array=1-6

then the array IDs will be 1, 2, 3,..., 6. If each job used its array ID as the line number to read from the file, that would give each job its own subject ID. One way to do that is to set ID to the array ID, read that many lines from the top of subjects.lst, and then keep only the last of those lines. We can do that with this code.

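# The $ is required to read the variable's value
ID=$SLURM_ARRAY_TASK_ID
# Take the first ID lines, then keep only the last of them
subjectID=$(head -n "${ID}" subjects.lst | tail -n 1)

process_subject "$subjectID"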
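An alternative sketch uses sed -n to print a single line directly, and adds a range check so that an array ID with no matching line fails with an error instead of silently reusing the last line of the file. The helper name nth_line is hypothetical.

```shell
# Print line N of a file, or fail if N is out of range.
nth_line() {
    n=$1
    file=$2
    # Count lines; the arithmetic expansion strips any padding from wc.
    total=$(( $(wc -l < "$file") ))
    if [ "$n" -lt 1 ] || [ "$n" -gt "$total" ]; then
        echo "line $n is out of range for $file ($total lines)" >&2
        return 1
    fi
    sed -n "${n}p" "$file"
}

# Possible use in the job script (assumes subjects.lst is in the
# directory the job runs in):
#   subjectID=$(nth_line "$SLURM_ARRAY_TASK_ID" subjects.lst) || exit 1
#   process_subject "$subjectID"
```

The check matters because head and tail do not complain when the requested line is past the end of the file: an array defined as 1-7 against a six-line list would process the last subject twice.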