Using dependencies in Slurm
Slurm has a fairly robust set of dependencies you can use. These are set when you submit the job and can be used for setting up pipelines. A job can depend on more than one other job as well.
To use dependencies, submit the job with the following switch. If you use multiple dependency types, separate them with commas:
-d, --dependency=<dependency_list>
You may want to run a set of jobs sequentially, so that the second job runs only after the first one has completed. This can be accomplished using Slurm's job dependency options. For example, if you have two jobs, job1.batch and job2.batch, you can use job dependencies as in the example below:
[user@lnode002]$ sbatch job1.batch
Submitted batch job 111
[user@lnode002]$ sbatch --dependency=afterany:111 job2.batch
Submitted batch job 112
The flag --dependency=afterany:111 tells the batch system to start the second job only after the first job has completed. afterany indicates that job2 will run regardless of the exit status of job1, i.e. regardless of whether the batch system thinks job1 completed successfully or unsuccessfully.
Once job 111 completes, job 112 will be released by the batch system and then will run as the appropriate nodes become available.
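When the chain is submitted from a script rather than typed by hand, the job ID has to be captured from the output of sbatch. A minimal sketch, assuming sbatch's --parsable option (which prints only the job ID, optionally followed by ";" and a cluster name). The function name submit_chain is a hypothetical helper, not part of Slurm:

```shell
# submit_chain TYPE SCRIPT...   (hypothetical helper, not a Slurm command)
# Submits each script in order; every job after the first depends on the
# previous one with the given dependency type (e.g. afterany, afterok).
submit_chain() {
    dep_type=$1
    shift
    prev=""
    for script in "$@"; do
        if [ -z "$prev" ]; then
            jid=$(sbatch --parsable "$script") || return 1
        else
            jid=$(sbatch --parsable --dependency="$dep_type:$prev" "$script") || return 1
        fi
        jid=${jid%%;*}          # drop ";cluster" if --parsable appended one
        echo "$script submitted as job $jid"
        prev=$jid
    done
}
```

With the two scripts from the example above, the call would be: submit_chain afterany job1.batch job2.batch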
Exit status: The exit status of a job is the exit status of the last command that was run in the batch script. An exit status of '0' means that the batch system thinks the job completed successfully. It does not necessarily mean that all commands in the batch script completed successfully.
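Because a dependency can test this exit status, a failing step in the middle of a batch script can go unnoticed. The sketch below illustrates the rule with two tiny stand-in scripts run through sh -c. The function names are hypothetical, and false stands in for a failing step. Using set -e is one common way (not the only one) to make a script stop, and therefore fail, at the first error:

```shell
# Stand-in for a batch script whose middle step fails but whose last
# command succeeds: the script as a whole exits with status 0.
last_command_wins() {
    sh -c 'false; echo "cleanup done"'
}

# Same steps with "set -e": the script stops at the failing step and
# exits with a non-zero status, so "cleanup done" is never printed.
stop_on_first_error() {
    sh -c 'set -e; false; echo "cleanup done"'
}

last_command_wins && echo "reported as success"
stop_on_first_error || echo "reported as failure"
```

The first script is reported as successful even though one of its steps failed. The second is reported as failed.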
The --dependency flag accepts several dependency types that depend on the status of job1 (here job1 stands for the job ID of the first job), e.g.:
--dependency=afterany:job1     job2 will start after job1 completes with any exit status
--dependency=after:job1        job2 will start any time after job1 starts
--dependency=afterok:job1      job2 will run only if job1 completed with an exit status of 0
--dependency=afternotok:job1   job2 will run only if job1 completed with a non-zero exit status
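The afterok and afternotok types can be paired to give one job a success follow-up and a failure handler. A minimal sketch, assuming sbatch's --parsable and --kill-on-invalid-dep options are available in your Slurm version. The latter asks Slurm to cancel a job whose dependency can never be satisfied instead of leaving it pending. The function name and the three script arguments are hypothetical:

```shell
# submit_with_handlers MAIN ON_OK ON_FAIL   (hypothetical helper)
submit_with_handlers() {
    jid=$(sbatch --parsable "$1") || return 1
    jid=${jid%%;*}              # drop ";cluster" if present
    # Runs only if MAIN exits with status 0.
    sbatch --dependency=afterok:"$jid" --kill-on-invalid-dep=yes "$2"
    # Runs only if MAIN exits with a non-zero status.
    sbatch --dependency=afternotok:"$jid" --kill-on-invalid-dep=yes "$3"
}
```

Only one of the two follow-up jobs can ever run. Without --kill-on-invalid-dep=yes, whether the other one is cancelled or stays in the queue depends on the site configuration.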
A job can also depend on the completion of several other jobs, as in the example below:
[user@lnode002]$ sbatch job1.batch
Submitted batch job 201
[user@lnode002]$ sbatch job2.batch
Submitted batch job 202
[user@lnode002]$ sbatch --dependency=afterany:201,202 job3.batch
Submitted batch job 203
[user@lnode002]$ squeue -u $USER -S S,i,M -o "%12i %15j %4t %30E"
JOBID        NAME            ST   DEPENDENCY
201          job1.batch      R
202          job2.batch      R
203          job3.batch      PD   afterany:201,afterany:202
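When the list of job IDs is built in a script, the dependency string can be assembled programmatically. The sbatch man page documents the form type:job_id[:job_id], with colons between job IDs that share a dependency type and commas between types. A minimal sketch that produces that form (dependency_spec is a hypothetical helper name):

```shell
# dependency_spec TYPE JOBID...   (hypothetical helper)
# Prints TYPE:JOBID[:JOBID...], the colon-separated form documented
# in the sbatch man page.
dependency_spec() {
    spec=$1
    shift
    for jid in "$@"; do
        spec="$spec:$jid"
    done
    printf '%s\n' "$spec"
}

dependency_spec afterany 201 202    # prints afterany:201:202
```

It could then be used as sbatch --dependency="$(dependency_spec afterany 201 202)" job3.batch. Specs of different types can be joined with a comma, for example "$(dependency_spec afterok 201),$(dependency_spec afterany 202)".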
*) https://hpc.nih.gov/docs/job_dependencies.html
*) http://www.hpc.caltech.edu/documentation/faq/dependencies-and-pipelines