
Slurm Pilot project for Biwi

  • The following information is an abbreviated How-To with specific information for the pilot cluster
  • Our official documentation for slurm is the Computing wiki article; you need to read it as well.

  • Please bear in mind that the final documentation is meant to be the above article. Based on your feedback, it will be extended with information valid for all slurm users and with a specific section or additional page concerning only Biwi.
  • The goal of this pilot is to produce documentation that enables SGE users to migrate their jobs to Slurm. This also means the section Accounts and limits is purely informative at the moment.

Pilot-specific information

Involved machines are

  • biwirender01 for CPU computing

  • biwirender03 for GPU-computing

All available GPU partitions are overlaid on biwirender03. They will be available on different nodes in the final cluster.

/!\ long partitions are not yet implemented in the pilot!

Initialising slurm

All slurm commands read the cluster configuration from the file named in the environment variable SLURM_CONF, so it needs to be set:

export SLURM_CONF=/home/sladmcvl/slurm/slurm.conf
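
To avoid retyping this in every new shell, the export can be placed in a shell startup file. The following is a minimal sketch assuming bash; the function name use_pilot_slurm_conf is made up for this example, and the readability check is an addition meant to fail early when the file cannot be reached:

```shell
# Hypothetical helper: export SLURM_CONF only if the config file is readable.
# The default path is the one given above; pass another path as first argument.
use_pilot_slurm_conf() {
    local conf="${1:-/home/sladmcvl/slurm/slurm.conf}"
    if [ -r "$conf" ]; then
        export SLURM_CONF="$conf"
    else
        echo "slurm.conf not readable: $conf" >&2
        return 1
    fi
}

# Usage, e.g. at the end of ~/.bashrc:
# use_pilot_slurm_conf
```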

If you're interested, feel free to have a look at the configuration; feedback is welcome!

Available partitions

What SGE calls queues, slurm calls partitions.
sinfo shows all available partitions:


cpu.medium.normal    up 2-00:00:00      1   idle biwirender01
gpu.low.normal       up 2-00:00:00      1   idle biwirender03
gpu.medium.normal    up 2-00:00:00      1   idle biwirender03
gpu.medium.long      up 5-00:00:00      1   idle biwirender03
gpu.high.normal      up 2-00:00:00      1   idle biwirender03
gpu.high.long        up 5-00:00:00      1   idle biwirender03
gpu.debug            up    6:00:00      1   idle biwirender03
gpu.mon              up    6:00:00      1   idle biwirender03

Only the interactive partitions gpu.debug and gpu.mon can and should be specified explicitly (see below). For all other jobs, the scheduler decides which partition to use based on the resources the job requests.
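
The time limits in the third column use the form days-hours:minutes:seconds. As a small illustration, the sketch below converts them to hours. It assumes its input consists of "partition timelimit" pairs, as printed by sinfo --noheader --format="%P %l"; the function name partition_hours is made up for this example, and limits in other forms (such as "infinite") are not handled:

```shell
# Reads "<partition> <timelimit>" lines from stdin and prints each
# partition followed by its time limit in whole hours.
# Handles D-HH:MM:SS and HH:MM:SS; anything else counts as 0 hours.
partition_hours() {
    awk '{
        days = 0; t = $2
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, hms, ":")
        hours = (n == 3) ? hms[1] : 0
        printf "%s %d\n", $1, days * 24 + hours
    }'
}

# Usage on the cluster:
# sinfo --noheader --format="%P %l" | partition_hours
```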

Interactive jobs

For testing purposes a job with an interactive session with 1 GPU can be started:

srun --time 10 --partition=gpu.debug --gres=gpu:1 --pty bash -i
  • Such jobs are placed in gpu.debug by the scheduler

To monitor a running job, an interactive session can be started by explicitly selecting the monitoring partition. The node on which the batch job is running needs to be specified as well:

srun --time 10 --partition=gpu.mon --nodelist=biwirender03 --pty bash -i
  • Allocating GPU resources is prohibited for such interactive jobs
  • It may be possible to attach an interactive session to an already running job, which will make the above obsolete. This is still under investigation at the moment.
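
The node name for the monitoring session can be looked up from the job ID instead of being typed by hand. The sketch below only assembles the srun command shown above; the function name monitor_command is made up for this example, and the squeue lookup in the usage comment assumes the job is already running on a single node:

```shell
# Hypothetical helper: print the srun command that opens a monitoring
# shell on the node given as first argument.
# The optional second argument is the time limit in minutes (default 10).
monitor_command() {
    local node="$1" minutes="${2:-10}"
    printf 'srun --time %s --partition=gpu.mon --nodelist=%s --pty bash -i\n' \
        "$minutes" "$node"
}

# Usage on the cluster, for a running job with ID 133:
# node=$(squeue --noheader --jobs=133 --format=%N)
# eval "$(monitor_command "$node")"
```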

Allocating resources


For a job to have access to a GPU, GPU resources need to be requested with the option --gres=gpu:<n>, where <n> is the number of GPUs.
Here's the sample job submission script primes_1GPU.sh requesting 1 GPU:

#!/bin/bash
#SBATCH  --mail-type=ALL
#SBATCH  --gres=gpu:1
#SBATCH  --output=log/%j.out
export LOGFILE=`pwd`/log/$SLURM_JOB_ID.out
# env | grep SLURM_ #Uncomment this line to show environment variables set by slurm for a job
# binary to execute
codebin/primes $1
echo ""
echo "Job statistics: "
sstat -j $SLURM_JOB_ID --format=JobID,AveVMSize%15,MaxRSS%15,AveCPU%15
echo ""
exit 0;
  • Make sure the directory in which the logfiles are stored exists before submitting a job.
  • Please keep the environment variable LOGFILE. The scheduler's epilog script uses it to append information to your logfile after your job has ended, at which point the epilog no longer has access to $SLURM_JOB_ID.

  • slurm also sets CUDA_VISIBLE_DEVICES. See the section GPU jobs in the main slurm article.

  • A job requesting more GPUs than allowed by the QOS of the user's account (see Accounts and limits) will stay in "PENDING" state.
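
The first point above is easy to forget, so the directory can be created as part of the submission. A minimal sketch, assuming the job is submitted from the directory that should contain log/; the function name submit_with_logdir and the SBATCH_CMD variable are made up for this example (the latter only allows a dry run with echo):

```shell
# Hypothetical wrapper: create the log directory in the current working
# directory, then pass all arguments on to sbatch.
submit_with_logdir() {
    mkdir -p log || return 1
    # SBATCH_CMD is an assumption of this sketch, not a slurm variable.
    ${SBATCH_CMD:-sbatch} "$@"
}

# Usage:
# submit_with_logdir primes_1GPU.sh 1000
```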


If you omit the --mem option, the defaults of 30G of memory and 3 CPUs per GPU will be allocated to your job, which makes the scheduler choose gpu.medium.normal:

sbatch primes_1GPU.sh
sbatch: GRES requested     : gpu:1
sbatch: GPUs requested     : 1
sbatch: Requested Memory   : ---
sbatch: CPUs requested     : ---
sbatch: Your job is a gpu job.
Submitted batch job 133

squeue --Format jobarrayid:8,partition:20,reasonlist:20,username:10,tres-alloc:45,timeused:10

JOBID   PARTITION           NODELIST(REASON)    USER      TRES_ALLOC                                   TIME
133     gpu.medium.normal   biwirender03        testuser  cpu=3,mem=30G,node=1,billing=3,gres/gpu=1    0:02

An explicit --mem option selects the partition as follows:



< 30G


30G - 50G


>50G - 70G



not allowed

For example with:

sbatch --mem=50G primes_2GPU.sh

the above squeue command shows:

JOBID   PARTITION           NODELIST(REASON)    USER      TRES_ALLOC                                   TIME
136     gpu.high.normal     biwirender03        testuser  cpu=6,mem=100G,node=1,billing=6,gres/gpu=2   0:28
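
Both squeue outputs above are consistent with a simple per-GPU calculation: 3 CPUs per GPU, and the memory value multiplied by the number of GPUs. The sketch below reproduces that arithmetic. That --mem is counted per GPU is an inference from these two outputs, not a documented rule, and the function name expected_tres is made up for this example:

```shell
# Hypothetical helper: print the CPU, memory and GPU allocation expected
# for a job with $1 GPUs and $2 GB of memory per GPU (default 30).
# Assumes 3 CPUs per GPU and per-GPU memory scaling, as inferred above.
expected_tres() {
    local gpus="$1" mem_per_gpu="${2:-30}"
    echo "cpu=$((gpus * 3)),mem=$((gpus * mem_per_gpu))G,gres/gpu=${gpus}"
}

# Usage:
# expected_tres 1       # defaults, as for job 133
# expected_tres 2 50    # as for job 136
```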

Accounts and limits

In slurm lingo an account is equivalent to a user group. The following accounts are configured for users to be added to:

sacctmgr show account

   Account                Descr                  Org
---------- -------------------- --------------------
  deadconf  deadline_conference                 biwi
  deadline             deadline                 biwi
       isg                  isg                  isg
      root default root account                 root
     staff                staff                 biwi
   student              student                 biwi
  • Accounts isg and root are not accessible to Biwi

GPU limits are stored in so-called QOS. Each account is associated with the QOS we want to apply to it, and its limits apply to all users added to that account.

sacctmgr show assoc format=account%15,user%15,partition%15,maxjobs%8,qos%15,defaultqos%15

        Account            User       Partition  MaxJobs             QOS         Def QOS
--------------- --------------- --------------- -------- --------------- ---------------
           root                                                   normal
           root            root                                   normal
       deadconf                                                    gpu_4           gpu_4
       deadline                                                    gpu_3           gpu_3
       deadline        ........                                    gpu_3           gpu_3
            isg                                                   normal
            isg        sladmall                                   normal
          staff                                                    gpu_2           gpu_2
          staff        ........                                    gpu_2           gpu_2
          staff        ........                                    gpu_2           gpu_2
          staff        ........                                    gpu_2           gpu_2
          staff        ........                                    gpu_2           gpu_2
          staff        ........                                    gpu_2           gpu_2
        student                                                    gpu_1           gpu_1

The gpu_x QOS contain only a limit on the number of GPUs per user:

sacctmgr show qos format=name%15,maxtrespu%30

          Name                      MaxTRESPU
--------------- ------------------------------
          gpu_1                     gres/gpu=1
          gpu_2                     gres/gpu=2
          gpu_3                     gres/gpu=3
          gpu_4                     gres/gpu=4
          gpu_5                     gres/gpu=5
          gpu_6                     gres/gpu=6
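
For use in scripts, the limit of a single QOS can be extracted from the machine-readable form of this listing. The sketch below assumes input lines of the form name|MaxTRESPU, as produced by sacctmgr with the --noheader and --parsable2 options; the function name qos_gpu_limit is made up for this example:

```shell
# Hypothetical helper: read "name|MaxTRESPU" lines from stdin and print
# the gres/gpu limit of the QOS named in the first argument.
# Prints nothing if the QOS is missing or has no gres/gpu limit.
qos_gpu_limit() {
    awk -F'|' -v qos="$1" '
        $1 == qos && $2 ~ /gres\/gpu=/ {
            limit = $2
            sub(/.*gres\/gpu=/, "", limit)   # drop everything up to the value
            sub(/,.*/, "", limit)            # drop any further TRES entries
            print limit
        }'
}

# Usage on the cluster:
# sacctmgr --noheader --parsable2 show qos format=name,maxtrespu | qos_gpu_limit gpu_2
```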

Users with administrative privileges can move a user between accounts deadline or deadconf.

List associations of testuser:

sacctmgr show assoc where user=testuser format=account%15,user%15,partition%15,maxjobs%8,qos%15,defaultqos%15

        Account            User       Partition  MaxJobs             QOS         Def QOS
--------------- --------------- --------------- -------- --------------- ---------------
       deadline        testuser                                    gpu_3           gpu_3

Move testuser from deadline to staff:

/home/sladmcvl/slurm/change_account_of_user.sh testuser deadline staff

List associations of testuser again:

sacctmgr show assoc where user=testuser format=account%15,user%15,partition%15,maxjobs%8,qos%15,defaultqos%15

        Account            User       Partition  MaxJobs             QOS         Def QOS
--------------- --------------- --------------- -------- --------------- ---------------
          staff        testuser                                    gpu_2           gpu_2

Users with administrative privileges can be shown with:

sacctmgr show user format=user%15,defaultaccount%15,admin%15
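
To list only the privileged users, the output can be filtered on the admin column. A minimal sketch, assuming input lines of the form user|admin from sacctmgr's --noheader and --parsable2 options, and assuming that users without privileges show "None" in that column; the function name list_admins is made up for this example:

```shell
# Hypothetical helper: read "user|admin" lines from stdin and print the
# names of users whose admin level is set to something other than "None".
list_admins() {
    awk -F'|' '$2 != "None" && $2 != "" { print $1 }'
}

# Usage on the cluster:
# sacctmgr --noheader --parsable2 show user format=user,admin | list_admins
```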

Last words

Have fun using SLURM for your jobs!

