= DISCO Slurm information =
The [[https://disco.ethz.ch/|Distributed Computing Group (DISCO)]] owns nodes in the Slurm cluster with restricted access. The following information is an addendum to the [[Services/SLURM|main Slurm article]] in this wiki, specific to accessing these DISCO nodes.
If the information you're looking for isn't available here, please consult the [[Services/SLURM|main Slurm article]].

== Hardware ==
The following GPU nodes are reserved for exclusive use by DISCO:

||'''Server'''||'''CPU'''||'''Frequency'''||'''Cores'''||'''Memory'''||'''/scratch SSD'''||'''/scratch size'''||'''GPUs'''||'''GPU memory'''||'''GPU architecture'''||'''Operating system'''||
||tikgpu02 ||Dual Tetrakaideca-Core Xeon E5-2680 v4 ||2.40 GHz ||28 ||503 GB ||✓ ||1.1 TB ||7 Titan Xp ||12 GB ||Pascal ||Debian 11||
||tikgpu03 ||Dual Tetrakaideca-Core Xeon E5-2680 v4 ||2.40 GHz ||28 ||503 GB ||✓ ||1.1 TB ||6 Titan Xp ||12 GB ||Pascal ||Debian 11||
||tikgpu04 ||Dual Hexakaideca-Core Xeon Gold 6242 ||2.80 GHz ||32 ||754 GB ||✓ ||1.8 TB ||8 Titan RTX ||24 GB ||Turing ||Debian 11||
||tikgpu05 ||AMD EPYC 7742 ||3.4 GHz ||128 ||503 GB ||✓ ||7.0 TB ||5 Titan RTX<<BR>>2 Tesla V100 ||24 GB<<BR>>32 GB ||Turing<<BR>>Volta ||Debian 11||
||tikgpu06 ||AMD EPYC 7742 ||3.4 GHz ||128 ||503 GB ||✓ ||8.7 TB ||8 RTX 3090 ||24 GB ||Ampere ||Debian 11||
||tikgpu07 ||AMD EPYC 7742 ||3.4 GHz ||128 ||503 GB ||✓ ||8.7 TB ||8 RTX 3090 ||24 GB ||Ampere ||Debian 11||
||tikgpu08 ||AMD EPYC 7742 ||3.4 GHz ||128 ||503 GB ||✓ ||8.7 TB ||8 RTX A6000 ||48 GB ||Ampere ||Debian 11||
||tikgpu09 ||AMD EPYC 7742 ||3.4 GHz ||128 ||503 GB ||✓ ||8.7 TB ||8 RTX 3090 ||24 GB ||Ampere ||Debian 11||
||tikgpu10 ||AMD EPYC 7742 ||3.4 GHz ||128 ||2015 GB ||✓ ||8.7 TB ||8 A100 ||80 GB ||Ampere ||Debian 11||

Nodes are named `tik...` for historical reasons.

== Shared /scratch_net ==
Access to the local `/scratch` of each node is available on every node as an automount (on demand) under `/scratch_net/tikgpuNM` (replace `NM` with the number of an existing node).

 * ''On demand'' means the path to a node's `/scratch` appears on first access, e.g. after issuing `ls /scratch_net/tikgpuNM`, and disappears again when unused.
 * `scratch_clean` is active on the local `/scratch` of all nodes, meaning older data will be deleted when space is needed. For details see the man page: `man scratch_clean`.
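This makes it possible, for example, to read data staged on another node's `/scratch` from within a job running elsewhere. A minimal sketch, assuming a dataset was previously placed under `/scratch_net/tikgpu08/$USER` (the node number and all paths are placeholders, adjust them to your situation):

{{{#!highlight bash numbers=disable
# First access triggers the automount of tikgpu08's /scratch:
ls /scratch_net/tikgpu08/

# Verify it is mounted and check the available space:
df -h /scratch_net/tikgpu08

# Copy a dataset to the local /scratch of the node you are on (placeholder paths):
cp -r /scratch_net/tikgpu08/$USER/my_dataset /scratch/$USER/
}}}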
== Accounts and partitions ==
The nodes are grouped into partitions to prioritize access for different accounts:

||'''Partition'''||'''Nodes'''||'''Slurm accounts with access'''||
||disco.low||tikgpu[02-04]||disco-low||
||disco.med||tikgpu[02-07,09]||disco-med||
||disco.all||tikgpu[02-10]||disco-all||
||disco.all.phd||tikgpu[02-10]||disco-all-phd (high priority)||

Access for TIK and DISCO members is granted on request by [[mailto:servicedesk-itet@id.ethz.ch|ID CxS institute support]]. A minimal job submission example using these partitions and accounts is sketched at the end of this page.

=== Overflow into gpu.normal ===
Jobs from DISCO users will overflow to the partition [[Services/SLURM#sinfo_.2BIZI_Show_partition_configuration|gpu.normal]] if all DISCO nodes are busy, as DISCO also contributes to the general Slurm cluster besides owning its own nodes.

=== Show account membership ===
Check which accounts you are a member of with the following command:
{{{#!highlight bash numbers=disable
sacctmgr show users WithAssoc Format=User%-15,DefaultAccount%-15,Account%-15
}}}

== Rules of conduct ==
There are no limits imposed on the resources a job may request. Please be polite and share the available resources sensibly. If you need above-average resources, please coordinate with the other DISCO Slurm users.

== Improving the configuration ==
If you think the current configuration of DISCO nodes, partitions etc. could be improved:
 * Discuss your ideas with your team colleagues
 * Ask your [[mailto:servicedesk-itet@id.ethz.ch|ID CxS institute support]] who the current DISCO cluster coordinators are
 * Bring your suggestions for improvement to the coordinators
The coordinators will consolidate your ideas into a concrete change request, which we (ISG D-ITET) will implement for you.
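For orientation, here is a minimal sketch of a batch script targeting a DISCO partition, as referenced in the ''Accounts and partitions'' section above. The partition/account pair, the resource requests and the command being run are placeholders; use a partition and account you actually have access to, and request only what your job needs, in line with the rules of conduct:

{{{#!highlight bash numbers=disable
#!/bin/bash
# Hypothetical example job script: one GPU on a DISCO partition with modest,
# explicitly stated resource requests. Adjust everything to your own job.
#SBATCH --partition=disco.all
#SBATCH --account=disco-all
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=08:00:00
#SBATCH --output=%j.out

srun python my_training_script.py   # placeholder command
}}}
Submit it with `sbatch job.sh` (placeholder file name) and monitor it with `squeue -u $USER`.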