DGX maintenance completed

Dear users,
The DGX maintenance has been completed, and the cluster has been back in production since yesterday evening.
During this maintenance:

SLURM has been updated to version 21.08.8-2.
A new version of the NVIDIA HPC SDK has been installed (release 2022, version 22.3).
The maximum wall time of the QoS “dgx_qos_sprod” has been extended from 12 hours to 48 hours.
A new partition, “dgx_usr_preempt”, has been defined. It is free of charge and places no limit on the number of running jobs per user, but it has low priority: your jobs may be killed at any moment if a higher-priority job requests the resources.
The DGX User Guide has been updated. Please visit https://wiki.u-gov.it/confluence/display/SCAIUS/UG3.4%3A+DGX+A100+UserGuide
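
As an illustration of the new partition and the extended QoS limit, the following is a minimal sketch that renders SLURM batch-script headers in Python. The helper name render_batch_script, the script names train.sh and long_run.sh, and the GPU count are hypothetical. The options themselves (--partition, --qos, --time, --gres, --requeue) are standard sbatch options. Whether a preempted job is actually requeued depends on the site configuration, which this notice does not state, so please check the DGX User Guide for the authoritative settings (including any required account or partition for the QoS).

```python
def render_batch_script(command, time_limit, partition=None, qos=None,
                        gpus=1, requeue=False):
    """Return the text of a SLURM batch script (hypothetical helper)."""
    lines = ["#!/bin/bash", f"#SBATCH --time={time_limit}"]
    if partition:
        lines.append(f"#SBATCH --partition={partition}")
    if qos:
        lines.append(f"#SBATCH --qos={qos}")
    # Assumption: GPUs are requested through the generic "gpu" resource.
    lines.append(f"#SBATCH --gres=gpu:{gpus}")
    if requeue:
        # Allows the job to be requeued; actual behavior on preemption
        # is decided by the cluster configuration.
        lines.append("#SBATCH --requeue")
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Job on the free, low-priority partition: it may be killed at any time,
# so the workload itself should checkpoint regularly.
preempt_job = render_batch_script(
    "./train.sh",                 # hypothetical user script
    time_limit="04:00:00",
    partition="dgx_usr_preempt",
    requeue=True,
)

# Job using the QoS whose maximum wall time is now 48 hours.
long_job = render_batch_script(
    "./long_run.sh",              # hypothetical user script
    time_limit="48:00:00",
    qos="dgx_qos_sprod",
)

print(preempt_job)
print(long_job)
```

Each string can be saved to a file and submitted with sbatch. Because jobs in “dgx_usr_preempt” can be killed without notice, saving intermediate results at regular intervals is advisable regardless of the requeue setting.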

Best regards,
HPC User Support @CINECA