Opening of DGX A100 system


Dear Users,
We are happy to announce that the new NVIDIA DGX A100 system is now available in the pre-production phase. The system is based on the AMD Rome architecture and NVIDIA A100 GPUs, and it is designed to provide a universal platform for all AI workloads, from analytics to training to inference.
Access to the system is granted via the Iscra program.
Due to the particular features of the system, it is not configured exactly like the standard CINECA HPC clusters:
– the system is accessible to users with active projects on DGX via a secure ssh connection to the login node login.dgx.cineca.it;
– the software offered by the module environment is minimal; users are encouraged to run their simulations in Singularity containers, pulling them from the NVIDIA Docker or Singularity repositories;
– the most representative AI datasets are already available on fast local disks and easily accessible via modules; we encourage users to rely on these datasets and to send any request for additional datasets to superc@cineca.it;
– the production environment relies on job submission to the SLURM scheduler (as on all the other CINECA clusters);
– the storage provides the standard user areas, $HOME and $CINECA_SCRATCH, as Gluster distributed filesystems on login and compute nodes. No $WORK area is provided to the accepted projects;
– fast NVMe (Non-Volatile Memory) storage areas are available on each node, offering a significant speed-up in I/O operations with respect to the Gluster filesystems; these areas are accessible to jobs via the $TMPDIR variable, and their contents are removed upon job completion.
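As a minimal sketch of the workflow described above, the session below logs in, pulls a container, and submits a SLURM job that stages data through the fast $TMPDIR area. The container tag, script names, and the `<username>`/`<project_account>` placeholders are hypothetical examples, not values prescribed by the system; check the User Guide for the actual account and submission options.

```shell
# Connect to the DGX login node (requires an active project on DGX):
ssh <username>@login.dgx.cineca.it

# Pull a container from the NVIDIA registry (the tag below is an example):
singularity pull pytorch.sif docker://nvcr.io/nvidia/pytorch:21.03-py3

# Sample SLURM job script: stage the dataset onto the node-local NVMe
# area via $TMPDIR, run the containerized workload with GPU support
# (--nv), then copy results back before $TMPDIR is wiped at job end.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=train
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --account=<project_account>

cp -r "$CINECA_SCRATCH"/dataset "$TMPDIR"/        # fast NVMe staging
singularity exec --nv pytorch.sif \
    python train.py --data "$TMPDIR"/dataset
cp -r results "$CINECA_SCRATCH"/                  # persist the output
EOF

sbatch job.sh
```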
A preliminary User Guide is available at the link:
https://wiki.u-gov.it/confluence/display/SCAIUS/UG3.4%3A+DGX+A100+User+G…
Best regards,
HPC User Support @ CINECA