GPU Service
The EIDF GPU Service provides a containerised processing platform accelerated by NVIDIA GPUs to support scalable data processing and AI workloads.
Availability
This service is available for University of Edinburgh researchers and DDI Programme partners.
Please use the EIDF Application Portal to request access. You can find help and guidance on the process under "Service Access" in the technical documentation hosted on GitHub.
Service profile
The EIDF GPU Service comprises 160 NVIDIA A100 GPUs, of which 112 will be available to EIDF users. These GPUs are hosted on 20 servers, each of which offers 8 GPUs, 1TiB of RAM and 256 CPU cores.
The service is hosted on HPE Apollo 6500 Gen10 servers, each of which offers 8 GPUs, 1TiB of RAM and 256 CPU cores. The EIDF GPU Service is orchestrated with Kubernetes, and storage is provided by Ceph pools. Each node can expose up to 8 GPUs, either as full GPUs or as MIG (Multi-Instance GPU) partitions. Node configurations will be reviewed and updated during service operation.
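As a rough illustration of how a workload might ask Kubernetes for a GPU on a service like this, the sketch below builds a minimal Job manifest in Python and prints it as JSON (which kubectl accepts as well as YAML). It is not taken from the EIDF documentation: the job name, the container image and the default command are hypothetical placeholders, and the `nvidia.com/gpu` resource name is the one conventionally advertised by the NVIDIA Kubernetes device plugin, assumed here to apply to this cluster.

```python
import json


def gpu_job_manifest(name, image, gpus=1, cpus=4, memory_gib=16, command=None):
    """Build a minimal Kubernetes Job manifest that requests GPUs.

    Assumption: GPUs are exposed under the conventional `nvidia.com/gpu`
    resource name; check the service documentation for the exact name
    and for any node selectors needed to pick full or MIG GPUs.
    """
    resources = {
        "cpu": str(cpus),
        "memory": f"{memory_gib}Gi",
        "nvidia.com/gpu": gpus,
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    # Jobs require a restart policy of Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Default command assumes nvidia-smi is
                            # available inside the container.
                            "command": command or ["nvidia-smi"],
                            # Requests equal limits, so the pod reserves
                            # exactly what it is allowed to use.
                            "resources": {
                                "requests": dict(resources),
                                "limits": dict(resources),
                            },
                        }
                    ],
                }
            }
        },
    }


# Hypothetical job name and image, for illustration only.
manifest = gpu_job_manifest("gpu-test", "example.org/my-image:latest", gpus=1)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and submitting it with `kubectl create -f <file>` in the project namespace would be the usual next step, assuming access has already been granted.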
A standard project namespace has the following initial quota (subject to ongoing review):
- CPU: 100 cores
- Memory: 1TiB
- GPU: 12
Note that these quotas are the maximum a single project can use at any one time. During periods of high usage, Kubernetes Jobs may be queued while waiting for resources to become available on the cluster.
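To make the quota arithmetic concrete, here is a minimal sketch in Python that checks whether a new request would fit alongside what a project is already using. The quota figures come from the list above (1TiB expressed as 1024GiB); the function name, the dictionary keys and the example usage numbers are hypothetical, and the real enforcement is carried out by Kubernetes itself, not by user code.

```python
# Initial per-namespace quota from the service profile (1 TiB = 1024 GiB).
QUOTA = {"cpu": 100, "memory_gib": 1024, "gpu": 12}


def exceeded_resources(in_use, request, quota=QUOTA):
    """Return the resource names that would go over quota.

    An empty list means the request fits within the project quota.
    Resources missing from `in_use` or `request` count as zero.
    """
    return [
        name
        for name, limit in quota.items()
        if in_use.get(name, 0) + request.get(name, 0) > limit
    ]


# Hypothetical example: a project already running jobs on 8 GPUs
# asks for a further 8, which would take it past the 12-GPU quota.
in_use = {"cpu": 64, "memory_gib": 512, "gpu": 8}
request = {"cpu": 32, "memory_gib": 256, "gpu": 8}
print(exceeded_resources(in_use, request))
```

A request that fits the quota may still wait in a queue if the cluster as a whole is busy, since the quota is a per-project ceiling and not a reservation.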
Introductory Materials for Containers
To make the best use of the service, an understanding of containerisation is recommended. An introductory tutorial on containers for computational environments is available through the Carpentries workshop.