Cluster:Compute nodes

Revision as of 20:22, 27 November 2021

Targeting a specific node

Targeting a specific node can be done in two ways: either by selecting a node name directly, or by requiring certain labels on the node. See the table below for node names and associated labels. See the Kubernetes API documentation on assigning pods to nodes, or refer to the following examples, which should be largely self-explanatory.


Selecting a node name

Example: GPU-enabled pod which runs only on the node "belial":

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  nodeSelector:
    kubernetes.io/hostname: belial
  containers:
  - name: gpu-container
    image: nvcr.io/nvidia/tensorflow:20.09-tf2-py3
    command: ["sleep", "1d"]
    resources:
      requests:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
      limits:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
   # more specs (volumes etc.)


Requiring a certain label on the node

Example: GPU-enabled pod which requires a compute capability of at least sm75:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  nodeSelector:
    compute-capability-atleast-sm75: "true"  # label values must be quoted strings, not YAML booleans
    # note: if a node has e.g. the label "compute-capability-sm80", it also has the
    # corresponding "atleast"-label for all lower or equal compute capabilities. Same holds for "gpumem".
  containers:
  - name: gpu-container
    image: nvcr.io/nvidia/tensorflow:20.09-tf2-py3
    command: ["sleep", "1d"]
    resources:
      requests:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
      limits:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
   # more specs (volumes etc.)
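To sanity-check which nodes would match such a selector, you can query the node labels directly (the label and node names below are the ones used in the example and in the table further down):

```shell
# List nodes carrying the "at least sm75" label used in the example above.
kubectl get nodes -l compute-capability-atleast-sm75=true

# Show all labels of a specific node, e.g. belial.
kubectl get node belial --show-labels
```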

Acquiring GPUs with more than 20 GB

By default, Kubernetes schedules GPU pods only on the smallest class of GPU, with 20 GB of memory. This is achieved by assigning a "node taint" to nodes with higher-grade GPUs, which makes those nodes available only to pods that declare a matching "toleration" for the taint.
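For illustration, such a taint is applied on the node side roughly like this (an administrative operation, shown only to clarify the mechanism; regular users cannot taint nodes):

```shell
# Taint a node so that only pods tolerating "gpumem=32" can be scheduled on it.
kubectl taint nodes vecna gpumem=32:NoSchedule

# Inspect the taints of a node.
kubectl describe node vecna | grep -A2 Taints
```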

So if your task, for example, requires a GPU with *exactly* 32 GB, you have to

  1. make the pod tolerate the taint "gpumem=32:NoSchedule" (see table below).
  2. make the pod require the node label "gpumem" to be exactly 32.

See the Kubernetes API documentation on taints and tolerations for more details.


Example:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  nodeSelector:
    gpumem: "32"
  tolerations:
  - key: "gpumem"
    # Note: to be able to run on a GPU with any amount of memory, 
    # replace the operator/value pair by just 'operator: "Exists"'.
    operator: "Equal"
    value: "32"
    effect: "NoSchedule"
  containers:
  - name: gpu-container
    image: nvcr.io/nvidia/tensorflow:20.09-tf2-py3
    command: ["sleep", "1d"]
    resources:
      requests:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
      limits:
        cpu: 1
        nvidia.com/gpu: 1
        memory: 10Gi
   # more specs (volumes etc.)


If you need a GPU with *at least* 32 GB but would also be happy with more, you have to

  1. make the pod tolerate the taint "gpumem=32" *and* "gpumem=40".
  2. make the pod require the node label "gpumem" to be larger than 31.

Example:


apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  # The plain nodeSelector is insufficient here; this needs the more
  # expressive "nodeAffinity".
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpumem
            operator: Gt
            # "Gt" takes a single value; "larger than 31" matches 32 and 40.
            values:
            - "31"
  tolerations:
  - key: "gpumem"
    # Tolerations only support the operators "Equal" and "Exists"; there is
    # no "In" or "Gt". "Exists" tolerates the taint for any value, which is
    # safe here because the node affinity above already restricts scheduling
    # to nodes with gpumem > 31. (Alternatively, list two tolerations with
    # operator "Equal" and the values "32" and "40".)
    operator: "Exists"
    effect: "NoSchedule"
  # ... rest of the specs like before
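After submitting either manifest, you can check whether the pod was scheduled onto a suitable node (the pod and file names here are just the ones from the examples):

```shell
# Create the pod and see which node it was placed on.
kubectl apply -f gpu-pod.yaml
kubectl get pod gpu-pod -o wide

# If the pod stays Pending, the scheduler's reasoning shows up in the events.
kubectl describe pod gpu-pod
```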

List of compute nodes

NOTE: Asmodeus and Demogorgon have been ordered but are not installed yet, and their taints are not yet in place.

The following nodes are currently part of the cluster. Note that the master node is CPU-only and not used for computations, since it hosts all CCU infrastructure (among a few other things).

CCU name   | Access      | Platform        | GPUs                                                 | Labels                                                              | Taints
Asmodeus   | all         | Supermicro      | 4 x A100 HGX 320 GB, subdivided into 16 GPUs @ 20 GB | gpumem=20, gpuarch=nvidia-a100, nvidia-compute-capability-sm80=true | (none)
Glasya     | trr161      | Dual Xeon Rack  | 4 x Titan RTX @ 24 GB                                | gpumem=24, gpuarch=nvidia-rtx, nvidia-compute-capability-sm75=true  | gpumem=24:NoSchedule
Belial     | exc-cb      | Supermicro      | 8 x Quadro RTX 6000 @ 24 GB                          | gpumem=24, gpuarch=nvidia-rtx, nvidia-compute-capability-sm75=true  | gpumem=24:NoSchedule
Fierna     | exc-cb      | Supermicro      | 8 x Quadro RTX 6000 @ 24 GB                          | gpumem=24, gpuarch=nvidia-rtx, nvidia-compute-capability-sm75=true  | gpumem=24:NoSchedule
Vecna      | exc-cb, inf | NVIDIA DGX-2    | 16 x V100 @ 32 GB                                    | gpumem=32, gpuarch=nvidia-v100, nvidia-compute-capability-sm70=true | gpumem=32:NoSchedule
Zariel     | trr161      | NVIDIA DGX A100 | 8 x A100 @ 40 GB                                     | gpumem=40, gpuarch=nvidia-a100, nvidia-compute-capability-sm80=true | gpumem=40:NoSchedule
Tiamat     | exc-cb      | Supermicro      | 4 x A100 @ 40 GB                                     | gpumem=40, gpuarch=nvidia-a100, nvidia-compute-capability-sm80=true | gpumem=40:NoSchedule
Demogorgon | exc-cb      | Delta           | 8 x A40 @ 48 GB                                      | gpumem=48, gpuarch=nvidia-a40, nvidia-compute-capability-sm80=true  | gpumem=48:NoSchedule


The CCU name is the internal name used in the Kubernetes cluster, as well as the configured hostname of the node. Nodes are not reachable from the outside world; you have to access the cluster via kubectl through the API server.
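Since access goes exclusively through the API server, the nodes themselves are also inspected with kubectl:

```shell
# Nodes cannot be reached directly; list them through the API server instead.
kubectl get nodes -o wide
```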

The "Access" column lists which Kubernetes user groups can access each node.

Group  | Description
exc-cb | Centre for the Advanced Study of Collective Behaviour
trr161 | SFB Transregio 161 "Quantitative Methods for Visual Computing"
inf    | Department of Computer Science
cvia   | Computer Vision and Image Analysis Group