Tutorials:Persistent volumes on the Kubernetes cluster

1,010 bytes added, 4 years ago

**** THIS IS OUTDATED INFORMATION, PLEASE REFER TO [[CCU:Perstistent storage on the Kubernetes cluster]] instead.

== Prerequisites ==

* Prerequisites from [[Tutorials:Run_the_example_container_on_the_cluster|previous tutorial]].

* Sample code from [[Tutorials:Run_the_example_container_on_the_cluster|previous tutorial]].

== Global dataset storage for large, static datasets ==

The first cluster node exports an NFS filesystem on a large NVMe RAID, which is reasonably fast and can be used as global dataset storage. It can be mounted into a pod as follows:

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: your-username-test-global-storage
spec:
  containers:
    - name: your-username-test-global-storage
      # we use a small ubuntu base to access the NFS volume
      image: ubuntu:18.04
      # make sure that we have some time until the container quits by itself
      command: ['sleep', '6h']
      volumeMounts:
        # Path to mount the NFS volume to
        - mountPath: "/mnt/datasets"
          name: datasets-nfs
          # NFS is exported read-only
          readOnly: true
  volumes:
    # Volume which mounts the NFS server exported to the cluster by ccu-node1
    - name: datasets-nfs
      nfs:
        server: ccu-node1
        path: /raid/datasets
</syntaxhighlight>

Please see the page [[CCU:Global dataset storage|on global storage]] for a list of available datasets and the method to upload your own.

* Local persistent volumes

* Global persistent volumes

Note: the cluster will soon get large, fast global storage. At that point, local persistent volumes will be phased out and will probably no longer be available. Tensorboard monitoring should be done using service exports, as explained below, and should not make use of local PVs.

<syntaxhighlight lang="yaml">
accessModes:
  - ReadWriteOnce
  - ReadOnlyMany
# For me (Felix) it worked only with the additional following line:
volumeMode: Filesystem
</syntaxhighlight>

Since anyone can mount global persistent volumes in the same namespace, they can and should be used to share datasets. The name of a PVC which contains a useful dataset should start with "dataset-" and be descriptive, so that it can easily be found by other users. Also, the root of the PVC should contain a README with information about the dataset (at least the source and what exactly it is).
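As an illustration of the naming convention, a claim for a shared dataset might look roughly like the following. This is only a sketch: the name "dataset-example-images" and the requested size are hypothetical, and no storage class is given, on the assumption that the cluster's default applies.

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # hypothetical name; the "dataset-" prefix makes it easy to find for other users
  name: dataset-example-images
spec:
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
  volumeMode: Filesystem
  resources:
    requests:
      # hypothetical size, adjust to the dataset
      storage: 10Gi
</syntaxhighlight>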

A note on mounting: currently (this will change in the near future), Ceph volumes can be mounted either ReadWrite by a single pod only, or ReadOnly by multiple pods. Thus, the workflow for a static dataset is to create the PVC, then create a pod to write all the data to it, then delete this pod and mount the PVC read-only from then on, so that it can be used in multiple pods.
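The last step of this workflow, mounting an already filled PVC read-only, could be sketched as follows. The claim name "dataset-example-images", the pod name and the mount path are hypothetical placeholders. The image and sleep command are simply borrowed from the NFS example on this page.

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: your-username-read-dataset
spec:
  containers:
    - name: your-username-read-dataset
      image: ubuntu:18.04
      # keep the container alive for a while so the data can be inspected
      command: ['sleep', '6h']
      volumeMounts:
        # hypothetical path inside the container
        - mountPath: "/mnt/dataset"
          name: dataset
          readOnly: true
  volumes:
    - name: dataset
      persistentVolumeClaim:
        # hypothetical claim name, replace with the PVC that holds the dataset
        claimName: dataset-example-images
        # request a read-only mount so that several pods can share the volume
        readOnly: true
</syntaxhighlight>

Setting "readOnly: true" both on the claim reference and on the volume mount keeps the intent explicit in both places.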

== Reading/writing the contents of a persistent volume ==

Bastian.goldluecke
