10.1.1 Creating a custom task queue

Task worker container

Figure 10.1 Frontend web server with a background task queue

Task worker deployment

$ kubectl create -f Chapter09/9.2.2_StatefulSet_Redis_Multi
configmap/redis-config created
service/redis-service created
statefulset.apps/redis created

$ kubectl get pods
NAME      READY   STATUS    RESTARTS   AGE
redis-0   1/1     Running   0          6m17s
redis-1   1/1     Running   0          4m
redis-2   1/1     Running   0          2m41s
kubectl create -f Chapter10/10.1.1_TaskQueue/deploy_worker.yaml
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
pi-worker-f7565d87d-8tzb2   1/1     Running   0          3m8s
pi-worker-f7565d87d-dtrkd   1/1     Running   0          3m8s
redis-0                     1/1     Running   0          9m43s
redis-1                     1/1     Running   0          7m26s
redis-2                     1/1     Running   0          6m7s
$ kubectl logs -f deployment/pi-worker
Found 2 pods, using pod/pi-worker-55477bdf7b-7rmhp
starting
$ kubectl logs --selector pod=pi
starting
starting

Add work

Using exec, run a command to add tasks to the queue.

$ kubectl exec -it deploy/pi-worker -- python3 add_tasks.py
added task: 9500000
added task: 3800000
added task: 1900000
added task: 3600000
added task: 1200000
added task: 8600000
added task: 7800000
added task: 7100000
added task: 1400000
added task: 5600000
queue depth 8
done

Watch one of the workers process the work

$ kubectl logs -f deployment/pi-worker
Found 2 pods, using pod/pi-worker-54dd47b44c-bjccg
starting
got task: 9500000
3.1415927062213693620
got task: 8600000
3.1415927117293246813
got task: 7100000
3.1415927240123234505

10.1.3 Scaling worker Pods

HPA can be used to scale workers for background queues:

Create it:

kubectl create -f Chapter10/10.1.3_HPA
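
The HPA manifest in that folder targets the pi-worker Deployment; a sketch consistent with the output below (20% CPU utilization target, 2-10 replicas), not necessarily the exact file in the repo:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pi-worker-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pi-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20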

Add some more tasks

$ kubectl exec -it deploy/pi-worker -- python3 add_tasks.py

Observe the results

$ kubectl get pods,hpa
NAME                            READY   STATUS      RESTARTS   AGE
pod/addwork-thwbq               0/1     Completed   0          6m11s
pod/pi-worker-f7565d87d-526pd   0/1     Pending     0          3s
pod/pi-worker-f7565d87d-7kpzt   0/1     Pending     0          3s
pod/pi-worker-f7565d87d-8tzb2   1/1     Running     0          10m
pod/pi-worker-f7565d87d-dtrkd   1/1     Running     0          10m
pod/redis-0                     1/1     Running     0          17m
pod/redis-1                     1/1     Running     0          14m
pod/redis-2                     1/1     Running     0          13m

NAME                                                       REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/pi-worker-autoscaler   Deployment/pi-worker    37%/20%         2         10        2          34s
horizontalpodautoscaler.autoscaling/timeserver             Deployment/timeserver   <unknown>/20%   1         10        1          5h6m

10.2.1 Running one-off tasks with Jobs

Figure 10.3 Object composition of a Job

Create a Job to add the work to the queue instead of using exec.

$ kubectl create -f Chapter10/10.2.1_Job/job_addwork.yaml
job.batch/addwork created

After a time it will show as “Completed”

$ kubectl get job,pods
NAME                COMPLETIONS   DURATION   AGE
job.batch/addwork   1/1           2m57s      3m2s

NAME                            READY   STATUS      RESTARTS   AGE
pod/addwork-thwbq               0/1     Completed   0          3m2s
pod/pi-worker-f7565d87d-8tzb2   1/1     Running     0          7m20s
pod/pi-worker-f7565d87d-dtrkd   1/1     Running     0          7m20s
pod/redis-0                     1/1     Running     0          13m
pod/redis-1                     1/1     Running     0          11m
pod/redis-2                     1/1     Running     0          10m

Observe, as before

$ kubectl get pods,hpa
NAME                            READY   STATUS    RESTARTS   AGE
pod/pi-worker-f7565d87d-2hlsj   0/1     Pending   0          29s
pod/pi-worker-f7565d87d-526pd   0/1     Pending   0          59s
pod/pi-worker-f7565d87d-5ssrw   0/1     Pending   0          29s
pod/pi-worker-f7565d87d-65sbv   0/1     Pending   0          14s
pod/pi-worker-f7565d87d-7kpzt   0/1     Pending   0          59s
pod/pi-worker-f7565d87d-86pvh   0/1     Pending   0          14s
pod/pi-worker-f7565d87d-8tzb2   1/1     Running   0          11m
pod/pi-worker-f7565d87d-bdrft   0/1     Pending   0          29s
pod/pi-worker-f7565d87d-dtrkd   1/1     Running   0          11m
pod/pi-worker-f7565d87d-vchzt   0/1     Pending   0          29s
pod/redis-0                     1/1     Running   0          18m
pod/redis-1                     1/1     Running   0          15m
pod/redis-2                     1/1     Running   0          14m

NAME                                                       REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/pi-worker-autoscaler   Deployment/pi-worker    99%/20%         2         10        8          90s
horizontalpodautoscaler.autoscaling/timeserver             Deployment/timeserver   <unknown>/20%   1         10        1          5h7m

You can't run the Job a second time, because the Job object remains even in the completed state. Delete it (which also removes its "Completed" Pod) like so:

kubectl delete -f Chapter10/10.2.1_Job/job_addwork.yaml

10.2.2 Scheduling tasks with CronJobs

Figure 10.4 Object composition of CronJob

Create the CronJob

$ kubectl create -f Chapter10/10.2.2_CronJob/cronjob_addwork.yaml
cronjob.batch/addwork created
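
The CronJob wraps the same add-work Job in a jobTemplate, run on the schedule shown below (every 5 minutes). A sketch of that shape (the image name here is a placeholder, not necessarily the book's exact value):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: addwork
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: addwork-container
            image: docker.io/example/pi-worker:1   # placeholder image
            command: ["python3", "add_tasks.py"]
          restartPolicy: Never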

$ kubectl get cronjob,job
NAME                    SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/addwork   */5 * * * *   False     0        <none>          5s

After 5 minutes you should see that it has spawned a Job.

$ kubectl get cronjob,job,pods
NAME                    SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/addwork   */5 * * * *   False     0        46s             104s

NAME                         COMPLETIONS   DURATION   AGE
job.batch/addwork-28330610   1/1           5s         46s

NAME                             READY   STATUS        RESTARTS   AGE
pod/addwork-28330610-fkxmq       0/1     Completed     0          46s
pod/pi-worker-674b89665f-299qc   1/1     Running       0          6s
pod/pi-worker-674b89665f-5vg7r   1/1     Running       0          14s
pod/pi-worker-674b89665f-7sfxd   1/1     Running       0          54s
pod/pi-worker-674b89665f-952bj   1/1     Running       0          14s
pod/pi-worker-674b89665f-dtxsc   1/1     Running       0          29s
pod/pi-worker-674b89665f-htlbc   1/1     Terminating   0          14s
pod/pi-worker-674b89665f-jdwtp   1/1     Running       0          6s
pod/pi-worker-674b89665f-mbvnv   1/1     Running       0          29s
pod/pi-worker-674b89665f-q4nbd   1/1     Running       0          6s
pod/pi-worker-674b89665f-rgrkm   1/1     Running       0          14s
pod/pi-worker-674b89665f-rvsxp   1/1     Running       0          54s
pod/pi-worker-674b89665f-x469p   1/1     Terminating   0          11s
pod/pi-worker-674b89665f-xgph5   1/1     Terminating   0          11s
pod/redis-0                      1/1     Running       0          24m
pod/redis-1                      1/1     Running       0          22m
pod/redis-2                      1/1     Running       0          20m

10.3.1 Dynamic queue processing with Jobs

$ cd Chapter10
$ kubectl delete -f 10.1.2_TaskQueue2
deployment.apps "pi-worker" deleted

$ kubectl delete -f 10.2.1_Job
job.batch "addwork" deleted

$ kubectl delete -f 10.2.2_CronJob
cronjob.batch "addwork" deleted

Since our Redis-based queue may have some existing jobs, you can reset it as well using the LTRIM Redis command:

$ kubectl exec -it pod/redis-0 -- redis-cli ltrim queue:task 0 0
OK

You can also run the redis-cli interactively if you prefer to reset the queue:

$ kubectl exec -it pod/redis-0 -- redis-cli
127.0.0.1:6379> LTRIM queue:task 0 0
OK
$ cd Chapter10
$ kubectl create -f 10.2.1_Job
job.batch/addwork created
$ kubectl get job,pod
NAME                COMPLETIONS   DURATION   AGE
job.batch/addwork   1/1           3s         10s

NAME                             READY   STATUS        RESTARTS   AGE
pod/addwork-6svdp                0/1     Completed     0          10s
pod/redis-0                      1/1     Running       0          26m
pod/redis-1                      1/1     Running       0          23m
pod/redis-2                      1/1     Running       0          22m
$ kubectl create -f 10.3.1_JobWorker
job.batch/jobworker created
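
The COMPLETIONS column below ("0/1 of 2") indicates a work-queue style Job: parallelism is set, but completions is left unset, so the Job finishes once the workers exit successfully after draining the queue. A sketch of that shape (the image and script names are placeholders):

apiVersion: batch/v1
kind: Job
metadata:
  name: jobworker
spec:
  parallelism: 2       # two workers; completions is deliberately left unset
  template:
    spec:
      containers:
      - name: worker-container
        image: docker.io/example/pi-worker:1     # placeholder image
        command: ["python3", "pi_worker.py"]     # hypothetical worker script
      restartPolicy: OnFailure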

While the workers are running

$ kubectl get job,pod
NAME                  COMPLETIONS   DURATION   AGE
job.batch/addwork     1/1           5s         29s
job.batch/jobworker   0/1 of 2      16s        16s

NAME                  READY   STATUS      RESTARTS   AGE
pod/addwork-79q4k     0/1     Completed   0          29s
pod/jobworker-drwfm   1/1     Running     0          16s
pod/jobworker-z952g   1/1     Running     0          16s
pod/redis-0           1/1     Running     0          28m
pod/redis-1           1/1     Running     0          25m
pod/redis-2           1/1     Running     0          24m

You can query how much work is in the queue

$ kubectl exec -it pod/redis-0 -- redis-cli llen queue:task
(integer) 5

When the queue is zero, the workers should hit the Completed state after processing their last job.

$ kubectl get job,pod
NAME                  COMPLETIONS   DURATION   AGE
job.batch/addwork     1/1           5s         3m51s
job.batch/jobworker   2/1 of 2      2m27s      3m38s

NAME                  READY   STATUS      RESTARTS   AGE
pod/addwork-79q4k     0/1     Completed   0          3m51s
pod/jobworker-drwfm   0/1     Completed   0          3m38s
pod/jobworker-z952g   0/1     Completed   0          3m38s
pod/redis-0           1/1     Running     0          31m
pod/redis-1           1/1     Running     0          29m
pod/redis-2           1/1     Running     0          27m
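
The exit-on-empty behavior that lets these worker Pods reach Completed can be implemented with a loop like this sketch (using the redis-py client; the host name is an assumption and this is not the book's exact code):

import redis

r = redis.Redis(host="redis-0.redis-service")  # assumed address of the Redis primary

while True:
    task = r.lpop("queue:task")   # pop the next task, or None when the queue is empty
    if task is None:
        print("queue empty, exiting")
        break                     # exiting cleanly lets the Job mark this Pod Completed
    print(f"got task: {task.decode()}")
    # ... process the task (e.g., calculate pi to the requested precision) ...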

Run again

The old Job objects remain, even when completed. To run it again, delete the add work and worker Jobs.

$ kubectl delete -f 10.2.1_Job
job.batch "addwork" deleted
$ kubectl delete -f 10.3.1_JobWorker
job.batch "jobworker" deleted

Run the add work Job again

$ kubectl create -f 10.2.1_Job
job.batch/addwork created

Wait for the addwork Job to complete, then

$ kubectl create -f 10.3.1_JobWorker
job.batch/jobworker created

11.1 Production and staging environments using namespaces

kubectl create namespace staging
kubectl config set-context --current --namespace=staging

Now we can create the Deployment, even if it already exists in the cluster in the default namespace

kubectl create -f Chapter03/3.2_DeployingToKubernetes

Compare the view of Pods from the current namespace (staging)

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-f94cc5dd9-6b7lv   0/1     Pending   0          49s
timeserver-f94cc5dd9-jg47t   0/1     Pending   0          49s
timeserver-f94cc5dd9-kskkw   0/1     Pending   0          49s

To the default namespace

$ kubectl get pods -n default
NAME      READY   STATUS    RESTARTS   AGE
redis-0   1/1     Running   0          38m
redis-1   1/1     Running   0          36m
redis-2   1/1     Running   0          35m

12.1.3 Handling disruptions

Figure 12.1 Node deletion without Pod disruption budgets. All the Pods on the node will become unavailable at once.
kubectl create -f Chapter12/12.1_PDB/pdb.yaml
Figure 12.2 With a PDB, Kubernetes will wait for the required number of Pods in a Deployment to be available before deleting others, reducing the disruption.
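
A PDB for the timeserver Deployment with the maxUnavailable: 1 setting mentioned below looks roughly like this (the object name is an assumption; the selector matches the timeserver Pod label used throughout):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: timeserver-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      pod: timeserver-pod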

To try, cordon and drain every node to simulate an upgrade

for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
  kubectl cordon $node
  kubectl drain $node --ignore-daemonsets --delete-emptydir-data
done

Then, watch the Pods

$ watch -d kubectl get pods
NAME                         READY   STATUS              RESTARTS   AGE
timeserver-f94cc5dd9-5wgnm   0/1     ContainerCreating   0          25s
timeserver-f94cc5dd9-gjc7n   1/1     Running             0          116s
timeserver-f94cc5dd9-jg47t   1/1     Terminating         0          8m45s
timeserver-f94cc5dd9-kskkw   1/1     Running             0          8m45s

Since we specified maxUnavailable: 1 there should always be 2 out of the 3 replicas in the Running state. Note that this isn’t a hard guarantee and events can disrupt this count temporarily.

12.2 Deploying node agents with DaemonSet

Inspect the existing DaemonSets in the kube-system namespace:

$ kubectl get daemonset -n kube-system
NAME                                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                             AGE
anetd                                    7         7         6       7            6           kubernetes.io/os=linux                                                    18h
anetd-win                                0         0         0       0            0           kubernetes.io/os=windows                                                  18h
filestore-node                           7         7         6       7            6           kubernetes.io/os=linux                                                    18h
fluentbit-gke-big                        7         7         6       7            6           kubernetes.io/os=linux                                                    18h
fluentbit-gke-small                      0         0         0       0            0           kubernetes.io/os=linux                                                    18h
gcsfusecsi-node                          7         7         5       7            5           kubernetes.io/os=linux                                                    18h
gke-metadata-server                      7         7         6       7            1           beta.kubernetes.io/os=linux,iam.gke.io/gke-metadata-server-enabled=true   18h
gke-metrics-agent                        7         7         6       7            6           <none>                                                                    18h
gke-metrics-agent-scaling-10             0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-scaling-100            0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-scaling-20             0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-scaling-200            0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-scaling-50             0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-scaling-500            0         0         0       0            0           <none>                                                                    18h
gke-metrics-agent-windows                0         0         0       0            0           kubernetes.io/os=windows                                                  18h
image-package-extractor                  7         7         5       7            5           kubernetes.io/os=linux                                                    18h
ip-masq-agent                            7         7         6       7            6           kubernetes.io/os=linux,node.kubernetes.io/masq-agent-ds-ready=true        18h
kube-proxy                               0         0         0       0            0           kubernetes.io/os=linux,node.kubernetes.io/kube-proxy-ds-ready=true        18h
metadata-proxy-v0.1                      0         0         0       0            0           cloud.google.com/metadata-proxy-ready=true,kubernetes.io/os=linux         18h
nccl-fastsocket-installer                0         0         0       0            0           <none>                                                                    18h
netd                                     7         7         6       7            6           cloud.google.com/gke-netd-ready=true,kubernetes.io/os=linux               18h
node-local-dns                           7         7         6       7            6           addon.gke.io/node-local-dns-ds-ready=true                                 18h
nvidia-gpu-device-plugin-large-cos       0         0         0       0            0           <none>                                                                    18h
nvidia-gpu-device-plugin-large-ubuntu    0         0         0       0            0           <none>                                                                    18h
nvidia-gpu-device-plugin-medium-cos      0         0         0       0            0           <none>                                                                    18h
nvidia-gpu-device-plugin-medium-ubuntu   0         0         0       0            0           <none>                                                                    18h
nvidia-gpu-device-plugin-small-cos       0         0         0       0            0           <none>                                                                    18h
nvidia-gpu-device-plugin-small-ubuntu    0         0         0       0            0           <none>                                                                    18h
pdcsi-node                               7         7         6       7            6           kubernetes.io/os=linux                                                    18h
pdcsi-node-windows                       0         0         0       0            0           kubernetes.io/os=windows                                                  18h
runsc-metric-server                      0         0         0       0            0           kubernetes.io/os=linux,sandbox.gke.io/runtime=gvisor                      18h
tpu-device-plugin                        0         0         0       0            0           <none>                                                                    18h

Make our own

$ kubectl create -f Chapter12/12.2_DaemonSet/logreader.yaml
daemonset.apps/logreader created
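
A log-reading DaemonSet like this one generally mounts the node's log directory with a hostPath volume and tails it; a rough sketch (the image and command are placeholders; the actual logreader.yaml may differ):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logreader
spec:
  selector:
    matchLabels:
      name: logreader
  template:
    metadata:
      labels:
        name: logreader
    spec:
      containers:
      - name: logreader-container
        image: ubuntu                      # placeholder image
        command: ["bash", "-c", "tail -F /var/log/containers/*.log"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log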

Once the Pods are ready

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
logreader-7c6rc              1/1     Running   0          30s
logreader-8r6j4              1/1     Running   0          30s
logreader-l8448              1/1     Running   0          30s
timeserver-f94cc5dd9-65bl9   1/1     Running   0          3m59s
timeserver-f94cc5dd9-ljx4w   1/1     Running   0          2m14s
timeserver-f94cc5dd9-wqh6s   1/1     Running   0          93s

We can stream the logs from one of them (substitute one of your own Pod names):

$ kubectl logs -f logreader-2nbt4 --tail 10
==> /var/log/containers/filestore-node_kube-system_gcp-filestore-1b5.log <==
lock is held by gk3-autopilot-cluster-2sc2_e4337a2e and has not yet expired

12.3 Pod security context

Some Pods can request additional privileges.

This won't work in an Autopilot cluster; rejecting privileged containers is one of Autopilot's additional security protections:

kubectl create -f Chapter12/12.3_PodSecurityContext/admin-ds.yaml
Error from server (GKE Warden constraints violations): error when creating "Chapter12/12.3_PodSecurityContext/admin-ds.yaml": admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-disallow-privilege]":["container admin-container is privileged; not allowed in Autopilot"]}
Requested by user: '[email protected]', groups: 'system:authenticated'.
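
The setting that triggers this rejection is the container's security context; roughly (a sketch, the actual admin-ds.yaml may differ):

      # Pod template excerpt
      containers:
      - name: admin-container
        image: ubuntu              # placeholder image
        securityContext:
          privileged: true         # broad access to the host; rejected by Autopilot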

If you run it in a GKE node-based cluster

$ kubectl create -f Chapter12/12.3_PodSecurityContext/admin-ds.yaml
daemonset.apps/admin-workload created

And exec into one of the Pods

$ kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
admin-workload-65mpz   1/1     Running   0          10s
admin-workload-l5twl   1/1     Running   0          9s
admin-workload-sh8gd   1/1     Running   0          9s
~/kubernetes-for-developers$ kubectl exec -it admin-workload-65mpz -- bash

You can perform privileged operations, like mounting the host's disk:

root@admin-workload-65mpz:/# df
Filesystem     1K-blocks    Used Available Use% Mounted on
overlay         98831908 3954368  94861156   5% /
tmpfs              65536       0     65536   0% /dev
/dev/sda1       98831908 3954368  94861156   5% /etc/hosts
shm                65536       0     65536   0% /dev/shm
tmpfs            2877096      12   2877084   1% /run/secrets/kubernetes.io/serviceaccount
root@admin-workload-65mpz:/# mkdir /tmp/host
root@admin-workload-65mpz:/# mount /dev/sda1 /tmp/host
root@admin-workload-65mpz:/# cd /tmp/host
root@admin-workload-65mpz:/tmp/host# ls
dev_image  etc  home  lost+found  var  var_overlay  vmlinuz_hd.vblock
root@admin-workload-65mpz:/tmp/host# 

By contrast, the following Pod has no such privileges.

If you attempt to mount the host disk here, you will get an error like “special device /dev/sda1 does not exist.”

$ kubectl create -f Chapter12/12.3_PodSecurityContext/pod.yaml
pod/ubuntu created

$ kubectl exec -it ubuntu -- bash
I have no name!@ubuntu:/$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
overlay         98831908 4096440  94719084   5% /
tmpfs              65536       0     65536   0% /dev
/dev/sda1       98831908 4096440  94719084   5% /etc/hosts
shm                65536       0     65536   0% /dev/shm
tmpfs            2877096      12   2877084   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs            2011476       0   2011476   0% /proc/acpi
tmpfs            2011476       0   2011476   0% /proc/scsi
tmpfs            2011476       0   2011476   0% /sys/firmware
I have no name!@ubuntu:/$ mkdir /tmp/host
I have no name!@ubuntu:/$ mount /dev/sda1 /tmp/host
mount: /tmp/host: special device /dev/sda1 does not exist.
I have no name!@ubuntu:/$ 

12.4 Non-root containers

Non-root container error

Create this container:

kubectl create -f Chapter12/12.4_NonRootContainers/1_permission_error/deploy.yaml

You’ll see CreateContainerConfigError

$ kubectl get pods
NAME                          READY   STATUS                       RESTARTS   AGE
timeserver-7f74d78bd7-dsrkv   0/1     CreateContainerConfigError   0          14s

You can investigate further with describe:

$ kubectl describe pod timeserver-7f74d78bd7-dsrkv
Name:             timeserver-7f74d78bd7-dsrkv
Events:
  Type     Reason     Age               From                                   Message
  ----     ------     ----              ----                                   -------
  Warning  Failed     9s (x4 over 34s)  kubelet                                Error: container has runAsNonRoot and image will run as root (pod: "timeserver-7f74d78bd7-dsrkv_default(861b62db-3ab7-43ff-9560-75c5cad3be27)", container: timeserver-container)

Note the key event: the container has runAsNonRoot set, but the image is configured to run as root.

Specifying a user with runAsUser

To fix this, we can update the Deployment to specify a non-root user.
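
The key change in deploy-runas.yaml is a securityContext that selects a non-root UID; roughly (a sketch; the actual file may also set other fields):

      # Pod template excerpt
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001        # any non-zero UID satisfies runAsNonRoot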

Replace the previous Deployment with:

kubectl replace -f Chapter12/12.4_NonRootContainers/1_permission_error/deploy-runas.yaml

Now the Pod schedules, but crashes

$ kubectl get pods
NAME                         READY   STATUS   RESTARTS      AGE
timeserver-5d5449846-r7kpj   0/1     Error    2 (23s ago)   26s

View the logs

$ kubectl logs timeserver-5d5449846-r7kpj
Traceback (most recent call last):
  File "/app/server.py", line 52, in <module>
    startServer()
  File "/app/server.py", line 45, in startServer
    server = ThreadingHTTPServer(('', 80), RequestHandler)
  File "/usr/local/lib/python3.10/socketserver.py", line 452, in __init__
    self.server_bind()
  File "/usr/local/lib/python3.10/http/server.py", line 137, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/usr/local/lib/python3.10/socketserver.py", line 466, in server_bind
    self.socket.bind(self.server_address)
PermissionError: [Errno 13] Permission denied

Fix

The fix is to have the server listen on an unprivileged port (above 1024) and to rewire the Service's targetPort to match.

But this isn't enough; you'll still see an error:

$ kubectl logs timeserver-demo-5fd5f6c7f9-cxzrb
10.22.0.129 - - [24/Mar/2022 02:10:43] "GET / HTTP/1.1" 200 -
Exception occurred during processing of request from ('10.22.0.129', 41702)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/local/lib/python3.10/http/server.py", line 425, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.10/http/server.py", line 413, in handle_one_request
    method()
  File "/app/server.py", line 11, in do_GET
    with open("logs/log.txt", "a") as myfile:
PermissionError: [Errno 13] Permission denied: 'logs/log.txt'

We also need to adjust the permissions of the logs folder. This fix is contained in version 7 of the image.

Deploy the fixed Deployment and Service:

kubectl replace -f Chapter12/12.4_NonRootContainers/2_fixed/deploy.yaml
kubectl replace -f Chapter12/12.4_NonRootContainers/2_fixed/service.yaml
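
Roughly, the fixed manifests combine three changes: keep running as a non-root user, make the logs directory writable for that user (for example via fsGroup on a mounted volume), and move the server to an unprivileged port with the Service's targetPort updated to match. A sketch (the port number, UID, and image tag are assumptions, not necessarily the book's exact values):

      # Deployment Pod template excerpt
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        fsGroup: 1001            # makes mounted volumes group-writable for this user
      containers:
      - name: timeserver-container
        image: docker.io/example/timeserver:7   # placeholder for the "version 7" image
        ports:
        - containerPort: 8080    # unprivileged port instead of 80

      # Service excerpt
      ports:
      - port: 80
        targetPort: 8080         # rewired to the new container port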

Now it’s working

$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
timeserver-849d7b67d7-cgfz2   1/1     Running   0          19s

12.5 Admission controllers

kubectl create -f Chapter12/12.5_PodSecurityAdmission/namespace.yaml
kubectl config set-context --current --namespace=team1
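
The namespace manifest turns on Pod Security Admission by labeling the namespace; a sketch consistent with the "restricted:latest" enforcement seen in the error below:

apiVersion: v1
kind: Namespace
metadata:
  name: team1
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest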

Try to create a Pod that runs as root. It will be rejected by the Pod Security Admission

$ kubectl create -f Chapter03/3.2.4_ThePodSpec/pod.yaml
Error from server (Forbidden): error when creating "Chapter03/3.2.4_ThePodSpec/pod.yaml": pods "timeserver" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "timeserver-container" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "timeserver-container" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "timeserver-container" must set securityContext.runAsNonRoot=true)

A non-root Pod however can run

$ kubectl create -f Chapter12/12.5_PodSecurityAdmission/nonroot_pod.yaml
pod/timeserver-pod created

Cleanup

$ kubectl delete ns team1
namespace "team1" deleted

12.6 Role-based access control

$ cd Chapter12/12.6_RBAC/
$ kubectl create ns team1
namespace/team1 created
$ kubectl create -f role.yaml
role.rbac.authorization.k8s.io/developer-access created

Edit the role binding to specify an account you own, then create it:
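
A RoleBinding of roughly this shape grants the developer-access Role to a single user (the email is the one from this transcript; substitute your own account):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developerA
  namespace: team1
subjects:
- kind: User
  name: [email protected]     # replace with your own account
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-access
  apiGroup: rbac.authorization.k8s.io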

$ kubectl create -f rolebinding.yaml
rolebinding.rbac.authorization.k8s.io/developerA created
$ kubectl config set-context --current --namespace=default
$ kubectl create -f Chapter03/3.2_DeployingToKubernetes/deploy.yaml
Error from server (Forbidden): error when creating
"Chapter03/3.2_DeployingToKubernetes/deploy.yaml": deployments.apps is
forbidden: User "[email protected]" cannot create resource "deployments" in
API group "apps" in the namespace "default": requires one of
["container.deployments.create"] permission(s).
$ kubectl config set-context --current --namespace=team1
Context "gke_project-name_us-west1_cluster-name" modified.
$ kubectl create -f Chapter03/3.2_DeployingToKubernetes/deploy.yaml
deployment.apps/timeserver created

$ kubectl label --overwrite ns team1 pod-security.kubernetes.io/enforce=privileged
Error from server (Forbidden): namespaces "team1" is forbidden: User
"[email protected]" cannot patch resource "namespaces" in API group "" in
the namespace "team1": requires one of ["container.namespaces.update"]
permission(s).

Cluster Roles

2.1.2 Running commands in Docker

Create a Bash shell in docker

$ docker run -it ubuntu bash
root@18e78382b32e:/#

Install python

apt-get update
apt-get install -y python3

Run python interactively

# python3
>>> print("Hello Docker")
Hello Docker
>>> exit()
#

Create a Python script and run it

# echo 'print("Hello Docker")' > hello.py
# python3 hello.py
Hello Docker

When you’re done:

root@5aa832cc450b:/# exit
exit

List Docker containers (including stopped ones):

$ docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED         STATUS                        PORTS     NAMES
5aa832cc450b   ubuntu    "bash"    4 minutes ago   Exited (127) 15 seconds ago             blissful_clarke

Start an existing container and attach to it:

$ CONTAINER_ID=c5e023cab033
$ docker start $CONTAINER_ID
$ docker attach $CONTAINER_ID
# echo "run more commands"
# exit

Clean up stopped containers and unused images:

$ docker system prune -a
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all images without at least one container associated to them
  - all build cache

Are you sure you want to continue? [y/N] y
Deleted Containers:
5aa832cc450b21238fd9e136e42a313ed6560e9aa3b09d7e6bf7413a4b04af3b

Deleted Images:
untagged: ubuntu:latest
untagged: ubuntu@sha256:2b7412e6465c3c7fc5bb21d3e6f1917c167358449fecac8176c6e496e5c1f05f
deleted: sha256:e4c58958181a5925816faa528ce959e487632f4cfd192f8132f71b32df2744b4
deleted: sha256:256d88da41857db513b95b50ba9a9b28491b58c954e25477d5dad8abb465430b

Total reclaimed space: 153.6MB

2.1.3 Building our own images

Get the book’s examples

git clone https://github.com/WilliamDenniss/kubernetes-for-developers.git
cd kubernetes-for-developers

Build this simple container

cd Chapter02/2.1.3_Dockerfile/
docker build . -t hello

Run it

$ docker run hello python3 hello.py
Hello Docker

2.1.4 Using base images

Dockerfile with a base image

Run it:

$ cd Chapter02/2.1.4_BaseImage
$ docker build . -t hello2
$ docker run hello2 python3 hello.py
Hello Docker

2.1.5 Adding a default command

A Dockerfile with a default command

Run it:

$ cd Chapter02/2.1.5_DefaultCommand
$ docker build . -t hello3
$ docker run hello3
Hello Docker

2.1.7 Compiling code in Docker

Compiling code in a Dockerfile

Build and run

$ cd Chapter02/2.1.7_CompiledCode
$ docker build . -t compiled_code
$ docker run compiled_code
Hello Docker

2.1.8 Compiling code with a multistage build

Using a multistage build to compile code.

It's not ideal to compile code directly in the final container. For example, run ls on the prior container to see the source and build files left in the image:

$ docker run compiled_code ls
Dockerfile
Hello.class
Hello.java

Instead, we can use a 2-stage build

Figure 2.2 A multistage container build, where an intermediate container is used to build the binary

The multistage Dockerfile is in the repo at Chapter02/2.1.8_MultiStage.
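
As a sketch (assuming the Java toolchain implied by Hello.java and Hello.class; the base images and working directory here are examples, and the repo's file is authoritative), a multistage build looks roughly like:

# Stage 1: compile the class file with a JDK image
FROM openjdk:11 AS buildstage
WORKDIR /app
COPY Hello.java .
RUN javac Hello.java

# Stage 2: copy only the compiled artifact into the final image
FROM openjdk:11-jre
WORKDIR /app
COPY --from=buildstage /app/Hello.class .
CMD ["java", "Hello"]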

Build & run

$ cd Chapter02/2.1.8_MultiStage
$ docker build . -t compiled_code2
$ docker run compiled_code2
Hello Docker

Compare with ls:

$ docker run compiled_code2 ls
Hello.class

2.2.1 Containerizing an application server

Containerized server

Build

cd Chapter02/timeserver
docker build . -t timeserver

Run

$ docker run -it -p 8080:80 timeserver
Listening on 0.0.0.0:80

Test

From a new command shell:

$ curl http://localhost:8080
The time is 1:30 PM, UTC.

To test in the browser, you need to connect to the container on 8080. If you’re using Cloud Shell, use the web preview feature and select “Preview on port 8080” as pictured. You can change the port number if you need.

Cloud Shell Web Preview

One-liner

The build and run steps can be combined into a convenient one-liner:

docker build . -t timeserver && docker run -it -p 8080:80 timeserver

2.2.2 Debugging

Some commands to debug Docker

List running containers:

$ docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS          PORTS                  NAMES
e3ce2ae9f6fe   timeserver   "/bin/sh -c 'python3…"   34 seconds ago   Up 33 seconds   0.0.0.0:8080->80/tcp   hopeful_turing
$ CONTAINER_ID=e3ce2ae9f6fe
$ docker exec -it $CONTAINER_ID sh
# ls
Dockerfile server.py
# exit
$

Copy a file to the container

docker cp server.py $CONTAINER_ID:/app

Copy a file from the container

docker cp $CONTAINER_ID:/app/server.py .

3.2.1 Creating a cluster

Create Cluster

Create an Autopilot cluster in the UI

Figure 3.8 GKE Autopilot’s cluster creation UI

or via CLI:

CLUSTER_NAME=my-cluster
REGION=us-west1
gcloud container clusters create-auto $CLUSTER_NAME --region $REGION

Authenticate to the cluster

Copy the cluster connection command from the UI

Figure 3.9 GKE’s cluster connection UI

Or CLI:

gcloud container clusters get-credentials $CLUSTER_NAME --region $REGION

Test your connection:

$ kubectl get pods
No resources found in default namespace.

Other Environments

If you’re not using Cloud Shell, you’ll need to configure gcloud.

Download gcloud: https://cloud.google.com/sdk/install

gcloud init
gcloud components install kubectl

3.2.2 Uploading your container

Now it’s time to upload your container to a container registry.

Create a new repo

Go to https://console.cloud.google.com/artifacts and create a new repository of type Docker in your desired location.

Create Artifact Registry

Use the UI to copy the path that is generated, which will look something like us-west1-docker.pkg.dev/gke-autopilot-test/timeserver.

Create Artifact Registry

Authenticate

Cloud Shell is configured automatically. On other environments, take the host from that path, and use it here to authenticate docker:

HOST_NAME=us-west1-docker.pkg.dev
gcloud auth configure-docker $HOST_NAME

Tag

Append the image name and version timeserver:1 to the repository path. For example, for the path us-west1-docker.pkg.dev/gke-autopilot-test/timeserver you would use us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver:1.

You can tag an existing image like so (where timeserver was the previous tag):

IMAGE_TAG=us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver:1
docker tag timeserver $IMAGE_TAG

List your local images to find those tags

$ docker images
REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
timeserver   latest    7f15ce6d0d5b   4 seconds ago   1.02GB

No existing image? No problem. Build and tag at the same time:

IMAGE_TAG=us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver:1
cd Chapter02/timeserver
docker build . -t $IMAGE_TAG

Push

Once authenticated and tagged correctly, you can push the image

$ docker push $IMAGE_TAG
The push refers to repository [us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver]
5f70bf18a086: Pushed 
df69a0f30478: Pushed 
701d0b971f5f: Pushed 
619584b251c8: Pushed 
ac630c4fd960: Pushed 
86e50e0709ee: Pushed 
12b956927ba2: Pushed 
266def75d28e: Pushed 
29e49b59edda: Pushed 
1777ac7d307b: Pushed 
1: digest: sha256:d679fbd18395da53b5bad05457f992b334fc230398f94e3aa906378d996646bd size: 2420

The image should now be in Artifact Registry

Create Artifact Registry

3.2.3 Deploying to Kubernetes

Let's deploy the container to Kubernetes!

Deploy

cd Chapter03/3.2_DeployingToKubernetes/
kubectl create -f deploy.yaml
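
deploy.yaml defines a Deployment with three replicas of the timeserver container; a sketch of its shape (the image path is whatever you pushed in section 3.2.2):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: timeserver
spec:
  replicas: 3
  selector:
    matchLabels:
      pod: timeserver-pod
  template:
    metadata:
      labels:
        pod: timeserver-pod
    spec:
      containers:
      - name: timeserver-container
        image: us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver:1   # your $IMAGE_TAG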

Query the status

$ kubectl get deploy
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
timeserver   0/3     3            0           7s

It may take 1-2 minutes to become ready.

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-f94cc5dd9-jntt4   0/1     Pending   0          39s
timeserver-f94cc5dd9-swl4h   0/1     Pending   0          39s
timeserver-f94cc5dd9-xwbnx   0/1     Pending   0          39s

Now it’s ready

$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-f94cc5dd9-jntt4   1/1     Running   0          3m53s
timeserver-f94cc5dd9-swl4h   1/1     Running   0          3m53s
timeserver-f94cc5dd9-xwbnx   1/1     Running   0          3m53s

You can also view just the Pods in this deployment by referencing the label

$ kubectl get pods --selector=pod=timeserver-pod
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-f94cc5dd9-jntt4   1/1     Running   0          4m7s
timeserver-f94cc5dd9-swl4h   1/1     Running   0          4m7s
timeserver-f94cc5dd9-xwbnx   1/1     Running   0          4m7s

To hit the service, forward the port

$ kubectl port-forward deploy/timeserver 8080:80
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80

View it with curl or Cloud Shell's web preview, as in section 2.2.1.

To view and follow the logs for a container in the deployment:

$ kubectl logs -f deploy/timeserver
Found 3 pods, using pod/timeserver-8bbb895dc-kgl8l
Listening on 0.0.0.0:80
127.0.0.1 - - [09:59:08] "GET / HTTP/1.1" 200 -

You can also follow the logs of a specific container.

Or to print all logs for the pods:

$ kubectl logs --selector=pod=timeserver-pod
Listening on 0.0.0.0:80
127.0.0.1 - - [12/Nov/2023 18:47:04] "GET /?authuser=0&redirectedPreviously=true HTTP/1.1" 200 -
127.0.0.1 - - [12/Nov/2023 18:47:05] "GET /favicon.ico HTTP/1.1" 200 -
127.0.0.1 - - [12/Nov/2023 18:50:04] "GET /?authuser=0&redirectedPreviously=true HTTP/1.1" 200 -
127.0.0.1 - - [12/Nov/2023 18:50:05] "GET /?authuser=0&redirectedPreviously=true HTTP/1.1" 200 -
127.0.0.1 - - [12/Nov/2023 18:50:07] "GET /?authuser=0&redirectedPreviously=true HTTP/1.1" 200 -
127.0.0.1 - - [12/Nov/2023 18:50:08] "GET /favicon.ico HTTP/1.1" 200 -
Listening on 0.0.0.0:80
Listening on 0.0.0.0:80

3.2.4 The PodSpec

About Kubernetes object composition.

The Deployment actually embeds a Pod template (the PodSpec):

Figure 3.10 Pod object embedded in the Deployment object

Note the label selector

Figure 3.11 Relationship of the Deployment’s selector and the Pod template’s labels

3.2.5 Publishing your Service

Exposing the Deployment to the internet.

Next, expose the Deployment with a Service.

Figure 3.12 Relationship between the Service and the Pods it targets (selects)

Create the service

cd Chapter03/3.2_DeployingToKubernetes
kubectl create -f service.yaml
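
service.yaml defines a LoadBalancer Service that selects the timeserver Pods; a sketch consistent with the port mapping shown below:

apiVersion: v1
kind: Service
metadata:
  name: timeserver
spec:
  selector:
    pod: timeserver-pod
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
  type: LoadBalancer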

View the status. The external IP will initially show as <pending>:

$ kubectl get service
NAME         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      34.118.224.1     <none>        443/TCP        11h
timeserver   LoadBalancer   34.118.230.169   <pending>     80:31212/TCP   3s

You can watch for changes with -w, like:

$ kubectl get service -w
NAME         TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP      34.118.224.1     <none>        443/TCP        11h
timeserver   LoadBalancer   34.118.230.169   <pending>     80:31212/TCP   36s
timeserver   LoadBalancer   34.118.230.169   203.0.113.16  80:31212/TCP   44s

When you see the IP, you can test it:

$ curl http://203.0.113.16
The time is 7:01 PM, UTC.

3.2.6 Interacting with the Deployment

You can exec into the container to run commands:

$ kubectl exec -it deploy/timeserver -- sh
# echo "Testing exec"
Testing exec

Enter exit to close.

You can run the command directly too without bringing up a shell:

$ kubectl exec -it deploy/timeserver -- echo "Testing exec"
Testing exec

It can be useful to copy files out of the Pod:

kubectl cp $POD_NAME:example.txt example.txt

Or to the Pod:

kubectl cp example.txt $POD_NAME:.

For example:

$ echo "blar" > example.txt
$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-f94cc5dd9-jntt4   1/1     Running   0          12m
timeserver-f94cc5dd9-swl4h   1/1     Running   0          12m
timeserver-f94cc5dd9-xwbnx   1/1     Running   0          12m

$ POD_NAME=timeserver-f94cc5dd9-jntt4
$ kubectl cp example.txt $POD_NAME:.
$ rm example.txt 
$ kubectl cp $POD_NAME:example.txt example.txt
$ cat example.txt 
blar
$

3.2.7 Updating your application

Update the deployment manifest with the new version.

Using the editor, change the version.

Cloud Editor

It should look like:
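
Specifically, only the image tag in the container spec changes; a sketch (assuming a :2 build of the image was pushed, but any new tag works):

      containers:
      - name: timeserver-container
        image: us-west1-docker.pkg.dev/gke-autopilot-test/timeserver/timeserver:2   # previously :1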

Apply the change:

$ cd Chapter03/3.2_DeployingToKubernetes/
$ kubectl apply -f deploy.yaml
deployment.apps/timeserver configured

Get the status:

$ kubectl get deploy
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
timeserver   3/3     1            3           16m

Watch the rollout in real time:

watch -d kubectl get deploy

Or, printing the changes:

$ kubectl get deploy -w
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
timeserver   3/3     2            3           17m
timeserver   4/3     2            4           18m
timeserver   3/3     2            3           18m
timeserver   3/3     3            3           18m

You can also get the deployment by name (important if you have many deployments)

kubectl get deploy $DEPLOYMENT_NAME
$ kubectl get deploy timeserver
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
timeserver   3/3     3            3           19m

And view just the Pods from the Deployment by using their labels (useful when you have lots of Pods)

$ kubectl get pods --selector=pod=timeserver-pod
NAME                          READY   STATUS              RESTARTS   AGE
timeserver-7d778fcbc4-8cxwg   1/1     Running             0          2m5s
timeserver-f94cc5dd9-l56wg    1/1     Running             0          68s
timeserver-f94cc5dd9-qjksn    0/1     ContainerCreating   0          52s
timeserver-f94cc5dd9-xwbnx    1/1     Running             0          18m

3.2.8 Cleaning up

Delete the resources created so far:

$ kubectl delete deploy timeserver
deployment.apps "timeserver" deleted
$ kubectl delete service timeserver
service "timeserver" deleted
$ kubectl delete pod timeserver
pod "timeserver" deleted

Or, you can delete all Deployments, Services and Pods (but be careful!):

$ kubectl delete deploy,svc,pod --all
deployment.apps "timeserver" deleted
service "kubernetes" deleted
pod "timeserver-f94cc5dd9-9bm7q" deleted
pod "timeserver-f94cc5dd9-flcx5" deleted
pod "timeserver-f94cc5dd9-hcwjp" deleted
pod "timeserver-f94cc5dd9-qjksn" deleted

4.1.2 Adding a readiness probe

Add a readiness probe to avoid sending traffic to unready Pods.

Apply the change

cd Chapter04/4.1.2_Readiness
kubectl apply -f deploy.yaml
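
The addition to the container in the Pod template is a readinessProbe; a sketch (the path and timings are typical values, not necessarily the book's exact ones):

        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 30
          timeoutSeconds: 2
          successThreshold: 1
          failureThreshold: 1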

4.1.3 Adding a liveness probe

Add a liveness probe to restart broken Pods.

Apply the change

cd Chapter04/4.1.3_Liveness
kubectl apply -f deploy.yaml
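
The livenessProbe has the same shape, but failing it restarts the container rather than just removing it from the Service's endpoints (values again illustrative):

        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 30
          timeoutSeconds: 5
          failureThreshold: 3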

4.2.1 Rolling update strategy

Roll out changes to Deployments gradually with the rolling update strategy.

Figure 4.2 Pod status during a rolling update. With this strategy, requests can be served by either the old or the new version of the app until the rollout is complete.

4.2.2 Re-create strategy

Roll out changes to Deployments with the re-create strategy.

Figure 4.3 A Pod status during a rollout with the re-create strategy. During this type of rollout, the app will experience a period of total downtime and a period of degraded capacity.

4.2.3 Blue/green strategy

Rolling out with blue/green.

Figure 4.4 A Pod status during a blue/green rollout. Unlike the previous strategies, there are two action points where other systems, potentially including human actors, make decisions.
Figure 4.5 A single Service alternates between two Deployments, each with a different version of the container.

To test, first set up the two Deployments and the Service:

$ cd Chapter04/4.2.3_BlueGreen/
$ ls
deploy-blue.yaml  deploy-green.yaml  service.yaml
$ kubectl create -f .
deployment.apps/timeserver-blue created
deployment.apps/timeserver-green created
service/timeserver created
$ nano service.yaml 

Change blue to green

Nano
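
The edit switches the Service's selector from the blue Deployment's Pod label to the green one's; schematically (the exact label values here are assumptions, check service.yaml for the real ones):

  selector:
    pod: timeserver-pod-green   # was: timeserver-pod-blue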

Apply the change

$ kubectl apply -f service.yaml 
service/timeserver configured

The cutover should be immediate.

5.2 Calculating Pod resources

Adding resource requests and limits.

Deploy this

kubectl apply -f Chapter05/5.2_ResourceUsageTest/deploy.yaml
kubectl create -f Chapter03/3.2_DeployingToKubernetes/service.yaml
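
The Deployment under test declares resource requests (and, optionally, limits) on its container; the shape is as follows (the values here are illustrative, not the book's exact numbers):

        resources:
          requests:
            cpu: 200m
            memory: 250Mi
          limits:
            cpu: 300m
            memory: 400Mi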

Get the external IP

kubectl get svc
EXTERNAL_IP=203.0.113.16

Install Apache (for the ab benchmarking tool):

sudo apt-get update
sudo apt-get install apache2 -y

Cloud Shell will warn you that the session is ephemeral. That’s OK, it just means you have to repeat this anytime you want to run ab in a new session.

With Apache installed, you can now run the load test

ab -n 10000 -c 20 http://$EXTERNAL_IP/

In another tab, view the usage data:

$ kubectl top pod
NAME                         CPU(cores)   MEMORY(bytes)   
timeserver-dd88988f5-nll5l   1m           11Mi            

The values can be used as a baseline to understand what resources your container needs while under load.

6.1 Scaling Pods and nodes

Scaling Pods.

To scale Pods, update the replica count in the YAML and apply:
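
In the YAML, that's a one-line change to the Deployment's spec; for example:

spec:
  replicas: 6   # previously 3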

Or, imperatively with

kubectl scale deployment timeserver --replicas=6

Track the rollout:

$ kubectl get pods -w
NAME                         READY   STATUS    RESTARTS   AGE
timeserver-dd88988f5-4cshb   1/1     Running   0          42s
timeserver-dd88988f5-5b4r6   1/1     Running   0          84s
timeserver-dd88988f5-6czvt   0/1     Pending   0          42s
timeserver-dd88988f5-b6ph7   0/1     Pending   0          42s
timeserver-dd88988f5-m7xss   0/1     Pending   0          42s
timeserver-dd88988f5-qr94x   1/1     Running   0          42s

6.2 Horizontal Pod autoscaling

Autoscaling.

Create a HorizontalPodAutoscaler:
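
hpa.yaml has roughly this shape, matching the imperative command below (20% CPU utilization target, 1-10 replicas):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: timeserver
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: timeserver
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20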

Or imperatively

kubectl autoscale deployment timeserver --cpu-percent=20 --min=1 --max=10

To test the HPA, we need requests that consume more CPU. The updated server adds a CPU-intensive /pi endpoint (used by the load test below), and a new Deployment manifest references version 4 of the container image.

To update your existing deployment:

$ cd Chapter06/6.2_HPA
$ kubectl apply -f deploy.yaml

Or, create everything from scratch

$ cd Chapter06/6.2_HPA
$ kubectl create -f deploy.yaml -f service.yaml -f hpa.yaml
deployment.apps/timeserver created
service/timeserver created
horizontalpodautoscaler.autoscaling/timeserver created

Get the external IP

$ kubectl get svc -w
NAME         TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)        AGE
kubernetes   ClusterIP      34.118.224.1     <none>          443/TCP        79m
timeserver   LoadBalancer   34.118.238.176   35.233.155.58   80:32534/TCP   74m

Start a new load test

EXTERNAL_IP=203.0.113.16
ab -n 100000 -c 5 http://$EXTERNAL_IP/pi

(See 5.2 if you need to install ab.)

That will generate some load. Open some tabs and run

watch -d kubectl get deploy,hpa,pods

and

kubectl top pods

You should observe Pods being created. Once Apache bench has stopped sending load, the reverse should happen. Pods typically scale down slower than they scale up.

7.1.2 Creating an internal service

Let’s deploy a Robot avatar creation service to use as the internal service.

The Deployment and Service look like so:

$ cd Chapter07/7.1_InternalServices/
$ kubectl create -f robohash-deploy.yaml
deployment.apps/robohash created
$ kubectl create -f robohash-service.yaml 
service/robohash-internal created

Wait for the Pods to become ready

kubectl get pods -w

Then try it out locally

kubectl port-forward service/robohash-internal 8080:80

Append a string to the URL like /william to generate a random robot

Example robot (robot parts designed by Zikri Kader, assembled by Robohash.org, and licensed under CC-BY)

Next, we’ll use service discovery to use this internal service in our app.

7.1.3 Service discovery

Get the name of the internal service created in 7.1.2

$ kubectl get svc
NAME                TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)        AGE
kubernetes          ClusterIP      34.118.224.1     <none>          443/TCP        98m
robohash-internal   ClusterIP      34.118.234.111   <none>          80/TCP         6m28s
timeserver          LoadBalancer   34.118.238.176   35.233.155.58   80:32534/TCP   94m

We can reference this in the Deployment

And add a new endpoint, /avatar, that generates a robot avatar.
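
Conceptually, the handler just resolves the internal Service by its DNS name and proxies the result; a sketch of the idea (not the book's exact code; the environment variable name here is hypothetical):

import os
import urllib.request

# The Service's cluster-internal DNS name, e.g. "http://robohash-internal"
AVATAR_ENDPOINT = os.environ.get("AVATAR_ENDPOINT", "http://robohash-internal")

def get_avatar(name):
    # Fetch a robot image for the given string from the internal service
    with urllib.request.urlopen(f"{AVATAR_ENDPOINT}/{name}") as resp:
        return resp.read()   # PNG bytes to return from the /avatar endpoint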

Replace the Deployment:

$ cd Chapter07/7.1_InternalServices
$ kubectl replace -f  timeserver-deploy-dns.yaml

Create the service if needed

$ kubectl create -f timeserver-service.yaml
service/timeserver created

Wait for the new Pods to become ready

watch kubectl get deploy

Get the IP of our main service:

$ kubectl get svc/timeserver -w
NAME         TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)        AGE
timeserver   LoadBalancer   34.118.238.176   203.0.113.16   80:32534/TCP   95m

Append /avatar to the IP and browse to it. Refresh the page to generate different robots.

8.1.1 Node selectors

Apply the change

kubectl apply -f Chapter08/8.1.1_NodeSelection/deploy_nodeselector-autopilot.yaml
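
The change to the Pod template is a nodeSelector on the standard architecture label; roughly:

      nodeSelector:
        kubernetes.io/arch: arm64   # the architecture requested in this example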

Wait for the Pod. It will require a new node, since we're requiring arm64.

watch -d kubectl get pods,nodes

Verify by describing the node to which the Pod was assigned:

$ kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE    IP              NODE                                  NOMINATED NODE   READINESS GATES
robohash-7b48845fc-2rz8k      1/1     Running   0          3m3s   10.64.129.9     gk3-my-cluster-pool-2-801f6e81-f6tm   <none>           <none>
timeserver-8669c964f8-x8mnt   1/1     Running   0          3m3s   10.64.129.221   gk3-my-cluster-pool-2-f010980d-kftw   <none>           <none>
$ kubectl describe node gk3-my-cluster-pool-2-f010980d-kftw  | grep arch
                    beta.kubernetes.io/arch=amd64
                    kubernetes.io/arch=amd64

If the kubernetes.io/arch label matches the architecture you requested, the Pod was assigned correctly.

9.1.2 Persistent volumes and claims

Figure 9.2 A Pod that references a PersistentVolumeClaim that gets bound to a PersistentVolume, which references a disk
Figure 9.3 The lifecycle of a PersistentVolumeClaim and PersistentVolume in a dynamically provisioned system
$ kubectl create -f Chapter09/9.1.2_PersistentVolume/pvc-mariadb.yaml
pod/mariadb-demo created
persistentvolumeclaim/mariadb-pv-claim created
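
The PVC half of that manifest looks roughly like this, sized and classed to match the output below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mariadb-pv-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: standard-rwo   # GKE's default dynamic-provisioning class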

Wait for the Pod, and the PVC to be created

$ kubectl get pvc -w
NAME               STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mariadb-pv-claim   Pending                                      standard-rwo   36s
mariadb-pv-claim   Pending                                      standard-rwo   68s
mariadb-pv-claim   Pending                                      standard-rwo   68s
mariadb-pv-claim   Pending   pvc-45271397-f998-4ba4-9657-8df3fa0b1427   0                         standard-rwo   87s
mariadb-pv-claim   Bound     pvc-45271397-f998-4ba4-9657-8df3fa0b1427   2Gi        RWO            standard-rwo   87s

View the associated PV:

$ kubectl get -o yaml pv pvc-45271397-f998-4ba4-9657-8df3fa0b1427
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2023-11-12T21:12:46Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-attacher/pd-csi-storage-gke-io
  name: pvc-45271397-f998-4ba4-9657-8df3fa0b1427
  resourceVersion: "583596"
  uid: 9afc9344-16d1-4a25-8edb-bcc4a0df959f
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: mariadb-pv-claim
    namespace: default
    resourceVersion: "583340"
    uid: 45271397-f998-4ba4-9657-8df3fa0b1427
  csi:
    driver: pd.csi.storage.gke.io
    fsType: ext4
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1699782108094-9721-pd.csi.storage.gke.io
    volumeHandle: projects/gke-autopilot-test/zones/us-west1-b/disks/pvc-45271397-f998-4ba4-9657-8df3fa0b1427
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.gke.io/zone
          operator: In
          values:
          - us-west1-b
  persistentVolumeReclaimPolicy: Delete
  storageClassName: standard-rwo
  volumeMode: Filesystem
status:
  phase: Bound
Figure 9.4 The PersistentVolumeClaim and PersistentVolume after the latter was provisioned, and they were bound together

Now the fun part, try it out:

kubectl port-forward pod/mariadb-demo 3306:3306
docker run --net=host -it --rm mariadb mariadb -h localhost -P 3306 \
-u root -p

When prompted, enter the database password configured in the Pod's manifest.

Run an example query:

MariaDB [(none)]> SELECT user, host FROM mysql.user;
+-------------+-----------+
| User        | Host      |
+-------------+-----------+
| root        | %         |
| healthcheck | 127.0.0.1 |
| healthcheck | ::1       |
| healthcheck | localhost |
| mariadb.sys | localhost |
| root        | localhost |
+-------------+-----------+
6 rows in set (0.007 sec)

MariaDB [(none)]> CREATE DATABASE foo;
Query OK, 1 row affected (0.004 sec)

MariaDB [(none)]> exit
Bye

Cleanup

kubectl delete pod/mariadb-demo
kubectl delete pvc/mariadb-pv-claim

9.2.1 Deploying StatefulSet

MariaDB

We can wrap the Pod from 9.1.2 into a StatefulSet:

kubectl create -f Chapter09/9.2.1_StatefulSet_MariaDB/mariadb-statefulset.yaml
$ kubectl get pvc
NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mariadb-pvc-mariadb-0   Bound    pvc-baaf6655-992e-48f0-8912-cfa28fd70cd8   2Gi        RWO            standard-rwo   7s

This time let’s test it with an ephemeral container running in the cluster, rather than locally via Docker.

kubectl run my -it --rm --restart=Never --pod-running-timeout=3m \
--image mariadb -- mariadb -h mariadb-0.mariadb-service -P 3306 -u root -p

Since it's scheduled in the cluster, it may take a moment to spin up; watch it in a separate tab:

$ kubectl get pod/my -w
NAME   READY   STATUS    RESTARTS   AGE
my     0/1     Pending   0          16s
my     0/1     Pending   0          57s
my     0/1     ContainerCreating   0          57s
my     1/1     Running             0          2m13s

When it's Running, you'll see the prompt below. Don't press Enter (that sends an empty string as the password); instead, type your database password, as configured in the YAML, and test the DB as before.

If you don't see a command prompt, try pressing enter.

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204 mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> CREATE DATABASE foo;
Query OK, 1 row affected (0.002 sec)

MariaDB [(none)]> exit
Bye
pod "my" deleted

The ephemeral Pod is automatically deleted once you exit.

Redis

Another example is Redis. For this one we need to configure a file on the container, which we can do with a ConfigMap

And here’s the Redis StatefulSet that references the ConfigMap
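
The wiring between the two is the usual ConfigMap-as-volume pattern; a sketch of the relevant part of the Pod template (the file path and volume name here are illustrative):

      containers:
      - name: redis-container
        image: redis
        command: ["redis-server", "/redis/conf/redis.conf"]
        volumeMounts:
        - name: redis-config-volume
          mountPath: /redis/conf
      volumes:
      - name: redis-config-volume
        configMap:
          name: redis-config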

Create both

kubectl create -f Chapter09/9.2.1_StatefulSet_Redis/

Once it’s ready, we can connect via the port-forward plus Docker method, like so:

$ kubectl get pods -w
NAME                          READY   STATUS    RESTARTS   AGE
redis-0                       0/1     Pending   0          12s
redis-0                       0/1     Pending   0          78s
redis-0                       0/1     Pending   0          92s
redis-0                       0/1     ContainerCreating   0          92s
redis-0                       1/1     Running             0          2m8s

$ kubectl port-forward pod/redis-0 6379:6379
Forwarding from 127.0.0.1:6379 -> 6379
Handling connection for 6379

In a separate tab:

$ docker run --net=host -it --rm redis redis-cli
127.0.0.1:6379> INFO
# Server
redis_version:7.2.1
127.0.0.1:6379> exit

Note that Redis is configured with no password, as a purely internal service.

Cleanup

kubectl delete -f Chapter09/9.2.1_StatefulSet_MariaDB
kubectl delete -f Chapter09/9.2.1_StatefulSet_Redis

9.2.2 Deploying a multirole StatefulSet

Here's the multirole version of the Redis StatefulSet, which again references a ConfigMap and this time runs a primary plus read replicas. Create it:

kubectl create -f Chapter09/9.2.2_StatefulSet_Redis_Multi
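
The StatefulSet gives each Pod a stable ordinal (redis-0, redis-1, ...), and the init-redis container (visible in the exec output below) can use that to decide whether the Pod runs as the primary or a replica. The script here is only a sketch of that idea, with assumed file names and paths, not the repo's actual init script.

# Choose primary vs. replica config based on the Pod's ordinal.
set -e
ORDINAL="${HOSTNAME##*-}"                        # "0" for redis-0, "1" for redis-1, ...
if [ "$ORDINAL" = "0" ]; then
  cp /config/primary.conf /etc/redis/redis.conf  # primary: accepts writes
else
  cp /config/replica.conf /etc/redis/redis.conf  # replica config points at the primary, e.g.:
                                                 # replicaof redis-0.redis-service 6379
fi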

This time, we'll get three Pods: the primary and two replicas:

$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
redis-0                       1/1     Running   0          57s
redis-1                       1/1     Running   0          45s
redis-2                       0/1     Pending   0          26s

Instead of running a local Docker container, we can also exec directly into the Pods to test.

Connecting to the primary, we can write data:

$ kubectl exec -it redis-0 -- redis-cli
127.0.0.1:6379> SET capital:australia "Canberra"
OK
127.0.0.1:6379> exit

Connecting to one of the replicas, you can read data

$ kubectl exec -it redis-1 -- redis-cli
Defaulted container "redis-container" out of: redis-container, init-redis (init)
127.0.0.1:6379> GET capital:australia
"Canberra"

But not write

127.0.0.1:6379> SET capital:usa "Washington"
(error) READONLY You can't write against a read only replica.
127.0.0.1:6379> exit

9.3 Migrating/recovering disks

Create the Redis StatefulSet from 9.2.2 if it’s not already running.

kubectl create -f Chapter09/9.2.2_StatefulSet_Redis_Multi

Add some data

kubectl exec -it redis-0 -- redis-cli
127.0.0.1:6379> SET capital:australia "Canberra"
OK
127.0.0.1:6379> SET capital:usa "Washington"
OK
127.0.0.1:6379> exit

Now delete and re-create the StatefulSet (the PVCs are left in place):

$ cd Chapter09/9.2.2_StatefulSet_Redis_Multi/
$ kubectl delete -f redis-statefulset.yaml
service "redis-service" deleted
statefulset.apps "redis" deleted
$ kubectl create -f redis-statefulset.yaml
service/redis-service created
statefulset.apps/redis created

Once the Pods are ready:

$ kubectl get pods -w

You can see that the data is still there.

$ kubectl exec -it redis-0 -- redis-cli
127.0.0.1:6379> GET capital:usa
"Washington"

Note that this test didn't delete the PV or PVC, though. The next steps cover recovering when those objects are deleted too.

Deleting the PV

Update Reclaim policy

To be able to recover the data, the PV's reclaim policy needs to be set to Retain (in production, that's what you should always do). For this test we need to update it, since the default here is Delete.

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
pvc-40c8cdf9-f97d-4b7b-9beb-ef783b1425c5   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-1   standard-rwo            3m42s
pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-0   standard-rwo            5m32s
pvc-b25dc634-e9b3-433f-b600-109272331761   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-2   standard-rwo            25s

Look for the PV that is associated with the CLAIM named default/redis-pvc-redis-0:

$ PV_NAME=pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
$ kubectl edit pv $PV_NAME

Find

  persistentVolumeReclaimPolicy: Delete

and change to

  persistentVolumeReclaimPolicy: Retain
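
If you prefer a one-liner over the interactive editor, a patch along these lines should achieve the same change (reusing the PV_NAME variable set above):

kubectl patch pv "$PV_NAME" \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'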

Now it should look like this

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
pvc-40c8cdf9-f97d-4b7b-9beb-ef783b1425c5   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-1   standard-rwo            6m2s
pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            Retain           Bound    default/redis-pvc-redis-0   standard-rwo            7m52s
pvc-b25dc634-e9b3-433f-b600-109272331761   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-2   standard-rwo            2m45s

Save Objects

This isn't strictly needed, but it saves us from having to re-create all the configuration by hand later.

$ kubectl get pvc,pv
NAME                                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/redis-pvc-redis-0   Bound    pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            standard-rwo   8m21s
persistentvolumeclaim/redis-pvc-redis-1   Bound    pvc-40c8cdf9-f97d-4b7b-9beb-ef783b1425c5   1Gi        RWO            standard-rwo   8m8s
persistentvolumeclaim/redis-pvc-redis-2   Bound    pvc-b25dc634-e9b3-433f-b600-109272331761   1Gi        RWO            standard-rwo   4m48s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
persistentvolume/pvc-40c8cdf9-f97d-4b7b-9beb-ef783b1425c5   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-1   standard-rwo            6m27s
persistentvolume/pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            Retain           Bound    default/redis-pvc-redis-0   standard-rwo            8m17s
persistentvolume/pvc-b25dc634-e9b3-433f-b600-109272331761   1Gi        RWO            Delete           Bound    default/redis-pvc-redis-2   standard-rwo            3m10s
$ kubectl get -o yaml persistentvolumeclaim/redis-pvc-redis-0 > pvc.yaml
$ PV_NAME=pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
$ kubectl get -o yaml persistentvolume/$PV_NAME > pv.yaml

Delete

With the reclaim policy set correctly and the config saved, delete the objects. Note that the command below deletes all StatefulSets, PVCs, and PVs in the current namespace, so only run it in a test environment.

kubectl delete statefulset,pvc,pv --all

Re-create

Edit pv.yaml and make 2 changes (a sketch of the result follows this list):

  1. Remove the uid field from the claimRef section (the claimRef is the pointer to the PVC; the problem is that the re-created PVC will have a new uid).
  2. Set the storageClassName to the empty string "" (we're manually provisioning and don't want to use a storageClass).
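
Your saved pv.yaml will contain more fields than shown here (notably the csi volume source and other metadata), which should be left as-is. After the two edits, the parts that matter should look roughly like this sketch:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: redis-pvc-redis-0
    namespace: default
    # uid removed (step 1) so the PV can bind to the re-created PVC
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""                # step 2
  volumeMode: Filesystem
  # csi: ... (keep the volume source from your saved file)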

For reference, here is the saved pvc.yaml before editing:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: gk3-my-cluster-pool-2-801f6e81-zm87
    volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
  creationTimestamp: "2023-11-13T00:14:31Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: redis-sts
  name: redis-pvc-redis-0
  namespace: default
  resourceVersion: "715859"
  uid: 4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-rwo
  volumeMode: Filesystem
  volumeName: pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound

Next, edit the pvc.yaml and make 2 changes:

  1. Delete the annotation pv.kubernetes.io/bind-completed: "yes" (this PVC needs to be re-bound, and that annotation would prevent it).
  2. Set the storageClassName to the empty string "" (same reason as the previous step).

After the edits, pvc.yaml should look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: gk3-my-cluster-pool-2-801f6e81-zm87
    volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
  creationTimestamp: "2023-11-13T00:14:31Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: redis-sts
  name: redis-pvc-redis-0
  namespace: default
  resourceVersion: "715859"
  uid: 4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound

Unlike before, the PV and PVC are now "prelinked" (each references the other: the PV's claimRef names the PVC, and the PVC's volumeName names the PV).

9.7 Prelinked PVC and PV objects
Figure 9.7 Prelinked PVC and PV objects

Now recreate

There should be no objects currently:

$ kubectl get sts,pv,pvc
No resources found
$ kubectl create -f pv.yaml
persistentvolume/pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5 created

$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                       STORAGECLASS   REASON   AGE
pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            Retain           Available   default/redis-pvc-redis-0                           5s

$ kubectl create -f pvc.yaml
persistentvolumeclaim/redis-pvc-redis-0 created

$ kubectl get pv,pvc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
persistentvolume/pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO            Retain           Bound    default/redis-pvc-redis-0                           14s

NAME                                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/redis-pvc-redis-0   Bound    pvc-4d2a4af8-5a4a-4e9e-849d-7721b03e8fd5   1Gi        RWO                           5s

$ kubectl create -f redis-statefulset.yaml
statefulset.apps/redis created
Error from server (AlreadyExists): error when creating "redis-statefulset.yaml": services "redis-service" already exists

The AlreadyExists error for redis-service is expected: the earlier cleanup deleted only the StatefulSet, PVCs, and PVs, so the Service is still running.

Wait for the Pods to become ready:

$ kubectl get pods
NAME                          READY   STATUS    RESTARTS   AGE
redis-0                       1/1     Running   0          2m49s
redis-1                       1/1     Running   0          2m36s
redis-2                       0/1     Pending   0          16s

Read back the data. We can use a read-replica (whose data we didn’t restore) rather than the primary, to also verify Redis’ own data replication.

$ kubectl exec -it redis-1 -- redis-cli
127.0.0.1:6379> GET capital:australia
"Canberra"