Introduction to k8s

Denys Savchenko

APC - Université Paris Cité / IN2P3

March 6, 2025

Kubernetes vs Docker

  • Docker is a containerization platform and runtime. K8s is a container orchestration platform; it may use Docker as the underlying container runtime (or containerd, CRI-O, etc.).
  • k8s pros:
    • coordinates and schedules containers across multiple servers
    • extensive API (kubectl cli, language bindings); imperative commands + declarative config
    • container grouping into pods
    • container health monitoring, self-healing
    • abstraction of networking, persistent volumes etc., cloud provider integrations
    • extensible, rich ecosystem
  • k8s cons: complicated

Cluster Architecture

Based on the official Kubernetes documentation

Control plane components

  • kube-apiserver exposes k8s API
  • etcd – consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data
  • kube-scheduler – control plane component that watches for newly created Pods with no assigned node, and selects a node for them to run on
  • kube-controller-manager runs controller processes. A controller is a control loop that reconciles the current state of the cluster with the desired one
  • cloud-controller-manager embeds cloud-provider-specific control logic, linking the cluster to the cloud provider's API

Node components

  • kubelet – takes a set of PodSpecs that are provided through various mechanisms and ensures that the containers described in those PodSpecs are running and healthy
  • kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster (optional if a network plugin implements functionality)
  • container runtime – containerd/CRI-O/Docker

Kubernetes workload resources

pods

Pods are the basic workload unit of k8s. A Pod represents one or more containers that work together.

  • containers in pod share storage and network resources
  • may wrap one container or several:
    • init containers
    • sidecar containers
    • ephemeral containers
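
As a sketch (names are illustrative), a Pod whose init container prepares content for the main container through a shared emptyDir volume:

init-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  # init containers run to completion, in order, before the main containers start
  initContainers:
  - name: prepare-content
    image: busybox:1.36
    command: ['sh', '-c', 'echo "hello from init" > /work/index.html']
    volumeMounts:
    - name: work
      mountPath: /work
  containers:
  - name: nginx
    image: nginx:1.14.2
    volumeMounts:
    - name: work
      mountPath: /usr/share/nginx/html   # nginx serves the file written by the init container
  volumes:
  - name: work
    emptyDir: {}   # shared between the init and main containers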

Minimal pod definition example

simple-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

Apply the declarative definition

kubectl apply -f https://k8s.io/examples/pods/simple-pod.yaml

Note

Pods are rarely created directly. They are normally managed by workload resources.

Deployment

The most common way to run applications. Best fit for stateless, scalable applications.

  • Deployment rolls out a ReplicaSet, which manages copies of Pods in the background.
  • Each change in Deployment is saved in etcd as a revision
    • may rollback to earlier revision
  • Allows scaling the number of Pods up and down
  • Updating the Pod template (e.g. the image) triggers a rollout, which creates a new ReplicaSet

Example Deployment

nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

Other workloads

  • StatefulSet is used to manage stateful applications (e.g. databases)
    • sticky identity for pods (stable names / network identifiers)
    • ordered deployment/scaling
  • DaemonSet runs a copy of a Pod on every (or defined set of) node of a cluster. Used for daemons, add-ons, helper tools necessary for operation.
  • Jobs represent one-off tasks that run to completion and then stop
  • CronJobs are scheduled Jobs
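
For example, a minimal CronJob sketch (name and schedule are illustrative) that runs a Job every day at 03:00:

report-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 3 * * *"        # standard cron syntax
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: report
            image: busybox:1.36
            command: ['sh', '-c', 'date; echo generating report']
          restartPolicy: OnFailure   # Job Pods require OnFailure or Never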

Kubernetes networking

  • Each Pod has a unique cluster-wide IP (which changes when the Pod is recreated)
    • inside a Pod there is a local private network; processes in different containers can communicate over localhost
  • Service API provides a stable, long-lived IP address / hostname for an application
    • ClusterIP
    • NodePort
    • LoadBalancer (if supported by cloud provider)
    • ExternalName
  • Ingress API (superseded by the Gateway API) manages external HTTP/HTTPS access to the services
    • uses dedicated add-on (nginx/traefik)
    • load-balancing / SSL termination
    • understands URIs, hostnames / paths …

Service example

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app.kubernetes.io/name: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

Note

  • Matches pod label with selector
  • ClusterIP by default

Ingress
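
A minimal Ingress sketch (the hostname is illustrative; assumes an nginx ingress controller add-on is installed) that routes external HTTP traffic to the my-service Service from the previous example:

my-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.example.com    # hostname-based routing
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service   # the Service shown earlier
            port:
              number: 80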

Configuration resources

ConfigMap

ConfigMap stores configuration data separately from application code

apiVersion: v1
kind: ConfigMap
metadata:
  name: game-demo
data:
  config-option0: value0
  config-option1: value1
binaryData:
  image-file.jpg: <base64 encoded>

Exposed to pod as:

  • environment variables
  • volume: dir / file mount
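
Both mechanisms, sketched for the game-demo ConfigMap above (Pod and variable names are illustrative):

configmap-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: game-demo-pod
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ['sh', '-c', 'echo "$CONFIG_OPTION0"; cat /config/config-option1; sleep 3600']
    env:
    - name: CONFIG_OPTION0        # a single key exposed as an environment variable
      valueFrom:
        configMapKeyRef:
          name: game-demo
          key: config-option0
    volumeMounts:
    - name: config
      mountPath: /config          # each key becomes a file in this directory
      readOnly: true
  volumes:
  - name: config
    configMap:
      name: game-demo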

Secrets

Secrets store sensitive data (credentials, keys, tokens). Otherwise similar to ConfigMap.

Different types of Secrets, e.g.

  • Opaque default general-use for arbitrary data
  • kubernetes.io/ssh-auth ssh credentials
  • kubernetes.io/tls data for a TLS client or server
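
A minimal Opaque Secret sketch (names and values are illustrative). stringData accepts plain text; the API server base64-encodes it into data:

db-credentials.yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: admin
  password: s3cr3t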

Warning

Kubernetes Secrets are, by default, stored unencrypted in the API server’s underlying data store (etcd). Anyone with API access can retrieve or modify a Secret, and so can anyone with access to etcd.

Additional configuration is needed to effectively protect Secrets in multi-user production deployments.

Storage

Volumes

Volumes give containers access to storage beyond their ephemeral writable layer. Typical uses:

  • populating a configuration file based on a ConfigMap or a Secret
  • providing some temporary scratch space for a pod
  • sharing a filesystem between two different containers in the same pod
  • sharing a filesystem between two different pods (even if those Pods run on different nodes)
  • durably storing data so that it stays available even if the Pod restarts or is replaced
  • passing configuration information to an app running in a container, based on details of the Pod the container is in (for example: telling a sidecar container what namespace the Pod is running in)
  • providing read-only access to data in a different container image
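
For instance, scratch space shared between two containers in the same Pod can be sketched with an emptyDir volume (names are illustrative):

shared-scratch.yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch
spec:
  containers:
  - name: writer
    image: busybox:1.36
    command: ['sh', '-c', 'while true; do date >> /data/log; sleep 5; done']
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox:1.36
    command: ['sh', '-c', 'sleep 5; tail -f /data/log']
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}   # temporary scratch space, deleted together with the Pod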

PersistentVolumeClaim

Volumes can be statically created by the cluster admin, but more often they are dynamically provisioned:

  • StorageClasses are pre-created by the cluster admin and determine the mechanism for PersistentVolume creation; a provisioner add-on creates the volumes.
    • local-path / cloud-backed add-ons (e.g. cinder-csi with OpenStack) / nfs storage provider / Longhorn
  • User defines PersistentVolumeClaim asking to create a volume with specific StorageClass

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 30Gi
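
The claim can then be mounted by a Pod like any other volume (Pod name and mount path are illustrative):

pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
  - name: app
    image: nginx:1.14.2
    volumeMounts:
    - name: data
      mountPath: /var/lib/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: claim1   # the claim defined above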