The Kubernetes Book
Notes
Chapter 1: Kubernetes Primer
My mental model for OCI before this book was that it was a spec for containers, and Docker and containerd were implementations of this spec. A couple of questions I have: what does “Runtime” actually mean for OCI? I didn’t know this initiative had anything to do with runtimes; I thought it was more about building images. I also wonder how one implementation of the spec can be bloated compared to another, and how one can be better for K8s. Lots of questions I hope this book will clear up later on.
Kubernetes 1.24 finally removed support for Docker as a runtime, as it was bloated and overkill for what Kubernetes needed. Since then, most new Kubernetes clusters ship with containerd as the default runtime. Fortunately, containerd is a stripped-down version of Docker optimized for Kubernetes that fully supports applications containerized by Docker. In fact, Docker, containerd, and Kubernetes all work with images and containers that implement the Open Container Initiative (OCI) standards.
Chapter 2: Kubernetes principles of operation
Kubernetes can be both
- A cluster
- An orchestrator
Kubernetes cluster is one or more nodes providing CPU, memory and other resources for application use.
The two nodes types:
- Control plane nodes implement the K8s intelligence and every cluster needs at least one but it’s recommended to have 3 or 5 for HA. These nodes must run Linux.
- Worker nodes are where you run your applications
You can run applications on control plane nodes as well which is common for development and test environments.
Kubernetes: Orchestrator system that deploys and manages applications
The control plane is a collection of system services that implement the brains of Kubernetes. It exposes the API, schedules apps, implements self-healing, manages scaling operations, and more.
The API server is the front end of Kubernetes, and all commands and requests go through it. Even internal control plane services communicate with each other via the API server
The cluster store holds the desired state of all applications and cluster components, and it’s the only stateful part of the control plane.
It’s based on the etcd distributed database, and most Kubernetes clusters run an etcd replica on every control plane node for HA.
Kubernetes uses controllers to implement most of the cluster intelligence. Each controller runs as a process on the control plane, and some of the more common ones include:
- The Deployment controller
- The StatefulSet controller
- The ReplicaSet controller
Mental model: think of a controller as the thing that ensures the cluster runs what you asked it to run, e.g. if you asked for 3 replicas of your app, a controller will ensure 3 are healthy and take action if they aren’t.
The controller manager is what manages the controllers, a controller for the controllers ha.
The scheduler watches the API server for new work tasks and assigns them to healthy worker nodes by
- Watching the API server for new tasks
- Identify capable nodes
- Assign tasks to nodes
- ❓ One thing I am confused about: what is an actual “node”? Is it a full VM? The book mentions that if the cluster has node autoscaling, it will trigger an autoscaling event if there is a task it can’t find a suitable node for. When this new worker node spins up, what exactly is spinning up here?
Kubelet is the main Kubernetes agent and handles all communication with the cluster
- Interesting, every worker node has one or more runtimes for pulling images and managing lifecycle operations such as starting and stopping containers. I would have thought this was a 1:1 mapping
Each worker node runs kube-proxy to implement cluster networking and load balance traffic to tasks running on the node.
Got it, so each worker node can have multiple tasks running. Still haven’t heard the term “pod” yet; I wonder if “task” and “pod” are equivalent?
Lol, the very next sentence just mentions it. Defined as
For now, think of Pods as a thin wrapper that abstracts different types of tasks so they can run on Kubernetes. … Right now you only need to know two things: 1. Apps need to be wrapped in Pods to run on Kubernetes 2. Pods get wrapped in higher-level controllers for advanced features
- Based on the image provided in the book it appears you can have one or more containers within a pod. It’s almost like pod is a “docker compose lite”.
The important thing to understand is that each layer of wrapping adds something:
- The container wraps the app and provides dependencies
- The Pod wraps the container so it can run on Kubernetes
❓ I don’t quite get why there was a need to have a “Pod” concept to run on Kubernetes, I thought that was the whole point of containers?
Another way to think of what the controller does is the reconciliation between the observed state and the desired state
Manifest files in YAML are how you tell K8s what the desired state should look like.
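A minimal sketch of what such a manifest looks like (the names and image here are hypothetical, not from the book):

```yaml
# Hypothetical Pod manifest describing a desired state: one nginx container
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    app: hello
spec:
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
```

You feed this to the cluster with `kubectl apply -f pod.yml` and the controllers reconcile toward it.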
Kubernetes does support the imperative model, but the declarative model is much more popular
Containers that are within the same pod communicate over localhost.
Because pods are immutable, the only way to update them is by deleting the current one and adding a new one. This causes IP churn and makes it hard for clients to reliably connect to individual pods. This is where Services come in, as they provide a stable DNS name, IP address, and network port, and load balance requests to the pod(s) behind them.
Chapter 3: Getting Kubernetes
Your kubeconfig file is called config and lives in your home directory’s hidden .kube folder. It defines:
- Clusters
- Users (credentials)
- Contexts
- Current context
The clusters section is a list of known Kubernetes clusters, the users section is a list of user credentials, and the contexts section is where you match clusters and credentials.
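A rough sketch of that file’s shape (all names, addresses, and credentials here are made up):

```yaml
# Hypothetical kubeconfig showing how clusters, users, and contexts relate
apiVersion: v1
kind: Config
clusters:
  - name: dev-cluster
    cluster:
      server: https://203.0.113.10:6443   # API server endpoint
users:
  - name: dev-user
    user:
      token: REDACTED                     # credentials for this user
contexts:
  - name: dev                             # a context pairs a cluster with a user
    context:
      cluster: dev-cluster
      user: dev-user
current-context: dev                      # the context kubectl uses by default
```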
- Basic chapter that walks you through setting up K8s locally and on Linode.
Chapter 4: Working with Pods
- While pods can have multiple containers, it appears to me pods are the smallest unit you operate on, e.g. deploying, terminating, scaling, and updating all happen at the pod level.
Pods abstract the workload details. This means you can run containers, VMs, serverless functions and Wasm apps inside Pods, and Kubernetes doesn’t know the difference.
- ❓ I don’t get this, I thought pods had to be an OCI-compliant image?
Containers and Wasm apps work with standard Pods, standard workload controllers, and standard runtimes. However, serverless functions and VMs need a bit of help. Serverless functions run in standard Pods but require apps like Knative to extend the Kubernetes API with custom resources and controllers. VMs are similar and need apps like KubeVirt to extend the API. VM workloads run in a VirtualMachineInstance (VMI) rather than a Pod.
- Question answered right in the next paragraph 😁
Pods run one or more containers, and all containers in the same pod share the pod’s execution environment. This includes:
- Shared filesystem and volumes (mnt namespace)
- Shared network stack (net namespace)
- Shared memory (IPC namespace)
- Shared process tree (pid namespace)
- Shared hostname (uts namespace)
Nodes are host servers that can be physical servers, VMs, or cloud instances. Pods wrap containers and execute on nodes.
Kubernetes schedules all containers in the same Pod to the same cluster node. Despite this, you should only put containers in the same Pod if they need to share resources… If your only requirement is to schedule two workloads to the same node, you can use one of the following options:
- nodeSelectors
- Affinity and anti-affinity
- Topology spread constraints
- Resource requests and resource limits
- Resource requests and resource limits tell the scheduler how much CPU and memory a pod needs.
- Restart policy is at the container level and doesn’t require creating and deleting a new pod. Pods never get restarted.
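Sketching a couple of these in a Pod spec (the labels and numbers are made up for illustration): a nodeSelector steers the Pod to matching nodes, and requests/limits tell the scheduler what capacity to find.

```yaml
# Hypothetical Pod combining a nodeSelector with resource requests/limits
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo
spec:
  nodeSelector:
    disktype: ssd            # only nodes labeled disktype=ssd are candidates
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:            # what the scheduler uses to find a node with spare capacity
          cpu: 250m
          memory: 128Mi
        limits:              # hard caps enforced at runtime
          cpu: 500m
          memory: 256Mi
```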
- ❓ I don’t understand what a workload resource is versus a controller
Every Kubernetes cluster runs a pod network and automatically connects all Pods to it. It’s usually a flat Layer-2 overlay network that spans all cluster nodes and allows every Pod to talk directly to every other pod.
Your pod network is implemented by a third-party plugin that interfaces with Kubernetes via the Container Network Interface (CNI).
Init containers are a special type of container defined in the Kubernetes API. You run them in the same Pod as application containers, but Kubernetes guarantees they’ll start and complete before the main app container starts. It also guarantees they’ll only run once.
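A sketch of what that looks like in a manifest (the wait-for-db pattern and names are hypothetical): the init container must exit successfully before the app container starts.

```yaml
# Hypothetical Pod with an init container that blocks until a Service resolves
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  initContainers:
    - name: wait-for-db        # runs once, to completion, before the app starts
      image: busybox:1.36
      command: ['sh', '-c', 'until nslookup my-db; do sleep 2; done']
  containers:
    - name: app
      image: nginx:1.25
```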
Sidecar container adds functionality to an application without having to add it to the application container. Examples include scrape logs, monitor & sync remote content, broker connections, munge data, and encrypt network traffic.
Two ways to execute commands inside containers:
- Remote command execution lets you send commands to a container from your local shell
- Exec session connects your local shell to the container’s shell
- A container’s HOSTNAME will match the name of the pod
Chapter 5: Virtual clusters with Namespaces
Namespaces are a way of dividing a Kubernetes cluster into multiple virtual clusters
- The predefined namespaces are
- default where new objects go if no namespace is specified.
- kube-system where control plane components go such as internal DNS and metrics server.
- kube-public for objects that need to be readable by anyone.
- kube-node-lease is used for node heartbeats and managing node leases.
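A Namespace is itself just an object you can create from a manifest (the name here is made up):

```yaml
# Hypothetical Namespace manifest
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```

Objects created with `-n dev` (or with `metadata.namespace: dev` in their manifest) land in it instead of default.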
Chapter 6: Kubernetes Deployments
Deployments follow the standard Kubernetes architecture comprising:
- A resource, which defines objects. The Deployment resource exists in the apps/v1 API and defines all supported attributes and capabilities.
- A controller, which manages resources. The Deployment controller is a control plane service that watches Deployments and reconciles observed state with desired state.
I’m still a little fuzzy about this “resource” concept. Also, not exactly sure what an “object” is either. I think the most difficult part about learning K8s is all the jargon. My basic mental model for now, which might not be correct: a resource is basically any value that can go into the kind field in the manifest, and an object is a specific record of that type stored in the cluster.
Oh my god, another term, “ReplicaSet”, which is apparently managed by the Deployment. So the Deployment YAML creates a Deployment, ReplicaSet, and Pods. The Pods are managed by the ReplicaSet, which in turn is managed by the Deployment. All management is done via the Deployment though; you never interface with the ReplicaSet or Pods directly.
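Sketch of what that one YAML looks like (names and image hypothetical): the template section is the Pod spec that the ReplicaSet stamps out.

```yaml
# Hypothetical Deployment: one manifest that yields a Deployment, a ReplicaSet, and 3 Pods
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deploy
spec:
  replicas: 3                  # ReplicaSet keeps 3 copies of the Pod template running
  selector:
    matchLabels:
      app: hello               # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: web
          image: nginx:1.25
```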
The various K8s autoscalers
- Horizontal Pod Autoscaler (HPA): adds and removes Pods to meet current demand.
- Cluster Autoscaler (CA): adds and removes cluster nodes so you always have enough to run all scheduled Pods.
- Vertical Pod Autoscaler (VPA): increases and decreases the CPU and memory allocated to running Pods to meet current demand.
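For example, an HPA targeting a Deployment might look roughly like this (the target name and numbers are made up):

```yaml
# Hypothetical HPA: scale hello-deploy between 2 and 10 Pods to hold ~70% CPU
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-hpa
spec:
  scaleTargetRef:              # which workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: hello-deploy
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```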
I can’t believe how simple it is to do rolling deployments on K8s. It’s actually what I wanted for pypacktrends, but I did a bunch of bash/docker hacks to spin up a new container, update Caddy to point to the new container name, and remove the old one. It works, but the K8s model is way cleaner.
Chapter 7: Kubernetes Services
- Labels and Selectors I am also struggling with grokking. Conceptually I know it’s a way to name a Resource so that Deployments and Services know what to operate on, but something isn’t clicking for me…. wait, are labels what you put on the Pods, and selectors what you put on other resources that operate on Pods?
Whenever you create a Service, the EndpointSlice controller automatically creates an associated EndpointSlice object to track healthy Pods with matching labels. Kubernetes then watches the cluster for Pods that match the Service’s label selector. Any new Pods matching the selector get added to the EndpointSlice, whereas any deleted Pods are removed. Applications send traffic to the Service name, and the application’s container uses the cluster DNS to resolve the name to the Service’s IP address. The container then sends the traffic to the Service’s IP address, and the Service forwards it to one of the Pods listed in the EndpointSlice.
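The label/selector pairing as I understand it (names are hypothetical): labels go on the Pods, and the Service’s selector names the labels it is looking for.

```yaml
# Hypothetical Pod carrying a label...
apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web            # label on the Pod
spec:
  containers:
    - name: web
      image: nginx:1.25
---
# ...and a Service whose selector matches that label
apiVersion: v1
kind: Service
metadata:
  name: web-svc
spec:
  selector:
    app: web            # selector: send traffic to Pods labeled app=web
  ports:
    - port: 80
```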
- Service Types:
- ClusterIP: most basic and provides reliable endpoint (name, IP, and port) on the internal Pod network. Only accessible from inside the cluster.
- NodePort: builds on top of ClusterIP and allows clients to connect via a port on every cluster node.
- Uses high-numbered ports between 30,000 and 32,767
- Clients need to know the names or IPs of nodes, as well as whether nodes are healthy.
- ❓ I thought EndpointSlice controller kept track of healthy nodes?
- LoadBalancer: builds on top of both and integrates with cloud load balancers for extremely simple access from the internet. It’s basically a NodePort Service fronted by an HA load balancer with a publicly resolvable DNS name and a low port number.
Session Affinity allows you to control session stickiness - whether or not client connections always go to the same Pod.
External Traffic Policy dictates whether traffic hitting the Service will be load balanced across Pods on all cluster nodes or just Pods on the node traffic arrives on.
- ❓ I am struggling with getting LB to work on local kind cluster
- Turns out I had to turn on the “Expose services to LAN” setting in OrbStack Kubernetes settings.
Chapter 8: Gateway API
Gateway API is the Kubernetes-native solution for routing external traffic to your Kubernetes applications. Gateway API is protocol-aware, already supporting HTTP, HTTPS, and gRPC, with TCP and UDP in the pipeline. It lets you route traffic to multiple applications via a single load balancer and does header matching, weighted traffic splitting, and more.
- ❓ I don’t get why do we need the Gateway API if we already have the LoadBalancer Service type?
Every Gateway API implementation installs three things:
- API resources: These include Gateways for provisioning load balancers, HTTP Routes for adding routing logic, and GatewayClasses for connecting Gateways to controllers.
- Gateway Controller: Watches for new Gateways and deploys them.
- GatewayClass: Links Gateways to the right controller.
Gateways represent cluster entry points for external clients accessing Kubernetes applications.
- So maybe it’s different from Service LoadBalancers because it focuses on the cluster level, whereas Services provide stable DNS, IP, and port for a group of Pods???
- It almost seems like the L4 Cloud LB sits outside of the Kubernetes Cluster, routing traffic to Cluster Nodes.
Gateways represent load balancer instances. Every time you deploy a Gateway, you get a new load balancer.
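My attempt to sketch how the pieces connect (class, route, and Service names are hypothetical): the Gateway names a GatewayClass, which links it to a controller, and an HTTPRoute attaches to the Gateway and points at a backend Service.

```yaml
# Hypothetical Gateway: deploying this provisions a load balancer instance
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: example-class   # GatewayClass links this Gateway to a controller
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
# Hypothetical HTTPRoute: routing logic attached to the Gateway
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hello-route
spec:
  parentRefs:
    - name: my-gateway              # attach to the Gateway above
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /hello           # header/path matching happens here
      backendRefs:
        - name: hello-svc           # forward matches to this Service
          port: 80
```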
- This has probably been the hardest chapter for me so far. I’m still confused about how the Gateway Controller, GatewayClass, and Gateway all interact.
Chapter 9: Service discovery deep dive
Object names must be unique within a Namespace.
Pretty straightforward: Kubernetes runs a DNS server, coredns, on the control plane, and whenever a new Service gets added, the DNS server gets updated. That way, other Pods in the cluster can send requests to the Service name, which will in turn get translated to the Service IP address by the DNS server. It’s a bit more complicated than that, but it’s a good high-level mental model for now.
Chapter 10: Kubernetes storage
Core storage related API objects:
- PersistentVolumes (PV): Make external volumes available
- PersistentVolumeClaims (PVC): If a Pod wants to use a PV, it needs a PVC to grant it access to a PV
- StorageClasses (SC): Allow applications to create PVs and backend volumes dynamically
- Pod needs 50GB volume and requests it via a PersistentVolumeClaim
- The PVC asks the StorageClass to create a new PV and associated volume on the AWS backend
- The SC makes the call to the AWS backend via the AWS CSI plugin
- The CSI plugin creates the 50GB EBS volume on AWS
- The CSI plugin reports the creation of the external volume back to the SC
- The SC creates the PV and maps it to the EBS volume
- The Pod mounts the PV and uses it
- Must have a 1:1 mapping between external volumes and PVs
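The steps above can be sketched as two manifests (the class name and parameters are hypothetical, though `ebs.csi.aws.com` is the AWS EBS CSI driver mentioned in the flow):

```yaml
# Hypothetical StorageClass backed by the AWS EBS CSI driver
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com     # the CSI plugin that talks to the AWS backend
parameters:
  type: gp3
---
# PVC asking that StorageClass for a 50GB volume; a PV gets created dynamically
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 50Gi
```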
Chapter 11: ConfigMap and Secrets
- ConfigMaps (CM) let you store non-sensitive configuration data outside of Pods and inject it at run time.
- Can be injected into the container at runtime via any of:
- Environment Variables, when you create containers
- Arguments to the container’s startup command, when you create containers
- Files in a volume, automatically pushes updates to live containers
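Sketching the environment-variable flavor (names and values are made up):

```yaml
# Hypothetical ConfigMap holding non-sensitive config...
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: debug
---
# ...injected into a container as an environment variable at creation time
apiVersion: v1
kind: Pod
metadata:
  name: cm-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      env:
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: app-config    # which ConfigMap
              key: LOG_LEVEL      # which key inside it
```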
- Kubernetes does nothing to encrypt Secrets in the cluster store or while in flight. For that, you would typically use a Secrets Store CSI Driver for integrating with external vaults, and a service mesh that secures network traffic
Chapter 12: StatefulSets
- Used to deploy and manage stateful applications in Kubernetes
- StatefulSets can be compared to Deployments, with the following features that Deployments do not have:
- Predictable and persistent Pod names and DNS names
- Predictable and persistent volume bindings
- Predictable startup and shutdown order
- The first two form the Pod’s state and are referred to as the Pod’s sticky ID.
- Another key difference from Deployments is that StatefulSets create one Pod at a time and wait for it to be running before starting the next, whereas Deployments use a ReplicaSet controller to start all Pods at the same time, which can result in race conditions
- Because StatefulSet Pods already have persistent Pod names and DNS names, they don’t need a regular Service. Instead they use a special kind of Service called a headless Service, which has no ClusterIP. The primary purpose of headless Services is to create DNS SRV records. Clients query DNS for individual Pods and then send queries directly to those Pods instead of via the ClusterIP.
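A sketch of the pairing (names and image are hypothetical): `clusterIP: None` is what makes the Service headless, and the StatefulSet references it via `serviceName`. The Pods come up one at a time as db-0, db-1, db-2, each addressable through the headless Service’s DNS.

```yaml
# Hypothetical headless Service: no ClusterIP, just per-Pod DNS records
apiVersion: v1
kind: Service
metadata:
  name: db-headless
spec:
  clusterIP: None          # "None" is what makes it headless
  selector:
    app: db
  ports:
    - port: 5432
---
# Hypothetical StatefulSet: Pods get sticky names db-0, db-1, db-2
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db-headless # ties Pod DNS names to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
```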
Chapter 13: API security and RBAC
No notes.
Chapter 14: The Kubernetes API
- To extend the Kubernetes API you need two main things:
- Create your custom resource
- Write and deploy your custom controller
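The first step is just another manifest: a CustomResourceDefinition teaches the API server a new kind. A minimal sketch (the group and kind here are entirely made up; the controller is a separate program you’d deploy yourself):

```yaml
# Hypothetical CRD: after applying this, "kind: Widget" becomes a valid resource
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com      # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
    - name: v1
      served: true               # exposed via the API
      storage: true              # version persisted in the cluster store
      schema:
        openAPIV3Schema:
          type: object           # accept any object shape (no validation)
```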
Chapter 15: Threat modeling Kubernetes
No notes.
Chapter 16: Real-world Kubernetes security
No notes.
Review
Great accessible intro to Kubernetes.
CLI cheat sheet
kubectl
`kubectl config`
- `view` to view your kubeconfig
- `get-contexts` to see all contexts
- `use-context <context>` to set context
- `current-context` to see current context
`kubectl get`
- `nodes` to list all nodes in cluster
- `pods` to list all pods
- `<pod> -o yaml` provides the details of a specific pod
`kubectl explain`
- `pods --recursive` shows complete list of Pod attributes
- `<attribute e.g. pod.spec.restartPolicy>` to drill into specific pod attributes
`kubectl apply`
- `-f pod.yml` to deploy resources
`kubectl describe`
- `pod <pod>` to get overview of a pod
`kubectl logs`
- `<pod>` to get logs
- `--container <container>` to specify the container for multi-container pods (use the describe command to find container names)
`kubectl exec`
- `<pod> -- ps` example command
- `-it` to connect your local shell to the container’s shell
`kubectl edit` allows you to edit a live Pod object
`kubectl delete`
- `pod <pod> <pod>` deletes pods
kind (Kubernetes in Docker)
- `kind create cluster --name book --config kind-3node.yaml` creates a kind cluster
- `kind delete cluster --name book` deletes a kind cluster