Kubernetes - From A Developer's Point Of View

Ryan Zheng
10 min read · Feb 11, 2023

This tutorial ties together several Kubernetes concepts (Pod, Volume, Deployment, Service, Ingress) and practices their configurations locally on Minikube.

Kubernetes General

Kubernetes is said to be a container orchestration tool. But a container on Linux is just a process like any other process.

We can initially regard Kubernetes as simply a process manager, which will significantly simplify our understanding.

A process manager has three main requirements:

  • Start a process for the user
  • Stop a process for the user
  • Monitor a process, and restart it automatically when it goes down

Step 1: Start a process for the user

To start a process for a user, the user should tell us the start-up command.

Step 2: Stop a process for the user

To stop the process, we can just send a SIGTERM to the process given the process ID from step 1.

Step 3: Monitor a process

On Linux, given a process id, there is a corresponding entry /proc/<pid>. We can use the existence of this entry to check whether a process is alive. When the entry is gone, we run step 1 again.
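To make the idea concrete, here is a toy version of the three requirements in Python (a minimal sketch of the idea, not how kubelet is actually implemented):

import os
import signal
import subprocess
import time

def start(command):
    # Step 1: start a process from the user's start-up command
    return subprocess.Popen(command)

def stop(proc):
    # Step 2: ask the process to terminate
    proc.send_signal(signal.SIGTERM)

def is_alive(proc):
    # Step 3: /proc/<pid> exists only while the process is running;
    # poll() first so a finished child is reaped and its entry disappears
    proc.poll()
    return os.path.exists(f"/proc/{proc.pid}")

# Monitor loop: restart the process whenever it goes down
command = ["sleep", "5"]
proc = start(command)
while True:
    if not is_alive(proc):
        proc = start(command)
    time.sleep(1)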

Wrapping In A Server Application

We can wrap the above functionalities in a server application; let's call it kubelet.

As a server application, we need to expose an API so that users can use our server to achieve the previous functionalities.

Let's just use a REST API for the server. On the backend, we need to design the data structure that represents the process information users send to us. Let's call it PodTemplateSpec. (Don't ask me why it's called PodTemplateSpec 😜. All data structure names are from the Kubernetes GitHub source code: https://github.com/kubernetes/api/blob/master/core/v1/types.go)

PodTemplateSpec Data Structure

This data structure must specify the following information:

  1. What the executable is, and where to find it
  2. The startup command for this executable

We also need another data structure to represent a process that is created from the PodTemplateSpec. Let’s call it Pod.

A process has two attributes:

  1. pid
  2. status (running or not running)
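In Python, a first cut of these two data structures might look like this (a sketch; the field names are ours, not the real Kubernetes ones):

from dataclasses import dataclass

@dataclass
class PodTemplateSpec:
    executable_location: str  # what the executable is and where to find it
    startup_command: str      # how to start it

@dataclass
class Pod:
    pid: int
    status: str  # "running" or "not running"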

Starting Multiple Processes

Let's not limit ourselves to starting only one process per PodTemplateSpec. We give the user the opportunity to specify multiple executables.

Now we need one additional data structure to contain the "executableLocation" and "startup" attributes. Let's call it Container. The PodTemplateSpec will contain a list of containers.

We also need to adjust our Pod data structure. Let's create a class called ContainerStatus and move "pid" and "status" into it. Pod should contain a list of ContainerStatus.
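After this refactoring, the sketch becomes (again with hypothetical field names):

from dataclasses import dataclass
from typing import List

@dataclass
class Container:
    executable_location: str
    startup_command: str

@dataclass
class PodTemplateSpec:
    containers: List[Container]  # one entry per executable to start

@dataclass
class ContainerStatus:
    pid: int
    status: str  # "running" or "not running"

@dataclass
class Pod:
    container_statuses: List[ContainerStatus]  # one status per container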

Go Parallel

It's common to run multiple copies of the same process for parallel tasks. Let's go a step further and enable the user to create multiple copies of the same Pod.

We can abstract another data structure called Deployment. Each time users deploy, they specify how many replicas to run. So the Deployment class should contain "podTemplateSpec" and "replicas" attributes. We can also put the list of running Pods in the Deployment class to represent the running replicas.

Now the user just needs to send us a Deployment JSON, and we will create the replicated Pods for them.
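Continuing the sketch (reusing the PodTemplateSpec and Pod classes from above), with a hypothetical start_pod() helper that launches every container in the template:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Deployment:
    pod_template_spec: PodTemplateSpec
    replicas: int
    pods: List[Pod] = field(default_factory=list)  # the running replicas

def apply(deployment: Deployment):
    # create one Pod per requested replica from the same template
    for _ in range(deployment.replicas):
        deployment.pods.append(start_pod(deployment.pod_template_spec))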

Load Balancing

With multiple duplicated Pods, we also need some load balancing to delegate work to different Pods (that's what parallelism is all about). Load balancing is just an algorithm, such as round-robin or priority-queue, for dispatching work. We can provide different algorithms, so users can choose how work is scheduled across the duplicated Pods.

Let's just create another LoadBalancer data type, with the Deployment as one attribute.
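A round-robin version of that data type, reusing the Deployment sketch from above, could be as simple as:

class LoadBalancer:
    def __init__(self, deployment: Deployment):
        self.deployment = deployment
        self._next = 0

    def next_pod(self) -> Pod:
        # round-robin: hand each piece of work to the next Pod in turn
        pod = self.deployment.pods[self._next % len(self.deployment.pods)]
        self._next += 1
        return pod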

Coming To The Container World

We know that a container is just a normal process on Linux, similar to other processes. Given a Docker image, we can use the docker run command to start the container conveniently.

The docker run command itself just sends a REST request to the Docker daemon, and the daemon starts the container. There are examples of how to use the Docker REST API at https://docs.docker.com/engine/api/sdk/examples/.

We can adjust our imagined kubelet server implementation to call the Docker APIs to start containers for the user.

To start a Docker container, the image id must be given. Other parameters depend on how the container will be used.

Basically, we would need to support all the parameters Docker provides in its REST API. But to keep it simple, we will only support the image, ports, and volumes parameters for the container.
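For example, with the official Docker SDK for Python (which wraps that REST API), starting a container with just these three parameters looks roughly like this (the image name and paths are made up):

import docker

client = docker.from_env()  # talks to the local Docker daemon's REST API

container = client.containers.run(
    "myflask:latest",          # image
    ports={"5000/tcp": 5000},  # container port -> host port
    volumes={"/host/data": {"bind": "/tmp", "mode": "rw"}},  # host path -> container path
    detach=True,
)
print(container.id)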


Adjustment Of Data Structure Based On Docker

Based on the container parameters above, we need to adjust our Container class to include image, ports, and volumes attributes.

In Docker, a Volume is a separate object which is created first and then referenced. We should also create Volume and VolumeMapping classes.

The whole data structure for Deployment will then look like the following.
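Here is a hypothetical Python sketch of the full structure, using the names we introduced above (the real types in types.go are of course much richer):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Volume:
    name: str
    host_path: str  # created first, then referenced by name

@dataclass
class VolumeMapping:
    volume_name: str  # which Volume to mount
    mount_path: str   # where it appears inside the container

@dataclass
class Container:
    image: str
    ports: List[int]
    volume_mappings: List[VolumeMapping]

@dataclass
class PodTemplateSpec:
    containers: List[Container]
    volumes: List[Volume]

@dataclass
class ContainerStatus:
    pid: int
    status: str

@dataclass
class Pod:
    container_statuses: List[ContainerStatus]

@dataclass
class Deployment:
    pod_template_spec: PodTemplateSpec
    replicas: int
    pods: List[Pod] = field(default_factory=list)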

Does it look similar to the Kubernetes Deployment YAML configuration? The whole point of this deduction is to understand how Kubernetes works under the hood and make sense of the YAML configurations.

Kubernetes Practice On Minikube

The practice contains the following items.

  1. Set up the environment
  2. Build a simple Flask app Docker image with the commit hash as the image tag
  3. Create a PersistentVolume and a PersistentVolumeClaim using hostPath
  4. Create a Deployment configuration using the image and the volume claim
  5. Create a Service configuration using the previous Deployment
  6. Create an Ingress configuration using the myflask-service.com domain
  7. See the rolling update of Pods in action

Environment Setup

We use Minikube on the local system. Just follow the instructions at https://minikube.sigs.k8s.io/docs/start/.

Minikube itself is just another Docker container, which internally runs its own Docker daemon as well as the Kubernetes components.

We can see what’s inside the Minikube docker container.

minikube ssh
docker ps

Building The Flask App

The app writes the execution time of the heavy_compute() function to the file /tmp/test.txt, and returns the time as the response to the user.

from flask import Flask
import itertools
import math
import random
import time

random.seed(10)

def heavy_compute():
    list1 = [random.randint(1, 100) for i in range(10000)]
    list2 = [random.randint(1, 10) for i in range(100)]

    two_lists = [list1, list2]
    pairs = itertools.product(*two_lists)  # cartesian product of the two lists
    result = [math.factorial(x[0] + x[1]) for x in pairs]  # the heavy operation: factorial of each sum
    return result

app = Flask(__name__)

@app.route("/")
def entry():
    file_name = "/tmp/test.txt"
    try:
        with open(file_name, "w+") as f:
            start_time = time.time()
            heavy_compute()
            time_spent = time.time() - start_time
            response = f"Time spent in heavy_compute(): {time_spent} seconds"
            f.write(response)
        return response + "\n"
    except Exception:
        return "open file failed"

Dockerfile

FROM python:3.8.2
ENV FLASK_APP "/app.py"
RUN pip install --upgrade pip
RUN pip install Flask
ADD app.py app.py
ENTRYPOINT ["flask", "run", "--host=0.0.0.0"]

build.sh

Build the Docker image with the commit hash as the tag directly against Minikube's Docker daemon using the following commands. The eval $(minikube docker-env) line points the docker CLI at the daemon inside Minikube, so the image is available there without a push.

eval $(minikube docker-env)

COMMIT_HASH=$(git rev-parse HEAD)

# Build the Docker image with the Git commit hash as part of the tag
docker build -t myflask:$COMMIT_HASH . && docker image prune

Use docker image ls | grep flask to verify the image was created successfully.

Create A PersistentVolume And A PersistentVolumeClaim

First, mount the data folder from localhost into the Minikube Docker container.

minikube mount $(pwd)/app/data:/flaskapp

Now create a PersistentVolume and a PersistentVolumeClaim using hostPath. Here, the hostPath is /flaskapp, which exists inside Minikube.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: minikube-hostvolume
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /flaskapp
  persistentVolumeReclaimPolicy: Retain

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minikube-hostvolume-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

The important thing is storageClassName: manual, which must be set on both the PersistentVolume and the PersistentVolumeClaim. Refer to https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/ for why.

Use kubectl get pv and kubectl get pvc to verify the PersistentVolume and PersistentVolumeClaim were created correctly.

Create A Deployment YAML Configuration

We will use the built image myflask:34fd7bea72ab062aceed5efbb35149986861fed6 and the PersistentVolumeClaim minikube-hostvolume-claim in the configuration of the PodTemplateSpec (remember the data structure we created previously?).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myflask
  labels:
    app: myflask
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myflask
  template:
    metadata:
      labels:
        app: myflask
    spec:
      containers:
        - name: myflask
          image: myflask:34fd7bea72ab062aceed5efbb35149986861fed6
          # the image only exists in Minikube's local daemon, so don't force a registry pull
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5000
          volumeMounts:
            - name: minikube-hostvolume
              mountPath: /tmp
      volumes:
        - name: minikube-hostvolume
          persistentVolumeClaim:
            claimName: minikube-hostvolume-claim
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1

Use kubectl get deployments and kubectl get pods to verify the Deployment and Pods were created successfully.

Create A Service Configuration

As mentioned earlier, when there are multiple duplicated Pods, some load balancing is needed. A Service is the object that does this load balancing across Pod IP addresses, as each Pod has its own IP.

apiVersion: v1
kind: Service
metadata:
  name: myflask-service
spec:
  type: ClusterIP
  ports:
    - name: 5000-5000
      port: 5000
      protocol: TCP
      targetPort: 5000
  selector:
    app: myflask
  sessionAffinity: None

The spec.type attribute above can take ClusterIP, NodePort, or LoadBalancer. There are many articles explaining the differences between them.

We use ClusterIP here. With ClusterIP, our Service gets assigned one IP address, which we can use to access the Service from inside the node.

Run kubectl get services myflask-service to verify the Service was created successfully.

We can access the myflask app within Minikube using the curl command.

Run minikube ssh to enter Minikube, then run curl 10.98.146.128:5000 (substitute the ClusterIP reported for your Service). We can see the output of the app.

The demo app also writes the response to /tmp/test.txt in the container, which is mounted from Minikube's /flaskapp, which is in turn mounted from $(pwd)/app/data on localhost. We can verify that test.txt is created on localhost.

Create an Ingress Configuration

In production, we will use one domain address for our website, and users access the website through that domain. Traditionally, Nginx is the most widely used load balancer facing the public network, with the servers deployed behind it.

Kubernetes created the Ingress Controller concept for this. If you enable Ingress in Minikube using minikube addons enable ingress, Minikube will automatically start an ingress controller container with Nginx functionality.

(In the docker ps output inside Minikube, the nginx ingress controller container exposes ports 80 and 443.)
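We can also verify the controller from kubectl (recent Minikube versions run it in the ingress-nginx namespace):

kubectl get pods -n ingress-nginx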

The Ingress configuration looks as follows:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flask-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/access-logs: 'true'
spec:
  rules:
    - host: myflask-service.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myflask-service
                port:
                  number: 5000

It's easy to see that the Ingress configuration is simply configuring route-forwarding rules on Nginx.

With the above Ingress configuration, we enabled Nginx to expose the myflask-service.com domain. But this domain is only accessible inside Minikube.

We just need to add a mapping between the Minikube IP address and the domain in the host's /etc/hosts file.

Run echo "$(minikube ip) myflask-service.com" | sudo tee -a /etc/hosts (a plain sudo echo ... >> /etc/hosts would fail, because the redirection happens in the unprivileged shell). Now open myflask-service.com in the browser.

Great! It works like a charm!

See Rolling Update In Action

The idea behind rolling updates: when we change the source code, we build a new image, update the image tag in the Deployment YAML file, and reapply the deployment. Kubernetes will create new Pods with new IP addresses and replace the existing Pods one by one.

Update app.py to add one more line to the response, then run build.sh again to rebuild the image.

Update the deployment.yaml to use the new image tag.

Run kubectl apply -f deployment.yaml to reapply the deployment.
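Besides a UI tool, the rollout can also be watched from the command line:

kubectl rollout status deployment/myflask
kubectl get pods --watch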

This process can be seen in KubeLens, a good UI tool for Kubernetes.

There we can see that one Pod is terminating while two new Pods are in a pending state.

That's all for the fundamentals of Kubernetes. The complete source code for this article can be found in the GitHub repo: https://github.com/ryan-zheng-teki/kubernetes-tutorial/tree/master/app


Ryan Zheng

I am a software developer who is keen to know how things work