K8s Best Security Practices — TryHackMe — Writeup
Best Kubernetes security practices at a cluster level.
Task 1 Introduction
Learning Objectives
- Understand the importance of implementing best security practices in Kubernetes
- Understand ServiceAccounts in Kubernetes, their function, and the best security practices surrounding them, and be able to independently define your own
- Understand how to independently define roles and RoleBindings to implement RBAC on a cluster, and why this is needed
- Understand API requests in Kubernetes
Task 2 ServiceAccounts
What’s a (Kubernetes) ServiceAccount?
One of the most important security practices in Kubernetes is the efficient and secure implementation of access control. ServiceAccounts are one piece of the access control puzzle you’ll need to understand in order to implement it. “Service account” is a general term you may be familiar with from other cloud technologies. In this task, we will define a service account in the context of Kubernetes: the ServiceAccount object.
Service accounts can be thought of as digital identities or non-human accounts. In Kubernetes, this identity is used in a security context to associate an identity with a specific process. In other words, Kubernetes system components, application pods, or other entities, both inside and outside of the cluster, can use ServiceAccount credentials to identify as this ServiceAccount. From a security perspective, this means API authentication can take place or, as just mentioned, identity/access control can be implemented using these ServiceAccounts.
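To make this concrete, a ServiceAccount can be defined declaratively and attached to a pod through its spec. Here is a minimal sketch; the names (app-reader, demo-pod, the image tag) are purely illustrative:

```yaml
# Hypothetical ServiceAccount definition
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader        # illustrative name
  namespace: default
---
# A pod that runs with the identity of that ServiceAccount
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
  namespace: default
spec:
  serviceAccountName: app-reader   # processes in this pod authenticate as app-reader
  containers:
    - name: app
      image: nginx:latest
```

Any API request the pod makes will now carry the app-reader identity, which is what later allows access to be restricted per identity.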
ServiceAccounts vs Users
Here would be a good point to emphasise the “non-human account” aspect of a ServiceAccount and clear up any confusion regarding human users being authenticated/authorised into a cluster. Here is some information regarding ServiceAccounts and Users.
Task 3 RBAC
Why RBAC?
Part of best security practices is preparing for worst-case scenarios. You have securely built your Kubernetes cluster and followed best practices to harden that cluster, but you need to be prepared for scenarios where attackers gain access to the cluster. More specifically, the attacker will gain access to either a pod/application or a user; in other words, they are authenticated into the cluster. Both will have associated permission and be authorised to perform certain actions. It is then the job of a DevSecOps engineer to ensure that these permissions are configured with least privilege in mind.
For example, a pod which checks the running status of pods in the “x” namespace should ONLY have the permissions required to perform that action and no others. This way, if an attacker were to gain access to this resource, they would only be able to perform the authorised actions, which in this case would give them little access. We can implement this using RBAC (Role-Based Access Control).
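A quick way to verify that permissions really are least-privilege is kubectl’s built-in authorisation check. A sketch against a live cluster (the namespace and ServiceAccount name are illustrative):

```shell
# Can this ServiceAccount list pods in namespace "x"? (should be yes)
kubectl auth can-i list pods --namespace x \
  --as=system:serviceaccount:x:pod-checker

# Can it delete pods? (with least privilege configured, should be no)
kubectl auth can-i delete pods --namespace x \
  --as=system:serviceaccount:x:pod-checker
```

Running these checks after assigning a Role is a cheap way to catch over-permissive bindings before an attacker does.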
Task 4 API Requests in Kubernetes
Accessing the Kubernetes API
Accessing the Kubernetes cluster is done through the use of the Kubernetes API. This is our communication channel to the cluster, and there are different methods of accessing the Kubernetes API itself:
Kubectl — We’ve covered this before. Kubectl is Kubernetes’ command line tool. When accessing a cluster, you need to know two things: the cluster location and credentials. When configured, kubectl handles the locating and authenticating to the API server — making it easy to perform actions on the cluster using various kubectl commands. This is the most common method of accessing a Kubernetes cluster; however, if you wish to directly access the cluster yourself (using curl, wget or a browser), there are a couple of ways to do this, which will be discussed next.
Proxy — Kubectl can be run in “proxy mode” using the kubectl proxy command. This command will start a proxy server on port 8080 that will forward requests to the Kubernetes API server. This is the recommended method for directly accessing the Kubernetes API because it uses the stored API server location and verifies the identity of the API server using a self-signed certificate, meaning a man-in-the-middle (MITM) attack is not possible.
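As a sketch of the proxy method against a live cluster (the port matches this room’s setup; kubectl proxy defaults to 8001 if none is given):

```shell
# Start the proxy in one terminal; it listens locally and forwards
# requests to the API server, handling authentication for you
kubectl proxy --port=8080

# In another terminal, query the API through the proxy --
# no credentials needed on the curl side
curl http://localhost:8080/api/
```

Because kubectl already knows the API server’s location and verifies its identity, the curl request never handles credentials or certificates itself.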
Auth Token — This method accesses the Kubernetes API directly by providing the HTTP client with the cluster location and credentials. For this reason, it is not recommended (as it would be vulnerable to a MITM attack).
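A hedged sketch of the token method against a live cluster — pod-checker is illustrative, and `kubectl create token` requires kubectl 1.24 or later:

```shell
# Grab the API server address from the current kubeconfig
APISERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')

# Request a short-lived token for a ServiceAccount
TOKEN=$(kubectl create token pod-checker -n test-chambers)

# Query the API directly; --insecure skips server certificate
# verification, which is exactly what exposes this method to MITM
curl --insecure "$APISERVER/api" --header "Authorization: Bearer $TOKEN"
```

This is useful for understanding what kubectl does under the hood, but not something to use routinely.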
Programmatic — Kubernetes supports programmatic access to the API using client libraries. It officially supports client libraries for languages such as Python, Java, JavaScript, and Go, alongside some community-maintained libraries (although these do not have official Kubernetes support).
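A minimal sketch using the officially supported Python client library (installed with `pip install kubernetes`), assuming a valid kubeconfig on the local machine and a reachable cluster:

```python
from kubernetes import client, config

# Load cluster location and credentials from ~/.kube/config,
# the same file kubectl uses
config.load_kube_config()

v1 = client.CoreV1Api()
# Equivalent of "kubectl get pods -n default"
for pod in v1.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.status.phase)
```

The library goes through the same authentication and authorisation stages as kubectl, so RBAC restrictions apply to programmatic access just the same.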
Task 5 More Best Security Practices
Image Scanning
Let’s start by taking things back a step. Why do we want Kubernetes in the first place? To deploy our apps, right? Well, those apps are made using images. The app image will contain everything needed for the app to run on the container, but what else does it contain? Image scanning is one way to find out! Images are generally made up of a series of commands (or layers) that create directories/users, copy/run scripts, and install tools/libraries. All of these have security implications:
- The tools, libraries and code we use in our app may come from an untrusted source. This can lead to backdoors in your application code (intentional or otherwise), giving an attacker access.
- Official container images are often used as a base image (and built upon) when making an application. For example, if you are building a web application, you may use the Nginx base image, or if you’re building a database application, you may use the Postgres, MongoDB, or Redis base images. These official container images are often considered safe because…well, they’re “official”. However, these official container images can contain vulnerabilities, and they frequently do. Of course, non-official images and outdated images can pose security concerns as well.
- Because of the above-mentioned security implications, it is a best security practice to minimise attack surface in your image by using only the libraries an application NEEDS and using the most lightweight base image for its intended purpose.
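One way to put image scanning into practice is an open-source scanner such as Trivy; a sketch, with illustrative image names:

```shell
# Scan an image for known vulnerabilities before deploying it
trivy image nginx:latest

# In a CI/CD pipeline, fail the build if HIGH or CRITICAL
# vulnerabilities are found in your application image
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:1.0
```

Wiring a scan like this into the pipeline means a vulnerable image is caught before it ever reaches the cluster.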
Upgrading the Cluster
Keeping up to date is a general best security practice, not one specific to Kubernetes. Running an old version of Kubernetes comes with the same security concerns as running any outdated software: if an attacker finds a cluster running an older version, they may know a few vulnerabilities that have since been patched. And, of course, updating a Kubernetes cluster comes with the same stable-vs-latest considerations. Stable releases have been thoroughly and fully tested (and so are generally favoured from a security perspective), while the latest version (which will have been internally tested) will include patches for issues in the stable version. Ultimately, as long as you are at least using the stable version, you are following best k8s security practices.
Note: When choosing the version to upgrade to (the latest stable version, for example), you cannot skip minor versions. For example, you cannot upgrade your cluster from version 1.21 -> 1.23 as you would be skipping minor version 1.22. So if the latest stable version were 1.23, you would have to do 1.21 -> 1.22, then 1.22 -> 1.23.
Upgrading a cluster usually falls to a DevSecOps engineer, or at the very least, a DevSecOps engineer will work alongside a DevOps engineer to ensure this is done. Due to the nature of versioning, there are constantly new versions being released, and keeping the cluster up to date can become quite a task. Because of this, teams sometimes overlook the upgrade process and fall behind on a few versions. Following best k8s security practices, your team will regularly assign you tickets to get the cluster upgraded. This will usually have to be done in the following order:
- Upgrade the (primary) control plane node (upgrading components in this order: etcd, kube-apiserver, kube-controller-manager, kube-scheduler, cloud-controller-manager)
- If the cluster has additional control plane nodes, upgrade those
- Upgrade worker nodes
- Upgrade clients like kubectl
- Make changes to manifests and resources based on changes in the new version (for example, if some of your resources are using a now deprecated feature, update them to use the replacement)
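On a kubeadm-managed cluster, the order above roughly maps to commands like the following. This is a sketch only — the version number and node name are illustrative, and the kubeadm/kubelet packages themselves must also be upgraded on each node:

```shell
# On the primary control plane node
kubeadm upgrade plan                 # shows available target versions
kubeadm upgrade apply v1.23.0        # upgrades the control plane components

# For each worker node: drain it, upgrade it, then bring it back
kubectl drain worker-1 --ignore-daemonsets
kubeadm upgrade node                 # run on the worker node itself
kubectl uncordon worker-1
```

Draining before upgrading is what lets workloads keep running: pods are rescheduled onto other nodes while the drained node is worked on.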
Here are some things to consider when upgrading the cluster:
- More than one worker node can be upgraded at a time.
- When upgrading a Kubernetes cluster, only internal components are touched. In other words, your apps and workloads should remain untouched and unharmed. However, it is always a good practice to take a backup of important components before the upgrade, just in case.
- Nodes need to be drained before they are upgraded; this way, pods are evicted from the node and rescheduled elsewhere in the cluster, so workloads are unaffected.
- Upgrading a cluster often comes with downtime. Depending on the importance of workloads being run in the cluster, this may lead to downtime for services (internal or external). For this reason, a downtime notification with a maintenance window attached should be sent to the affected parties.
Running Containers as Non-root
Unless there is an operational need for it (and there almost never is), a container should be run as a non-root user. As previously covered, if your image has vulnerabilities and an attacker gains access to the container, you’re making their life a lot easier if that container has root privileges.
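Running as non-root can be enforced in the pod spec itself via a securityContext; a minimal sketch, where the pod name, image, and user/group IDs are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-root-demo
spec:
  securityContext:
    runAsNonRoot: true   # kubelet refuses to start a container that would run as root
    runAsUser: 1000      # illustrative non-root UID
    runAsGroup: 3000     # illustrative GID
  containers:
    - name: app
      image: myapp:1.0   # illustrative image
```

With runAsNonRoot set, even an image that defaults to root will be blocked at startup rather than silently granted root inside the container.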
Secure etcd
As you may remember, etcd is Kubernetes’ key/value store where cluster data is kept. You hear “cluster data is kept”, an attacker hears “goldmine”, and they’re right. Changes made in the cluster (such as a pod being deleted) are then reflected in the etcd store. However, it goes both ways; if a change were made in the etcd (such as a new pod entry), that change would be reflected in the cluster. Imagine now an attacker gaining access to the etcd. Suddenly, they’re not so concerned with finding ways to gain API access to the cluster, because they’ve found a way to bypass the API altogether. An attacker gaining full access to the etcd is one of the worst things that could happen in a Kubernetes cluster, as it is essentially unlimited access. With that in mind, here are some practices for securing etcd:
- Etcd can be isolated from the cluster and run separately, putting distance between the cluster and the etcd store should the cluster be breached.
- Put etcd behind a firewall, ensuring it can only be accessed via the API (going through the appropriate API request stages).
- Encrypt all etcd data. This can be done using either built-in encryption-at-rest mechanisms or an encryption provider. This way, if an attacker were able to access the etcd, they wouldn’t be able to decipher the contents.
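Encryption at rest is configured on the kube-apiserver through an EncryptionConfiguration file; a sketch of one encrypting Secrets with AES-CBC (the key value is a placeholder — generate a real one with something like `head -c 32 /dev/urandom | base64`):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets                 # encrypt Secret objects stored in etcd
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <BASE64-ENCODED-32-BYTE-KEY>   # placeholder, never commit a real key
      - identity: {}            # fallback so existing unencrypted data can still be read
```

The file is passed to the kube-apiserver via its --encryption-provider-config flag; provider order matters, as the first provider is used for writes.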
Task 6 Practical
1) Create a ServiceAccount called “pod-checker”
ubuntu@tryhackme:~$ kubectl create serviceaccount pod-checker --namespace test-chambers
serviceaccount/pod-checker created
2) Create a role named “pod-checker-role” which can “get” and “list” “pod” resources only
ubuntu@tryhackme:~$ nano pod-checker-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-checker-role
  namespace: test-chambers
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
ubuntu@tryhackme:~$ kubectl apply -f pod-checker-role.yaml
role.rbac.authorization.k8s.io/pod-checker-role created
3) Create a RoleBinding which binds the “pod-checker-role” to the “pod-checker” ServiceAccount
Copy the following and paste it into the file:
ubuntu@tryhackme:~$ nano pod-checker-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-checker-role-binding
  namespace: test-chambers
subjects:
  - kind: ServiceAccount
    name: pod-checker
    namespace: test-chambers
roleRef:
  kind: Role
  name: pod-checker-role
  apiGroup: rbac.authorization.k8s.io
ubuntu@tryhackme:~$ kubectl apply -f pod-checker-role-binding.yaml
rolebinding.rbac.authorization.k8s.io/pod-checker-role-binding created
4) Delete the pod-status-checker pod
ubuntu@tryhackme:~$ kubectl delete pod pod-status-checker -n test-chambers
pod "pod-status-checker" deleted
5) Make changes to the ~/Documents/pod-config/pod-checker.yaml config so the pod uses the pod-checker ServiceAccount instead of pod-admin, then apply this config
Change pod-admin to pod-checker
ubuntu@tryhackme:~/Documents$ cd pod-config/
ubuntu@tryhackme:~/Documents/pod-config$ ls
pod-checker.yaml
ubuntu@tryhackme:~/Documents/pod-config$ cp pod-checker.yaml ../.
ubuntu@tryhackme:~/Documents/pod-config$ cd ..
ubuntu@tryhackme:~/Documents$ ls
pod-checker.yaml pod-config test
ubuntu@tryhackme:~/Documents$ nano pod-checker.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-status-checker
  namespace: test-chambers
spec:
  serviceAccountName: pod-checker
  containers:
    - name: status-checker
      image: bitnami/kubectl:latest
      imagePullPolicy: Never
      command:
        - "/bin/sh"
        - "-c"
        - |
          while true; do
            kubectl get pods -n test-chambers
            sleep 300
          done
  restartPolicy: Always
ubuntu@tryhackme:~/Documents$ kubectl apply -f pod-checker.yaml
pod/pod-status-checker created
ubuntu@tryhackme:~/Documents$ kubectl get pods -n test-chambers
NAME READY STATUS RESTARTS AGE
pod-status-checker 1/1 Running 0 113s
tc-deployment-8566d446c4-f2x5q 1/1 Running 1 (116m ago) 68d
tc-deployment-8566d446c4-fpz65 1/1 Running 1 (116m ago) 68d
tc-deployment-8566d446c4-nmspx 1/1 Running 1 (116m ago) 68d
Now, get the role description and base64-encode the output:
ubuntu@tryhackme:~/Documents$ kubectl describe role pod-checker-role -n test-chambers
Paste the output into a base64 encoder; the encoded output is the answer!
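The encoding can also be done locally instead of in a web encoder; a sketch using GNU coreutils, where the sample string stands in for the real kubectl output:

```shell
# Encode a sample string the same way you would encode the role description;
# -w 0 disables line wrapping so the answer is one continuous string
printf 'Name: pod-checker-role' | base64 -w 0
echo

# In practice, pipe the kubectl output straight in:
# kubectl describe role pod-checker-role -n test-chambers | base64 -w 0
```

Piping directly avoids copy-paste mistakes such as trailing newlines changing the encoded result.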
Task 7 Conclusion
By going through this lesson, you’ve continued your DevSecOps journey, picking up some Kubernetes best security practices along the way. Slowly but surely, your knowledge of the tool and cluster architecture will start supporting your knowledge of security practices, so you know not just what to implement but why. Before ending this room, let’s go over what we’ve covered:
- Kubernetes ServiceAccounts can be used to implement access control. In a security context, these ServiceAccounts tie an application/pod/service to a digital identity so access can be restricted based on this identity.
- As mentioned above, RBAC is one method of restricting permissions based on identity. RBAC should be configured with the least privilege principle in mind across the cluster, considering the use case of each user/application before assigning a Role or ClusterRole.
- When an API Request is made, it goes through the following request stages before being carried out: Authentication -> Authorisation -> Admission Controllers.
- Vulnerabilities in an image your application is running can lead to your Kubernetes cluster being breached. Image Scanning should be introduced into the CI/CD pipeline to avoid this.
- Upgrading a Kubernetes cluster ensures the cluster is running with the latest patches and features.
- To limit an attacker’s ability to escalate privileges should they gain access to a container, containers should run with no root privileges.
- An attacker gaining access to the etcd would be a catastrophe. For this reason, the etcd should be isolated from the cluster, behind a firewall, and encrypted.
With that, this room is complete! With this knowledge, your brain is now a more valuable resource to companies like Kubernetes Laboratories, helping safeguard them against misconfigurations and attackers.