The Kubernetes Security Guide That Saved Production Clusters From Being Pwned

Hook

In 2018, Tesla's Kubernetes cluster was compromised through an unsecured dashboard, and attackers used the infrastructure to mine cryptocurrency. Most breaches of this kind exploit the same preventable misconfigurations.

Context

Kubernetes security is notoriously complex because it is not a single surface. It is a sprawling attack landscape spanning container runtimes, network policies, API servers, etcd databases, and the underlying OS. When Kubernetes emerged as the container orchestration standard around 2015-2017, most teams were focused on getting workloads running, not on the threat models required for multi-tenant production clusters. Early adopters learned security lessons the hard way: exposed dashboards, overly permissive RBAC policies, and kubelet APIs reachable from the internet became common attack vectors.

The freach/kubernetes-security-best-practice repository emerged from this collective pain, synthesizing real-world security incidents into an actionable guide. Unlike vendor-specific documentation or academic security papers, this community-maintained resource prioritizes recommendations by severity—critical issues that grant cluster-admin access down to low-priority hardening measures. With over 2,700 stars, it's become a go-to checklist for platform teams hardening production clusters, especially those who've inherited under-secured Kubernetes environments and need to triage fixes systematically.

Technical Insight

The guide's architecture follows a defense-in-depth model, organizing security concerns into layers: operating system hardening, network topology, Kubernetes control plane configuration, and workload isolation. What makes it particularly valuable is the severity-based taxonomy—each recommendation is marked with emoji indicators (🔴 Critical, 🟠 High, 🟡 Medium, 🟢 Low) so teams can prioritize limited security engineering time effectively.

Start with the critical recommendations, which focus on attack vectors that provide immediate cluster compromise. One of the most dangerous misconfigurations involves the kubelet's read-write API on port 10250. If this port is accessible without proper authentication, attackers can execute arbitrary commands in any pod on that node:

# Example of kubelet exploitation (for educational purposes)
# If port 10250 is exposed and anonymous auth is enabled:
curl -k "https://<node-ip>:10250/runningpods/"
# Returns all running pods on the node

curl -k -XPOST "https://<node-ip>:10250/run/<namespace>/<pod>/<container>" \
  -d "cmd=cat /var/run/secrets/kubernetes.io/serviceaccount/token"
# Runs a command inside the container; here it reads the service account token

The guide emphasizes disabling anonymous authentication and ensuring the kubelet only accepts authenticated requests. In your kubelet configuration, you should set:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
    cacheTTL: 2m0s
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook

Another critical area the guide addresses is RBAC configuration. Many clusters grant overly broad permissions, especially with the default service account. The guide recommends applying the principle of least privilege rigorously. Here's an example of a properly scoped role versus a dangerous one:

# ❌ DANGEROUS: Overly permissive role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: developer-role-bad
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]

---
# ✅ BETTER: Scoped to specific needs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-role-good
  namespace: my-app
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "update"]
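A Role grants nothing until it is bound to a subject. As a minimal sketch, the manifest below binds developer-role-good to a group and, separately, disables automatic token mounting for the default service account in the same namespace, which is one common way to address the default service account concern. The group name my-app-developers and the binding name are hypothetical, chosen only for illustration; they do not come from the guide.

```yaml
# Bind the scoped Role to a group. "my-app-developers" is a hypothetical
# group name; substitute whatever your identity provider supplies.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-role-good-binding
  namespace: my-app
subjects:
- kind: Group
  name: my-app-developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role-good
  apiGroup: rbac.authorization.k8s.io
---
# Stop pods that use the default service account from receiving an API
# token automatically. Pods that need API access should use a dedicated
# service account with its own narrowly scoped Role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: my-app
automountServiceAccountToken: false
```

Using a RoleBinding rather than a ClusterRoleBinding keeps the grant confined to the my-app namespace, so a mistake in the rules cannot leak permissions cluster-wide.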

The guide also integrates tooling recommendations, particularly kube-bench from Aqua Security, which automates CIS Kubernetes Benchmark checks. This is where the documentation transcends being just a reading list—it points you toward automation that can validate your hardening work:

# Run kube-bench to validate your cluster against CIS benchmarks
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs -f job/kube-bench

# Example output highlights specific failures (check IDs and statuses vary by benchmark version):
# [FAIL] 1.2.19 Ensure that the --audit-log-path argument is set
# [FAIL] 3.2.1 Ensure that a minimal audit policy is created
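Both findings in that sample output concern API server audit logging. A minimal sketch of an audit policy that would address the second one might look like the following; it assumes a self-managed control plane where you can pass the kube-apiserver flags --audit-policy-file and --audit-log-path, and where the policy file is stored is up to you.

```yaml
# Minimal audit policy: record request metadata (user, verb, resource,
# timestamp) for every request, without request or response bodies.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```

Logging at the Metadata level keeps Secret contents out of the audit log while still recording who did what. On managed services, audit logging is usually configured through the provider rather than through these flags.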

One aspect that sets this guide apart is its coverage of network-level security beyond Kubernetes abstractions. It emphasizes firewall rules and network segmentation before diving into NetworkPolicies. For cloud-managed Kubernetes, this means configuring security groups to restrict control plane access. For on-premises clusters, it means proper VLAN segmentation and firewall ACLs. The guide correctly identifies that Kubernetes-native security controls are meaningless if the underlying network allows direct access to etcd or API server ports from untrusted zones.
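Once the outer network is locked down, NetworkPolicies can restrict traffic inside the cluster. A common starting point, sketched here for the my-app namespace used in the RBAC example, is a default-deny policy. It only takes effect if the cluster's network plugin enforces NetworkPolicy, and the policy name is an arbitrary choice.

```yaml
# Deny all ingress and egress for every pod in the namespace.
# An empty podSelector matches all pods; traffic must then be allowed
# explicitly by additional, narrower policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-app
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
```

Denying egress also blocks DNS lookups, so workloads typically need a follow-up policy that allows traffic to the cluster DNS service before they function again.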

The defense-in-depth approach extends to pod-level security with detailed recommendations on PodSecurityPolicies (deprecated in Kubernetes 1.21 and removed in 1.25 in favor of Pod Security Admission, which enforces the Pod Security Standards) and securityContext settings. The guide recommends running containers as non-root users, dropping unnecessary capabilities, and making root filesystems read-only:

apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}

This configuration makes common container breakout techniques harder by limiting what processes inside the container can do, even if they're compromised.
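On clusters where PodSecurityPolicy is no longer available, the built-in Pod Security Admission controller can enforce comparable settings per namespace through labels. The sketch below assumes a namespace named my-app and applies the restricted profile; a pod hardened along the lines shown above should generally be admitted under it, but verify this against your own workloads.

```yaml
# Enforce the "restricted" Pod Security Standard in this namespace.
# "warn" and "audit" surface violations to clients and in audit logs.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Starting with only the warn and audit labels, then adding enforce once the warnings are resolved, is a lower-risk way to roll this out on an existing namespace.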

Gotcha

The biggest limitation is that this repository is purely documentation—there are no scripts, Helm charts, or automation tools to apply these recommendations. You're responsible for translating the advice into your specific environment, which can be substantial work. If you're managing dozens of clusters across multiple cloud providers, manually implementing each recommendation becomes a multi-sprint project without additional tooling layers.

Another critical issue is version drift. Kubernetes moves fast, and some recommendations reference deprecated or removed features. For instance, the guide mentions insecure-port flags that were deprecated in Kubernetes 1.10, and PodSecurityPolicy, which was removed in 1.25 in favor of Pod Security Admission and the Pod Security Standards. While the core principles remain valid, you'll need to cross-reference current Kubernetes documentation to avoid implementing deprecated patterns. The last major update to the repository was several years ago, so treat it as a solid foundation rather than gospel, and validate each recommendation against the documentation for your Kubernetes version. This is especially important for managed Kubernetes services like GKE, EKS, and AKS, where some control plane configuration is abstracted away and certain recommendations are not applicable or configurable.

Verdict

Use if: You're inheriting an under-secured Kubernetes cluster and need a prioritized checklist to systematically reduce risk, you're new to K8s security and want a practitioner-focused overview beyond official docs, or you're conducting security audits and need a reference framework for common misconfigurations. This guide excels as a mental model for defense-in-depth in Kubernetes environments and pairs well with automated scanning tools.

Skip if: You need ready-to-deploy security automation rather than guidance; look at tools like Falco, OPA Gatekeeper, or managed security solutions instead. Also skip if you're exclusively on managed Kubernetes and need provider-specific hardening guides, since AWS, Google, and Azure each publish their own security best practices that account for their managed control planes. Finally, if you're running Kubernetes 1.25+ and need current guidance on the Pod Security Standards or other recent features, supplement this guide heavily with official documentation.
