Why I Built My Own Kubernetes Cluster

5 min read
infrastructure kubernetes

[Image: my cluster]

I’ve been working as a platform engineer at Stack Overflow since 2022, designing platform components, writing documentation, and making developers’ lives easier. Recently, I moved to the infrastructure platform team. Now I’m in unfamiliar territory, trying to abstract away multi-cloud infrastructure details from other platform engineering teams.

For years, I worked with Kubernetes professionally, but mostly at a level that hid the infrastructure details. CIDR planning? Not my thing. DNS zones? Already done. CNI? Things were already talking to each other. I could implement platform components, deploy applications, and troubleshoot pods. But now I’m working one level below, in a landscape where I don’t have much experience.

I needed to learn infrastructure deeply. So I built my own Kubernetes cluster from scratch on Hetzner Cloud. Cost? About £50/month for a production-ready setup with high availability, monitoring, and GitOps. That’s less than just the control plane on AWS, GCP, or Azure.

Why I Needed My Own Cluster

I’ve been building distributed systems since 2015 when I started at Hepsiburada in Istanbul, one of the largest e-commerce sites in Europe. I led the team that introduced the marketplace platform there and brought containers to the company. Back then, Kubernetes was still in alpha, and I was amazed by it. A decade of this kind of work changes how you approach problems.

Whenever I start a side project, the architecture immediately becomes distributed. I hate it when I need a background job processor or some functionality the web framework doesn’t offer. The monolith I promise myself won’t become a distributed system? It always does. I don’t think this is overengineering anymore. It’s just how I think now.

My Kubernetes and Distributed Systems Experience Is a Curse

Take my current side project, snapbyte.dev. It’s a personal tech digest service that collects links from HN, /r/programming, and lobste.rs, extracts content, categorizes it, generates summaries, and builds personalized digests based on user configuration. It started as a single Phoenix app but quickly split into multiple services once I hit limitations with Elixir’s content extraction libraries.

For years at work, I’ve been shipping with Kubernetes. It’s my default. When I think about deploying, I think in pods and services. But for personal projects, there’s always been friction: I want to use the tools I’m comfortable with, but I can’t justify £100+ monthly for managed Kubernetes on a side project.

Then I started seeing Hetzner posts on Hacker News. This one in particular caught my attention. People were running production Kubernetes clusters on Hetzner for £100 per month. The same setup on AWS, GCP, or Azure would cost 5-10x that amount.

Suddenly, the math worked. Affordable Kubernetes plus infrastructure learning for my day job.

What I Wanted to Learn

I needed to fill specific knowledge gaps for my day job on the infrastructure platform team.

Network segmentation and CIDR planning: How do you actually design a VPC? What CIDR blocks should you allocate for nodes vs. pods vs. services? I’d seen /16 and /24 notations for years but never had to think about planning subnet ranges.

Certificate management: How does cert-manager work with Let’s Encrypt? What’s a DNS-01 challenge vs. HTTP-01? How do you automate wildcard certificates?
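
To make that concrete: DNS-01 is the challenge type Let’s Encrypt requires for wildcard certificates, which HTTP-01 can’t issue. Here’s a minimal sketch of the kind of setup I mean. It isn’t my exact config: the email, domain, and secret names are placeholders, and it assumes the DNS zone lives at Cloudflare.

```yaml
# Sketch: Let's Encrypt issuer with a DNS-01 solver against Cloudflare,
# plus a wildcard certificate. Names and domain are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com                 # placeholder
    privateKeySecretRef:
      name: letsencrypt-dns-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:               # API token stored as a Secret beforehand
              name: cloudflare-api-token
              key: api-token
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example
  namespace: default
spec:
  secretName: wildcard-example-tls
  issuerRef:
    name: letsencrypt-dns
    kind: ClusterIssuer
  dnsNames:
    - "*.example.com"                        # wildcards only work over DNS-01
```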

ArgoCD and ApplicationSets: Stack Overflow uses ArgoCD, but I had only used FluxCD at previous companies like Redgate and TrueLayer. I wanted to understand ApplicationSets, and whether ArgoCD’s UI and app-of-apps pattern would be more intuitive for managing platform components.
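
For a rough picture of what an ApplicationSet looks like, here’s a git directory generator that stamps out one Application per directory in a repo. The repo URL and paths below are placeholders, not my actual layout.

```yaml
# Sketch: one ArgoCD Application per platform component directory
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-components
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/example/platform.git   # placeholder repo
        revision: main
        directories:
          - path: platform/*                                # e.g. platform/cert-manager
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/example/platform.git
        targetRevision: main
        path: '{{path.path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```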

I also wanted to work with Talos Linux. No SSH, no package manager, no traditional OS at all. Just YAML for managing everything. More YAML. Always more YAML.
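
To illustrate what “just YAML” means in practice: node-level settings that would normally be an SSH session (hostname, kubelet flags, node labels) are declarative machine config patches instead. A minimal sketch with made-up values:

```yaml
# Sketch: a Talos machine config patch (values are hypothetical).
# Applied via talosctl or, in my case, through the Terraform provider.
machine:
  network:
    hostname: worker-general-1
  nodeLabels:
    purpose: general        # the label my nodeSelector-based "nodepools" key off
```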

What I Built

Here’s what I ended up with: a private VPC with no public IPs on cluster nodes, dual ingress controllers (public + private via Tailscale), and GitOps from day one. The NAT gateway is the only node exposed to the internet, with SSH blocked and only Tailscale ports open.

Architecture

  • Private VPC: 10.0.0.0/16, with a single node subnet at 10.0.128.0/24
      • Control Plane: 10.0.128.16/28
      • Platform Workers: 10.0.128.32/27 (platform components)
      • General Workers: 10.0.128.64/27 (public and private apps)
  • Hetzner LB - Internal: 10.0.128.3, reachable only over the Tailnet
  • Hetzner LB - External: public IP in front of the public-facing apps
  • NAT Gateway: the only node with a public IP; the cluster’s route to the internet and the Tailscale subnet router
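
Spelling out the CIDR math behind that layout (these are just the ranges above, not my actual Terraform variables):

```yaml
# Address plan carved out of the node subnet
vpc: 10.0.0.0/16                      # 65,536 addresses for the whole VPC
node_subnet: 10.0.128.0/24            # 256 addresses for nodes and load balancers
nodepools:
  control_plane: 10.0.128.16/28       # 16 addresses, 10.0.128.16 - 10.0.128.31
  platform_workers: 10.0.128.32/27    # 32 addresses, 10.0.128.32 - 10.0.128.63
  general_workers: 10.0.128.64/27     # 32 addresses, 10.0.128.64 - 10.0.128.95
internal_lb_ip: 10.0.128.3
```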

Key Decisions

IaC: The whole infrastructure is version-controlled. A terraform apply gets me more nodes when I need them.

NAT Gateway security: The only node with a public IP is the NAT gateway. SSH port (22) is blocked via firewall; only Tailscale ports are open. Subnet routes are advertised to my Tailnet from this server.

Talos: Used the Talos Terraform provider to provision the cluster and set up nodepool logic.

Private-only cluster nodes: No public IPs on any Kubernetes nodes. All access goes through Tailscale.

GitOps from day one: ArgoCD manages all platform components (cert-manager, external-secrets, otel-collector, and so on). No more kubectl apply.
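
The entry point is an app-of-apps style root Application pointing at a directory of child Application manifests; roughly this shape (repo URL and path are placeholders):

```yaml
# Sketch: root "app of apps" that pulls in the child Application manifests
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/platform.git   # placeholder repo
    targetRevision: main
    path: argocd/apps                                   # placeholder path holding child Applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```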

Multiple “nodepools”: Workloads and platform components are scheduled onto different nodes via nodeSelector. Application workloads go to nodes labeled purpose: general, and platform components go to purpose: platform.
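
In practice that’s just a nodeSelector in each workload matching the node labels set in the Talos machine config. A trimmed example (the app itself is made up):

```yaml
# Sketch: pinning a workload to the general "nodepool" via nodeSelector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      nodeSelector:
        purpose: general          # platform components use purpose: platform instead
      containers:
        - name: app
          image: nginx:1.27
```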

Dual ingress: Two separate ingress controllers, both with Hetzner LoadBalancers. External ingress gets a public IP for public-facing apps. Internal ingress gets a private IP for tools like ArgoCD, Grafana, and OpenWebUI. I use Cloudflare NS records to delegate the int subdomain to Hetzner’s nameservers, where A and CNAME records point to the internal load balancer’s private IP. Internal services are only accessible via Tailscale.
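
The interesting half is the internal one. With hcloud-cloud-controller-manager, the load balancer is created from annotations on the ingress controller’s Service, and the private-only behaviour comes from those annotations. Something along these lines; the values are illustrative and worth checking against the hcloud CCM docs rather than copying:

```yaml
# Sketch: Service for the internal ingress controller behind a private-only Hetzner LB
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-internal-controller
  namespace: ingress-internal
  annotations:
    load-balancer.hetzner.cloud/name: lb-internal
    load-balancer.hetzner.cloud/location: nbg1                  # placeholder location
    load-balancer.hetzner.cloud/use-private-ip: "true"          # reach nodes over the VPC
    load-balancer.hetzner.cloud/disable-public-network: "true"  # no public interface on the LB
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx-internal
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
```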

Cost Breakdown

About £50/month total:

  • 10x servers (NAT Gateway + control plane + workers): ~£35
  • 1x public IP: ~£0.50
  • 2x Hetzner Load Balancers: £10
  • 8x 10GB storage: £4
  • Bandwidth: 20TB included

Compare this to managed Kubernetes:

  • AWS EKS: £65/month for control plane alone
  • GKE/AKS: Similar pricing
  • Add nodes, load balancers, NAT gateway: £150-200/month minimum

The savings are real, but so is the learning curve.

What’s Next

I might write more about building this platform: the network design, Talos provisioning, Cilium setup, ArgoCD workflows, and the full cost breakdown. But for now, this covers why I built it and what the architecture looks like.

If you’re interested in running your own Kubernetes cluster affordably, or you’re a platform engineer wanting to understand infrastructure more deeply, hopefully this gives you some inspiration.