Every breach postmortem I've read in the last two years where Kubernetes was involved has the same root cause buried three paragraphs down: nobody in enterprise IT owned the cluster. A product team stood it up. A DevOps engineer wired it in. Security was a Jira ticket. Then someone left, the CNI version drifted, RBAC rotted, and a misconfigured service account held the keys to the namespace next door.

Kubernetes is not a developer platform. It's a distributed operating system with its own network, identity plane, secrets store, and scheduler. Treating it as "the thing the app team runs" is the same mistake enterprises made letting product teams manage their own AD forests in 2005. It ends the same way.

Cluster ownership: pick a model and write it down

There are three reasonable cluster ownership models. Pick one and commit:

  • Central platform team owns everything. Product teams get namespaces. Best for orgs under 50 services. Scales badly past that.
  • Platform team owns the substrate; product teams own namespaces and workloads. Platform provides a paved road — Helm charts, base images, policy bundles. This is where most enterprises belong.
  • Federated: product teams run their own managed clusters (EKS, AKS, GKE) under central governance. Only works if you have real platform engineering maturity and a hard inventory story.

The wrong answer is "it depends on the team" — that's how you end up with 14 clusters, three CNIs, and no one who can answer which one runs payroll.
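For the middle model — platform owns the substrate, teams own namespaces — the grant boils down to a namespace plus a namespace-scoped RoleBinding. A minimal sketch (namespace and group names are hypothetical):

```yaml
# Platform team creates the namespace and binds the product team's IdP
# group to the built-in "admin" ClusterRole -- scoped to this namespace
# only via a RoleBinding, never a ClusterRoleBinding.
apiVersion: v1
kind: Namespace
metadata:
  name: payments            # hypothetical team namespace
  labels:
    team: payments
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-team-admin
  namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin               # built-in aggregated role; namespace-scoped here
subjects:
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: payments-devs     # hypothetical IdP group
```

The point of the RoleBinding-to-ClusterRole pattern is that the platform team maintains one role definition while every tenant grant stays confined to its own namespace.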

Networking and identity: the parts IT actually owns

CNI choice is infrastructure. It is not a product team decision. Cilium, Calico, AWS VPC CNI, Azure CNI — each has different implications for network policy enforcement, eBPF observability, and egress cost. Pick based on your answer to who enforces east-west traffic policy, not on a blog post.
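Whatever CNI you pick, the baseline it must enforce is the same: default-deny per namespace. The NetworkPolicy object below is standard Kubernetes API — but it does nothing unless the CNI actually implements it, which is exactly why the choice is an infrastructure decision (namespace name is illustrative):

```yaml
# Default-deny for all ingress and egress in a namespace. The empty
# podSelector matches every pod; traffic then requires an explicit
# allow policy. Enforcement depends entirely on the CNI.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments       # hypothetical namespace
spec:
  podSelector: {}           # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```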

Service account sprawl is the modern equivalent of local admin accounts. Defaults are generous. A service account with cluster-admin bound to the default service account in one namespace is a breach waiting for a compromised pod. Enterprise IT should be auditing:

  • Workload identity federation (IRSA on AWS, Workload Identity on GCP, Azure AD workload identity) — so pods don't hold long-lived credentials.
  • Secret stores external to the cluster — HashiCorp Vault, AWS Secrets Manager via the Secrets Store CSI driver. Not kubectl create secret.
  • Bound service account tokens with TTLs. The old "token lives forever in a secret" model is deprecated for a reason.
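What the bound-token model looks like in practice, sketched below with hypothetical names — the IRSA annotation is AWS-specific, and the role ARN is a placeholder, not a real account:

```yaml
# Workload identity via IRSA annotation plus a projected, audience-bound,
# short-lived token -- instead of a legacy secret-based token.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payments-api
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/payments-api  # hypothetical
---
apiVersion: v1
kind: Pod
metadata:
  name: payments-api
  namespace: payments
spec:
  serviceAccountName: payments-api
  containers:
    - name: app
      image: example.com/payments-api:1.0   # hypothetical image
      volumeMounts:
        - name: sa-token
          mountPath: /var/run/secrets/tokens
  volumes:
    - name: sa-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600   # 1h TTL; kubelet rotates it
              audience: payments-api    # audience-bound, not a bearer-anywhere token
```

The projected token expires and is scoped to one audience; a stolen copy is worth an hour against one API, not forever against everything.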

Policy and multi-tenancy: the parts everyone skips

If your cluster has more than one team on it, you need a policy engine. OPA Gatekeeper or Kyverno — doesn't matter which, pick one. Without admission control you have no way to prevent a team from deploying privileged containers, mounting host paths, or running as root — because the platform lets them, and they're on a deadline.
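A sketch of what admission control buys you, using Kyverno's validate-pattern style (policy name and exact schema details are illustrative; Gatekeeper expresses the same rule in Rego):

```yaml
# Reject any pod that asks for privileged mode at admission time,
# before the deadline-driven deploy ever lands on a node.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged     # hypothetical policy name
spec:
  validationFailureAction: Enforce   # block, don't just audit
  rules:
    - name: deny-privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged containers are not allowed on this cluster."
        pattern:
          spec:
            containers:
              # =() is Kyverno's optional anchor: if securityContext is
              # set, privileged must be false.
              - =(securityContext):
                  =(privileged): "false"
```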

Multi-tenancy in Kubernetes is harder than the marketing suggests. Namespaces are not a security boundary by themselves. You need network policies set to default-deny, pod security standards enforced at restricted, ResourceQuotas, LimitRanges, and — for anything with real blast-radius concerns — virtual clusters (vcluster) or separate clusters entirely.
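The namespace-level guardrails stack like this — a minimal sketch with illustrative names and quota values, not a sizing recommendation:

```yaml
# Pod Security Standards enforced at "restricted" via namespace labels,
# plus a ResourceQuota so one tenant can't starve the node pool.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                  # hypothetical tenant namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"          # illustrative limits
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
    pods: "200"
```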

Observability: the bill you didn't budget for

The day you turn on full Prometheus scraping, Loki for logs, and Tempo for traces across a 200-node cluster, your observability bill doubles. I've watched this happen. The answer is not "turn it off" — it's cardinality discipline at the ingestion layer, tiered retention, and being honest that not every namespace needs 30 days of traces.
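Cardinality discipline at the ingestion layer concretely means relabeling at scrape time, before series hit storage. A sketch in Prometheus config — the metric and label names are illustrative examples of the kind of thing worth dropping, not a recommended list:

```yaml
# Drop known high-cardinality series and labels at scrape time.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    metric_relabel_configs:
      # Drop histogram buckets nobody queries (hypothetical example).
      - source_labels: [__name__]
        regex: "apiserver_request_duration_seconds_bucket"
        action: drop
      # Strip a high-cardinality label instead of dropping the whole metric.
      - regex: "pod_template_hash"
        action: labeldrop
```

labeldrop is the gentler tool: the metric survives, the cardinality explosion doesn't.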

Enterprise IT needs to own the observability plane for clusters the same way it owns SIEM for endpoints. If product teams are each running their own Grafana, you have no operational view.

The takeaway: Kubernetes is infrastructure. Put it under the same ownership model as your hypervisors, your network fabric, and your identity plane. If you can't name the person on-call for the cluster running your most critical workload, you don't have a platform — you have a liability.