Kubernetes SPIRE deployment design
This directory replaces the Compose-era SPIRE topology with a Kubernetes design that is credible in reviewer Q&A without claiming that the repo has already performed a production rollout.
Scope
- server/: 3-replica SPIRE Server StatefulSet, services, config, RBAC, and PVCs
- agent/: node-local SPIRE Agent DaemonSet and Workload API socket exposure
- csi-driver/: SPIFFE CSI driver for pod-mounted X.509 SVIDs
- registration/: sample workload registrations for the Ardur demo frameworks
Design prerequisites
These manifests are intentionally incomplete as a real deploy bundle. The cluster operator still needs to provide:
- A reachable PostgreSQL instance and a Secret named spire-server-datastore in spire-system with key connection_string.
- A Secret named spire-upstream-ca in spire-system with:
  - tls.crt: upstream CA certificate or intermediate chain
  - tls.key: upstream CA private key
  - ca.crt: upstream root bundle
- Node labels and namespace labels that match the selectors described in registration/ardur-workloads.yaml.
- The SPIRE controller manager CRDs if you want to apply ClusterSPIFFEID objects directly.
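The two Secrets above could be created along these lines. This is a minimal sketch, not the repo's own tooling: the function name and its four arguments (a connection string and three local file paths) are hypothetical, and it assumes the spire-system namespace already exists.

```shell
# Sketch: create the two prerequisite Secrets in spire-system.
# create_spire_secrets and its arguments are illustrative names only.
create_spire_secrets() {
  local conn="$1" cert="$2" key="$3" root="$4"

  # Datastore connection string under the documented key connection_string.
  kubectl -n spire-system create secret generic spire-server-datastore \
    --from-literal=connection_string="$conn" || return 1

  # Upstream CA material under the three documented keys.
  kubectl -n spire-system create secret generic spire-upstream-ca \
    --from-file=tls.crt="$cert" \
    --from-file=tls.key="$key" \
    --from-file=ca.crt="$root"
}
```

Example call: `create_spire_secrets "$SPIRE_DB_CONN" ./upstream-ca.crt ./upstream-ca.key ./root-bundle.crt`. Passing the connection string as a literal can leave it in shell history, so a real rollout would more likely deliver both Secrets through the secret automation listed under operational work.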
Install order
Apply the manifests in this order:
kubectl apply -f k8s/spire/server/serviceaccount.yaml
kubectl apply -f k8s/spire/server/rbac.yaml
kubectl apply -f k8s/spire/server/configmap.yaml
kubectl apply -f k8s/spire/server/service.yaml
kubectl apply -f k8s/spire/server/statefulset.yaml
kubectl apply -f k8s/spire/agent/serviceaccount.yaml
kubectl apply -f k8s/spire/agent/rbac.yaml
kubectl apply -f k8s/spire/agent/configmap.yaml
kubectl apply -f k8s/spire/agent/daemonset.yaml
kubectl apply -f k8s/spire/csi-driver/rbac.yaml
kubectl apply -f k8s/spire/csi-driver/daemonset.yaml
# Optional: only after spire-controller-manager is installed
kubectl apply -f k8s/spire/registration/ardur-workloads.yaml
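Because the agents depend on the servers, it can help to wait for each rollout before moving on. The helper below is a sketch under assumptions: the DaemonSet name spire-agent is inferred from the label used in the validation commands, and the function name and default timeout are hypothetical.

```shell
# Sketch: block until the server and agent workloads report ready.
# Assumes the agent DaemonSet is named spire-agent; adjust to the manifest.
wait_for_spire() {
  local timeout="${1:-180s}"

  # Stop early if the servers never become ready.
  kubectl -n spire-system rollout status statefulset/spire-server \
    --timeout="$timeout" || return 1

  kubectl -n spire-system rollout status daemonset/spire-agent \
    --timeout="$timeout"
}
```

Calling `wait_for_spire 300s` after the agent DaemonSet is applied gives a single pass or fail signal before the CSI driver and registrations go in.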
Workload consumption model
The preferred workload integration is the SPIFFE CSI driver. A pod consumes it with an inline ephemeral volume similar to:
volumes:
  - name: spiffe
    csi:
      driver: csi.spiffe.io
      readOnly: true
volumeMounts:
  - name: spiffe
    mountPath: /var/run/secrets/spiffe.io
    readOnly: true
That keeps the private key lifecycle tied to the pod lifecycle and avoids teaching every workload to talk directly to the Workload API socket.
Validation plan
After deploy, validate in this order:
- Confirm all three server pods are healthy and share the same datastore:
kubectl -n spire-system get pods -l app.kubernetes.io/name=spire-server
kubectl -n spire-system logs statefulset/spire-server --tail=50
- Confirm bundle publication is working:
kubectl -n spire-system get configmap spire-bundle -o yaml
kubectl -n spire-system exec spire-server-0 -- \
/opt/spire/bin/spire-server bundle show
- Confirm node attestation:
kubectl -n spire-system exec spire-server-0 -- \
/opt/spire/bin/spire-server entry show
kubectl -n spire-system exec spire-server-0 -- \
/opt/spire/bin/spire-server agent list
- Confirm workload API reachability on a node:
AGENT_POD=$(kubectl -n spire-system get pod -l app.kubernetes.io/name=spire-agent -o jsonpath='{.items[0].metadata.name}')
kubectl -n spire-system exec "$AGENT_POD" -- \
/opt/spire/bin/spire-agent api fetch -socketPath /run/spire/agent-sockets/agent.sock
- Confirm registration entries reconcile:
kubectl get clusterspiffeid
kubectl describe clusterspiffeid ardur-langchain
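The first check can be turned into a pass or fail gate. This is a rough sketch with hypothetical function names: it only counts ready server pods and assumes the SPIRE server is the first container in each pod. It does not prove the replicas share a datastore, which still needs the log review above.

```shell
# Sketch: count spire-server pods whose first container reports ready.
count_ready_servers() {
  kubectl -n spire-system get pods \
    -l app.kubernetes.io/name=spire-server \
    -o jsonpath='{range .items[*]}{.status.containerStatuses[0].ready}{"\n"}{end}' \
    | grep -c '^true$'
}

# Succeeds only when the ready count equals the expected replica count.
servers_healthy() {
  local expected="${1:-3}"
  [ "$(count_ready_servers)" -eq "$expected" ]
}
```

For example, `servers_healthy 3 && echo "all server replicas ready"` could guard the remaining validation steps.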
Trust bundle rotation walkthrough
The rotation path is:
- SPIRE server rotates or renews its intermediate CA from the disk-based upstream authority.
- The server updates its own bundle state in the shared datastore.
- The k8sbundle notifier rewrites the spire-bundle ConfigMap in spire-system.
- SPIRE agents consume the updated bundle and continue serving the current bundle over the Workload API.
- Workloads using the CSI driver see the renewed material through the mounted volume; workloads using the Workload API directly fetch the new bundle on their next read.
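One way to observe the ConfigMap rewrite is to fingerprint the published bundle before and after a rotation. This is a sketch under assumptions: it expects the bundle under the ConfigMap key bundle.crt (check the notifier configuration in server/configmap.yaml), relies on sha256sum being available, and uses hypothetical function names.

```shell
# Sketch: fingerprint the published trust bundle to detect a rewrite.
# Assumes the bundle is stored under the ConfigMap key bundle.crt.
bundle_fingerprint() {
  kubectl -n spire-system get configmap spire-bundle \
    -o jsonpath='{.data.bundle\.crt}' | sha256sum | cut -d' ' -f1
}

# Succeeds once the current fingerprint differs from the one passed in.
bundle_changed_since() {
  local before="$1"
  [ "$(bundle_fingerprint)" != "$before" ]
}
```

Capture `before=$(bundle_fingerprint)` ahead of the rotation, then poll `bundle_changed_since "$before"` afterwards. A changed fingerprint only shows that the ConfigMap was rewritten; the bundle show command from the validation plan is still the way to inspect the certificates themselves.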
Rollback plan
If the deployment needs to be backed out:
- Remove registration CRs or delete the equivalent SPIRE entries first so workloads stop depending on new identities.
- Remove the CSI driver DaemonSet and CSIDriver object.
- Remove the agent DaemonSet.
- Remove the server StatefulSet and services.
- Keep the PostgreSQL snapshot and the PVCs until you decide whether you are rolling forward again or discarding the trust domain state permanently.
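The rollback steps map onto the install manifests in reverse. The function below is a sketch, not a tested runbook: its name is hypothetical, it assumes the CSIDriver object is named csi.spiffe.io, and it assumes the manifest paths match the install section. It deliberately leaves PVCs and the database untouched.

```shell
# Sketch: back out the deployment in the order described above.
# Assumes the CSIDriver object is named csi.spiffe.io.
rollback_spire() {
  # 1. Registrations first. This step fails if the CRDs were never installed.
  kubectl delete -f k8s/spire/registration/ardur-workloads.yaml --ignore-not-found || return 1

  # 2. CSI driver DaemonSet, then the CSIDriver object.
  kubectl delete -f k8s/spire/csi-driver/daemonset.yaml --ignore-not-found || return 1
  kubectl delete csidriver csi.spiffe.io --ignore-not-found || return 1

  # 3. Agent DaemonSet.
  kubectl delete -f k8s/spire/agent/daemonset.yaml --ignore-not-found || return 1

  # 4. Server StatefulSet and services. PVCs are not deleted here.
  kubectl delete -f k8s/spire/server/statefulset.yaml --ignore-not-found || return 1
  kubectl delete -f k8s/spire/server/service.yaml --ignore-not-found
}
```

Stopping at the first failed step keeps the teardown order intact, so a partial rollback can be inspected and resumed rather than left in an unknown state.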
Operational work before a real cluster rollout
- Replace the demo ardur.demo trust domain and ardur-demo cluster name.
- Back the upstream CA Secret with your real PKI rotation process.
- Add PodDisruptionBudgets, NetworkPolicies, and secret delivery automation.
- Decide whether controller-managed ClusterSPIFFEID reconciliation or explicit spire-server entry create automation is the production path.
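For comparison with the ClusterSPIFFEID path, explicit registration might look like the sketch below. The function name and all four arguments are hypothetical. The selectors assume the Kubernetes workload attestor is enabled on the agents, and the server pod name and binary path follow the validation commands above.

```shell
# Sketch: register one workload explicitly instead of via ClusterSPIFFEID.
# The SPIFFE ID, parent ID, namespace, and service account are caller inputs.
create_workload_entry() {
  local spiffe_id="$1" parent_id="$2" ns="$3" sa="$4"

  kubectl -n spire-system exec spire-server-0 -- \
    /opt/spire/bin/spire-server entry create \
      -spiffeID "$spiffe_id" \
      -parentID "$parent_id" \
      -selector "k8s:ns:$ns" \
      -selector "k8s:sa:$sa"
}
```

The parent ID is the SPIFFE ID of an attested agent, which the agent list command in the validation plan prints. Entries created this way are not reconciled by the controller manager, so whichever path is chosen should own the full set of entries.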