llama-stack-k8s-operator: Cache Architecture
Controller-runtime cache configuration controls which Kubernetes resources are cached in-memory. Misconfigured caches (cluster-wide watches on high-cardinality types without filters) are a primary cause of operator OOM kills.
Cache Architecture
Manager Configuration
Property
Value
Manager file
main.go
Cache scope
cluster-wide
DefaultTransform
yes
GOMEMLIMIT
800MiB
Memory limit
1Gi
Filtered Types
Type
Filter Kind
Filter
appsv1.Deployment
label
label selector
autoscalingv2.HorizontalPodAutoscaler
label
label selector
corev1.ConfigMap
label
label selector
corev1.PersistentVolumeClaim
label
label selector
corev1.Service
label
label selector
networkingv1.Ingress
label
label selector
networkingv1.NetworkPolicy
label
label selector
policyv1.PodDisruptionBudget
label
label selector
Issues
Type LlamaStackDistribution is watched but has no cache filter (cluster-wide informer)
April 22, 2026
April 22, 2026