
codeflare-operator: Cache Architecture

The controller-runtime cache configuration determines which Kubernetes resources are held in memory. Misconfigured caches, typically cluster-wide watches on high-cardinality types with no filters, are a primary cause of operator OOM kills.

Cache Architecture

Manager Configuration

Property          Value
Manager file      main.go
Cache scope       cluster-wide
DefaultTransform  no
Memory limit      1Gi
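
For illustration, the cache scope and DefaultTransform properties correspond to fields of cache.Options on the manager. The following is a minimal sketch of a narrower configuration, not the operator's actual code. It assumes a recent controller-runtime release (v0.16 or later), and the label key example.com/managed-by is hypothetical: the operator would have to set it on every object it creates for the filter to be safe.

```go
package main

import (
	"os"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func main() {
	// Hypothetical label; it is an assumption of this sketch, not something
	// the operator is known to set.
	managed := labels.SelectorFromSet(labels.Set{
		"example.com/managed-by": "codeflare-operator",
	})

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			// Strip metadata.managedFields from every object before it is cached.
			DefaultTransform: cache.TransformStripManagedFields(),
			// Per-type filters: only objects matching the selector are
			// listed, watched, and kept in the informer store.
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Secret{}:         {Label: managed},
				&corev1.Service{}:        {Label: managed},
				&corev1.ServiceAccount{}: {Label: managed},
			},
		},
	})
	if err != nil {
		os.Exit(1)
	}
	_ = mgr // controllers would be registered here before calling mgr.Start
}
```

One trade-off of this design: objects that do not match the selector become invisible to cached reads, so any Secret or Service the operator needs but did not label would have to be fetched through an uncached client instead.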

Issues

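  • No GOMEMLIMIT set in the deployment: the Go runtime is unaware of the container memory limit, so the GC does not collect more aggressively as the heap approaches it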
  • No cache configuration: all informers are cluster-wide (OOM risk)
  • Type ClusterRoleBinding is watched but has no cache filter (cluster-wide informer)
  • Type Ingress is watched but has no cache filter (cluster-wide informer)
  • Type NetworkPolicy is watched but has no cache filter (cluster-wide informer)
  • Type RayCluster is watched but has no cache filter (cluster-wide informer)
  • Type Route is watched but has no cache filter (cluster-wide informer)
  • Type Secret is watched but has no cache filter (cluster-wide informer)
  • Type Service is watched but has no cache filter (cluster-wide informer)
  • Type ServiceAccount is watched but has no cache filter (cluster-wide informer)