kserve: Dataflow
Controller Watches
Kubernetes resources this controller monitors for changes. Each watch triggers reconciliation when the watched resource is created, updated, or deleted.
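As a rough illustration of the watch-to-reconcile relationship described above, the sketch below models how created, updated, and deleted events for the same object might collapse into a single reconcile request. `WatchEvent`, `ReconcileQueue`, and the object names `sklearn-iris` and `model-a` are hypothetical and are not part of kserve or controller-runtime; a real controller would use a rate-limited work queue instead.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchEvent:
    """Hypothetical stand-in for a watch notification from the API server."""
    event_type: str  # "ADDED", "MODIFIED", or "DELETED"
    kind: str
    namespace: str
    name: str


class ReconcileQueue:
    """Deduplicating FIFO of object keys awaiting reconciliation (illustrative only)."""

    def __init__(self):
        self._pending = deque()
        self._queued = set()

    def enqueue(self, event):
        # Every event type leads to the same action: reconcile the object.
        key = (event.kind, event.namespace, event.name)
        if key not in self._queued:
            self._queued.add(key)
            self._pending.append(key)

    def drain(self):
        """Return the queued keys in arrival order and empty the queue."""
        keys = list(self._pending)
        self._pending.clear()
        self._queued.clear()
        return keys


queue = ReconcileQueue()
for event in [
    WatchEvent("ADDED", "InferenceService", "default", "sklearn-iris"),
    WatchEvent("MODIFIED", "InferenceService", "default", "sklearn-iris"),
    WatchEvent("DELETED", "TrainedModel", "default", "model-a"),
]:
    queue.enqueue(event)

pending = queue.drain()
print(pending)
```

The two events for the same InferenceService produce one queue entry, which reflects the level-triggered style of reconciliation: the reconciler looks at current state and does not replay individual events.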
Reconciliation Flow
How the controller interacts with the Kubernetes API during reconciliation.
```mermaid
sequenceDiagram
    %% Static dataflow for kserve
    participant KubernetesAPI as Kubernetes API
    participant kserve_controller_manager as kserve-controller-manager
    participant kserve_localmodel_controller_manager as kserve-localmodel-controller-manager
    participant llmisvc_controller_manager as llmisvc-controller-manager
    participant spark_pmml_iris as spark-pmml-iris
    KubernetesAPI->>+kserve_controller_manager: Watch InferenceGraph (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch LocalModelCache (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch LocalModelNamespaceCache (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch LocalModelNode (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch TrainedModel (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch LLMInferenceService (reconcile)
    KubernetesAPI->>+kserve_controller_manager: Watch InferenceService (reconcile)
    kserve_controller_manager->>KubernetesAPI: Create/Update PersistentVolume
    kserve_controller_manager->>KubernetesAPI: Create/Update PersistentVolumeClaim
    kserve_controller_manager->>KubernetesAPI: Create/Update PersistentVolumeClaim
    kserve_controller_manager->>KubernetesAPI: Create/Update Secret
    kserve_controller_manager->>KubernetesAPI: Create/Update Service
    kserve_controller_manager->>KubernetesAPI: Create/Update Service
    kserve_controller_manager->>KubernetesAPI: Create/Update InferencePool
    kserve_controller_manager->>KubernetesAPI: Create/Update VariantAutoscaling
    kserve_controller_manager->>KubernetesAPI: Create/Update HTTPRoute
    kserve_controller_manager->>KubernetesAPI: Create/Update HTTPRoute
    kserve_controller_manager->>KubernetesAPI: Create/Update OpenTelemetryCollector
    kserve_controller_manager->>KubernetesAPI: Create/Update InferencePool
    kserve_controller_manager->>KubernetesAPI: Create/Update Deployment
    kserve_controller_manager->>KubernetesAPI: Create/Update Deployment
    kserve_controller_manager->>KubernetesAPI: Create/Update Deployment
    kserve_controller_manager->>KubernetesAPI: Create/Update HorizontalPodAutoscaler
    kserve_controller_manager->>KubernetesAPI: Create/Update Job
    kserve_controller_manager->>KubernetesAPI: Create/Update ScaledObject
    kserve_controller_manager->>KubernetesAPI: Create/Update ScaledObject
    kserve_controller_manager->>KubernetesAPI: Create/Update LeaderWorkerSet
    kserve_controller_manager->>KubernetesAPI: Create/Update PodMonitor
    kserve_controller_manager->>KubernetesAPI: Create/Update ServiceMonitor
    kserve_controller_manager->>KubernetesAPI: Create/Update Ingress
    kserve_controller_manager->>KubernetesAPI: Create/Update Ingress
    kserve_controller_manager->>KubernetesAPI: Create/Update VirtualService
    kserve_controller_manager->>KubernetesAPI: Create/Update Route
    kserve_controller_manager->>KubernetesAPI: Create/Update Service
    kserve_controller_manager->>KubernetesAPI: Create/Update Service
    KubernetesAPI-->>+kserve_controller_manager: Watch ConfigMap (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch Node (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch Node (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch Pod (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch Pod (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch Gateway (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch HTTPRoute (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch ClusterServingRuntime (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch LocalModelNode (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch LocalModelNode (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch ServingRuntime (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch LLMInferenceServiceConfig (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch InferenceService (informer)
    KubernetesAPI-->>+kserve_controller_manager: Watch InferenceService (informer)
    Note over kserve_controller_manager: Exposed Services
    Note right of kserve_controller_manager: kserve-controller-manager-service:8443/TCP []
    Note right of kserve_controller_manager: kserve-webhook-server-service:443/TCP []
    Note right of kserve_controller_manager: llmisvc-controller-manager-service:8443/TCP [https]
    Note right of kserve_controller_manager: llmisvc-webhook-server-service:443/TCP [https]
    Note right of kserve_controller_manager: localmodel-webhook-server-service:443/TCP []
    Note right of kserve_controller_manager: webhook-service:443/TCP [https]
    Note over KubernetesAPI: Defined CRDs
    Note right of KubernetesAPI: ClusterServingRuntime (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: ClusterStorageContainer (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: InferenceGraph (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LLMInferenceService (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LLMInferenceServiceConfig (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LocalModelCache (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LocalModelNamespaceCache (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LocalModelNode (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LocalModelNodeGroup (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: ServingRuntime (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: TrainedModel (serving.kserve.io/v1alpha1)
    Note right of KubernetesAPI: LLMInferenceService (serving.kserve.io/v1alpha2)
    Note right of KubernetesAPI: LLMInferenceServiceConfig (serving.kserve.io/v1alpha2)
    Note right of KubernetesAPI: InferenceService (serving.kserve.io/v1beta1)
```
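The `Create/Update` arrows in the diagram suggest an idempotent create-or-update step for each owned resource. The following is a minimal sketch of that pattern against an in-memory stand-in for the API server. `FakeAPI`, `create_or_update`, and the object name `demo-predictor` are hypothetical; this is not kserve's implementation, only an illustration of the general idea.

```python
class FakeAPI:
    """In-memory stand-in for the Kubernetes API (hypothetical, for illustration)."""

    def __init__(self):
        self.objects = {}

    def get(self, kind, namespace, name):
        return self.objects.get((kind, namespace, name))

    def put(self, kind, namespace, name, spec):
        # Store a copy so later changes to the caller's dict do not leak in.
        self.objects[(kind, namespace, name)] = dict(spec)


def create_or_update(api, kind, namespace, name, desired_spec):
    """Create the object if missing, update it if it drifted, otherwise do nothing."""
    current = api.get(kind, namespace, name)
    if current is None:
        api.put(kind, namespace, name, desired_spec)
        return "created"
    if current != desired_spec:
        api.put(kind, namespace, name, desired_spec)
        return "updated"
    return "unchanged"


api = FakeAPI()
results = [
    create_or_update(api, "Deployment", "default", "demo-predictor", {"replicas": 1}),
    create_or_update(api, "Deployment", "default", "demo-predictor", {"replicas": 1}),
    create_or_update(api, "Deployment", "default", "demo-predictor", {"replicas": 2}),
]
print(results)
```

Because the function compares desired state with observed state before writing, running it repeatedly with the same input is safe, which is the property a reconcile loop relies on.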
Webhooks
| Name | Type | Path | Failure Policy | Service | Source |
|---|---|---|---|---|---|
| clusterservingruntime.kserve-webhook-server.validator | validating | /validate-serving-kserve-io-v1alpha1-clusterservingruntime | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| inferencegraph.kserve-webhook-server.validator | validating | /validate-serving-kserve-io-v1alpha1-inferencegraph | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| inferenceservice.kserve-webhook-server.defaulter | mutating | /mutate-serving-kserve-io-v1beta1-inferenceservice | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| inferenceservice.kserve-webhook-server.pod-mutator | mutating | /mutate-pods | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| inferenceservice.kserve-webhook-server.validator | validating | /validate-serving-kserve-io-v1beta1-inferenceservice | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| localmodelcache.kserve-webhook-server.validator | validating | | | | config/localmodels/webhook_cainjection_patch.yaml |
| servingruntime.kserve-webhook-server.validator | validating | /validate-serving-kserve-io-v1alpha1-servingruntime | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
| trainedmodel.kserve-webhook-server.validator | validating | /validate-serving-kserve-io-v1alpha1-trainedmodel | Fail | $(kserveNamespace)/$(webhookServiceName) | config/webhook/manifests.yaml |
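Most paths in the table appear to follow the naming convention that controller-runtime uses for generated webhook paths: a `validate` or `mutate` prefix, the API group with dots replaced by dashes, the version, and the lowercased kind. The sketch below reproduces that pattern; `webhook_path` is a hypothetical helper name, and `/mutate-pods` is an exception that the pattern does not cover.

```python
def webhook_path(prefix, group, version, kind):
    """Build a path of the form /<prefix>-<group-with-dashes>-<version>-<lowercased-kind>."""
    return "/{}-{}-{}-{}".format(prefix, group.replace(".", "-"), version, kind.lower())


validator_path = webhook_path(
    "validate", "serving.kserve.io", "v1alpha1", "ClusterServingRuntime"
)
defaulter_path = webhook_path(
    "mutate", "serving.kserve.io", "v1beta1", "InferenceService"
)
print(validator_path)
print(defaulter_path)
```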
HTTP Endpoints
| Method | Path | Source |
|---|---|---|
| * | / | cmd/router/main.go:671 |
| POST | /ensemble | docs/samples/graph/bgtest/bgtest/main.go:29 |
| POST | /single | docs/samples/graph/bgtest/bgtest/main.go:28 |
| POST | /splitter | docs/samples/graph/bgtest/bgtest/main.go:26 |
| POST | /switch | docs/samples/graph/bgtest/bgtest/main.go:27 |
| * | gateway.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/config_merge.go:375 |
| * | gateway.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:210 |
| * | gateway.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:228 |
| * | gateway.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:398 |
| * | gateway.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:700 |
| * | inference.networking.k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:290 |
| * | inference.networking.x-k8s.io | pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:304 |
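Several rows above carry an API group name (for example `gateway.networking.k8s.io`) in the Path column instead of a URL path, which suggests that the endpoint scan may have matched string constants in Gateway API builder code. The sketch below separates the two kinds of entries, assuming that a real route always starts with `/`. `split_endpoints` is a hypothetical helper, and the rows are a subset copied from the table.

```python
ROWS = [
    ("*", "/", "cmd/router/main.go:671"),
    ("POST", "/ensemble", "docs/samples/graph/bgtest/bgtest/main.go:29"),
    ("POST", "/single", "docs/samples/graph/bgtest/bgtest/main.go:28"),
    ("POST", "/splitter", "docs/samples/graph/bgtest/bgtest/main.go:26"),
    ("POST", "/switch", "docs/samples/graph/bgtest/bgtest/main.go:27"),
    ("*", "gateway.networking.k8s.io", "pkg/controller/v1alpha2/llmisvc/config_merge.go:375"),
    ("*", "inference.networking.k8s.io", "pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:290"),
    ("*", "inference.networking.x-k8s.io", "pkg/controller/v1alpha2/llmisvc/fixture/gwapi_builders.go:304"),
]


def split_endpoints(rows):
    """Split rows into URL routes (path starts with '/') and everything else."""
    routes = [row for row in rows if row[1].startswith("/")]
    non_routes = [row for row in rows if not row[1].startswith("/")]
    return routes, non_routes


routes, non_routes = split_endpoints(ROWS)
print([path for _, path, _ in routes])
print(sorted({path for _, path, _ in non_routes}))
```

Under this assumption, only the router root and the four `bgtest` sample handlers are HTTP routes; the remaining entries are better read as API group references.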
Configuration
ConfigMaps and Helm values that control this component's runtime behavior.
ConfigMaps
| Name | Data Keys | Source |
|---|---|---|
| inferenceservice-config | agent, autoscaler, batcher, credentials, deploy, explainers, inferenceService, ingress, localModel, logger, metricsAggregator, opentelemetryCollector, router, security, service, storageInitializer | charts/_common/common-patches/configmap-patch.yaml |
| inferenceservice-config | agent, autoscaler, batcher, credentials, deploy, explainers, inferenceService, ingress, localModel, logger, metricsAggregator, opentelemetryCollector, router, security, service, storageInitializer | charts/kserve-llmisvc-resources/files/common/configmap-patch.yaml |
| inferenceservice-config | _example, agent, autoscaler, batcher, credentials, deploy, explainers, inferenceService, ingress, localModel, logger, metricsAggregator, opentelemetryCollector, router, security, storageInitializer | charts/kserve-llmisvc-resources/files/common/configmap.yaml |
| inferenceservice-config | agent, autoscaler, batcher, credentials, deploy, explainers, inferenceService, ingress, localModel, logger, metricsAggregator, opentelemetryCollector, router, security, service, storageInitializer | charts/kserve-resources/files/common/configmap-patch.yaml |
| inferenceservice-config | _example, agent, autoscaler, batcher, credentials, deploy, explainers, inferenceService, ingress, localModel, logger, metricsAggregator, opentelemetryCollector, router, security, storageInitializer | charts/kserve-resources/files/common/configmap.yaml |
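The rows above differ only slightly in their key sets, which is easy to miss when reading the table. The sketch below compares the keys listed for the `configmap-patch.yaml` sources with those listed for the `configmap.yaml` sources, using the values exactly as they appear in the table; the variable names are illustrative.

```python
# Keys listed for the configmap-patch.yaml sources.
PATCH_KEYS = {
    "agent", "autoscaler", "batcher", "credentials", "deploy", "explainers",
    "inferenceService", "ingress", "localModel", "logger", "metricsAggregator",
    "opentelemetryCollector", "router", "security", "service", "storageInitializer",
}

# Keys listed for the configmap.yaml sources.
BASE_KEYS = {
    "_example", "agent", "autoscaler", "batcher", "credentials", "deploy",
    "explainers", "inferenceService", "ingress", "localModel", "logger",
    "metricsAggregator", "opentelemetryCollector", "router", "security",
    "storageInitializer",
}

only_in_patch = sorted(PATCH_KEYS - BASE_KEYS)
only_in_base = sorted(BASE_KEYS - PATCH_KEYS)
print("only in patch:", only_in_patch)
print("only in base:", only_in_base)
```

According to the table, the patch files add a `service` key, while the base files carry an `_example` key that the patches omit.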
Helm
Chart: kserve-crd v0.17.0