Development Setup

This guide walks you through setting up a local development environment for Operator Chaos.

Prerequisites

Required Tools

  • Go 1.25 or later (Install Go)
  • Git — For cloning the repository
  • kubectl (Install kubectl)
  • Access to a Kubernetes cluster — Kind, Minikube, or OpenShift

Optional Tools

Clone the Repository

git clone https://github.com/ugiordan/operator-chaos.git
cd operator-chaos

Verify Go Version

go version
# Should output: go version go1.25.x ...

If your Go version is older than 1.25, update it before proceeding.
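
The version check can also be scripted, for example in a setup script. The following is a minimal sketch rather than part of the project: go_version_ok is a hypothetical helper name, and the minimum of 1.25 is taken from the prerequisites above.

```shell
# go_version_ok VERSION [MIN_MINOR]
# Hypothetical helper: succeeds if VERSION (e.g. "go1.25.3") is Go 1.MIN_MINOR
# or later. MIN_MINOR defaults to 25, matching the prerequisite in this guide.
go_version_ok() {
  ver="${1#go}"             # "go1.25.3" -> "1.25.3"
  major="${ver%%.*}"        # "1"
  rest="${ver#*.}"          # "25.3"
  minor="${rest%%[!0-9]*}"  # "25" (cut at the first non-digit, so "26rc1" -> "26")
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge "${2:-25}" ]; }
}

# Usage (the third field of "go version" output is the version string):
#   go_version_ok "$(go version | awk '{print $3}')" || echo "Go 1.25 or later is required" >&2
```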

Install Dependencies

go mod download

This downloads all Go module dependencies defined in go.mod.

Build the Project

Build All Binaries

go build ./...

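This compiles every package in the module and discards the results, so it serves as a check that the code has no syntax or type errors. It does not write any binaries; use the next step to produce the CLI binary.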

Build the CLI

go build -o bin/operator-chaos ./cmd/operator-chaos

The binary will be placed in bin/operator-chaos.

Build the Controller

The controller does not need a separate build. It is the same binary, started with the controller start subcommand:

bin/operator-chaos controller start

Run Tests

Unit Tests

go test ./...

Run tests with verbose output:

go test -v ./...

Test Specific Packages

go test ./pkg/injection/...
go test ./pkg/observer/...

Run with Coverage

go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out -o coverage.html

Open coverage.html in a browser to view the coverage report.
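
For a quick number in the terminal, go tool cover -func=coverage.out prints per-function coverage followed by a total line. The sketch below extracts that total and compares it to a minimum. The helper names coverage_total and meets_threshold are hypothetical, as is the 70 percent threshold in the usage comment; the project may not enforce any threshold.

```shell
# coverage_total: reads "go tool cover -func" output on stdin and prints the
# percentage from the final "total:" line, without the trailing "%".
coverage_total() {
  awk '/^total:/ { sub(/%$/, "", $NF); print $NF }'
}

# meets_threshold PERCENT MIN: succeeds if PERCENT >= MIN (numeric comparison).
meets_threshold() {
  awk -v got="$1" -v min="$2" 'BEGIN { exit !(got + 0 >= min + 0) }'
}

# Usage (70 is an illustrative threshold, not a project requirement):
#   total=$(go tool cover -func=coverage.out | coverage_total)
#   meets_threshold "$total" 70 || echo "coverage $total% is below 70%" >&2
```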

Project Structure

operator-chaos/
├── api/v1alpha1/              # CRD types and validation
│   ├── types.go               # ChaosExperiment CRD definition
│   └── groupversion_info.go   # API group metadata
├── cmd/
│   ├── cli/                   # CLI entrypoint
│   └── controller/            # Controller entrypoint
├── pkg/
│   ├── injection/             # Injection engine
│   │   ├── engine.go          # Registry and Injector interface
│   │   ├── podkill.go         # PodKill implementation
│   │   ├── network.go         # NetworkPartition implementation
│   │   └── ...                # Other injectors
│   ├── observer/              # Observation system
│   │   ├── board.go           # Blackboard implementation
│   │   ├── contributor.go     # Contributor interface
│   │   └── ...                # Specific contributors
│   ├── orchestrator/          # Experiment orchestration
│   │   └── lifecycle.go       # Lifecycle state machine
│   ├── evaluator/             # Verdict computation
│   ├── reporter/              # Report generation
│   ├── safety/                # Blast radius and safety checks
│   ├── model/                 # Operator knowledge and dependency graph
│   └── sdk/                   # Go SDK for client-side chaos
│       ├── client.go          # ChaosClient wrapper
│       ├── types.go           # FaultConfig types
│       └── faults/            # Fault injection primitives
├── config/
│   ├── crd/                   # CRD manifests
│   ├── controller/            # Controller deployment manifests
│   └── samples/               # Example experiments
├── experiments/               # Additional experiment examples
├── site/                      # Documentation (MkDocs)
└── Makefile                   # Build automation

Running the CLI Locally

Run an Experiment

./bin/operator-chaos run experiments/podkill-basic.yaml

Validate an Experiment

./bin/operator-chaos validate experiments/podkill-basic.yaml

List Available Injection Types

./bin/operator-chaos types

Generate Report

./bin/operator-chaos run experiments/podkill-basic.yaml --report-dir=./reports

Reports are saved as JSON files in the specified directory.
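
When several runs write to the same directory, it can help to pick out the most recent report. This is a small sketch that assumes only what is stated above (reports are .json files in the report directory). The helper name latest_report is hypothetical, and the file naming scheme is not assumed.

```shell
# latest_report [DIR]: prints the path of the most recently modified .json
# file in DIR (default ./reports); fails if the directory has none.
# Note: relies on "ls -t", so it is only suitable for simple file names.
latest_report() {
  newest=$(ls -t "${1:-./reports}"/*.json 2>/dev/null | head -n 1)
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage (pretty-printing with Python's standard json.tool module):
#   python3 -m json.tool "$(latest_report ./reports)"
```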

Running the Controller Locally

1. Set Up a Local Cluster

Using kind

kind create cluster --name chaos-test

Using Minikube

minikube start --driver=docker

2. Install CRDs

kubectl apply -f config/crd/

Verify CRD installation:

kubectl get crd chaosexperiments.chaos.operatorchaos.io
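
A CRD can exist before the API server is ready to serve it. In scripts, kubectl wait can block until the CRD reports the Established condition. The wrapper below is a sketch: wait_for_crd and the KUBECTL override variable are hypothetical names, and the 60s default timeout is an arbitrary choice.

```shell
# KUBECTL can be overridden (for example, to point at "oc" on OpenShift).
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_crd NAME [TIMEOUT]: blocks until the named CRD is Established,
# or fails once TIMEOUT (default 60s) elapses.
wait_for_crd() {
  "$KUBECTL" wait --for=condition=Established "crd/$1" --timeout="${2:-60s}"
}

# Usage:
#   wait_for_crd chaosexperiments.chaos.operatorchaos.io
```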

3. Run Controller Locally

export KUBECONFIG=~/.kube/config
./bin/operator-chaos controller start

The controller will watch for ChaosExperiment resources and reconcile them.

Controller Logs:

INFO    controller-runtime.metrics    Metrics server is starting to listen
INFO    controller-runtime.builder    Starting EventSource
INFO    controller-runtime.builder    Starting Controller
INFO    controller-runtime.controller Starting workers

4. Submit an Experiment

In another terminal:

kubectl apply -f experiments/podkill-basic.yaml

Watch experiment progress:

kubectl get chaosexperiment podkill-basic -w

View experiment status:

kubectl describe chaosexperiment podkill-basic

Running the Dashboard Locally

The dashboard is a React + Vite web UI for viewing experiment results.

1. Install Node.js Dependencies

cd dashboard/ui
npm ci

2. Start Development Server

# Terminal 1: Run the Go backend
go run ./dashboard/cmd/dashboard/ -knowledge-dir knowledge/

# Terminal 2: Run the Vite dev server (with HMR)
cd dashboard/ui && npm run dev

The Vite dev server proxies /api/ requests to the Go backend (port 8080). The dashboard will be available at http://localhost:5173.

3. Build for Production

cd dashboard/ui && npm run build

This outputs to dashboard/ui-dist/, which is embedded into the Go binary via go:embed.

Code Quality Tools

Linting

golangci-lint run

Fix auto-fixable issues:

golangci-lint run --fix

Formatting

go fmt ./...

Vet

go vet ./...
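
The individual tools above can be chained into one pre-commit check that stops at the first failure. This is a sketch, not a script shipped with the project: run_checks is a hypothetical helper, and the usage comment swaps go fmt for gofmt -l, which lists unformatted files without rewriting them.

```shell
# run_checks CMD...: runs each command string in order through sh -c and
# stops at the first one that fails, returning a non-zero status.
run_checks() {
  for step in "$@"; do
    echo "==> $step"
    sh -c "$step" || { echo "FAILED: $step" >&2; return 1; }
  done
}

# Usage (run from the repository root):
#   run_checks 'test -z "$(gofmt -l .)"' 'go vet ./...' 'golangci-lint run' 'go test ./...'
```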

Development Workflow

1. Create a Feature Branch

git checkout -b feature/my-new-feature

2. Make Changes

Edit code, add tests, update documentation.

3. Run Tests

go test ./...

4. Lint and Format

golangci-lint run
go fmt ./...

5. Commit Changes

git add .
git commit -m "feat: add new injection type"

6. Push and Open PR

git push origin feature/my-new-feature

Open a pull request on GitHub.

Testing Against a Real Cluster

1. Deploy Target Operators

Deploy the operators you want to test. For example, to test with OpenDataHub, follow the ODH installation guide.

2. Apply Operator Knowledge

kubectl apply -f config/knowledge/opendatahub-operators.yaml

3. Run Experiments

kubectl apply -f experiments/odh-controller-resilience.yaml

4. View Results

kubectl get chaosexperiment -A
kubectl get configmap -l app.kubernetes.io/managed-by=operator-chaos

Debugging

Enable Debug Logging

Set CHAOS_LOG_LEVEL=debug when running the controller or CLI:

export CHAOS_LOG_LEVEL=debug
./bin/operator-chaos controller start

Inspect Chaos-Managed Resources

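List the chaos-managed resources that kubectl get all covers. That command returns only a subset of kinds (mainly workloads and Services), so kinds such as NetworkPolicy and ConfigMap must be queried separately: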

kubectl get all -A -l chaos.operatorchaos.io/managed=true
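
To search every resource type for the label, one option is to enumerate the listable namespaced types and query each in turn, which also picks up kinds that the all alias leaves out. The function below is a sketch. list_labeled and the KUBECTL override are hypothetical names, and it assumes chaos-managed objects of every kind carry the label shown above.

```shell
# KUBECTL can be overridden (for example, to point at "oc" on OpenShift).
KUBECTL="${KUBECTL:-kubectl}"

# list_labeled SELECTOR: queries every listable, namespaced resource type
# across all namespaces and prints the objects matching the label selector.
list_labeled() {
  selector="$1"
  "$KUBECTL" api-resources --verbs=list --namespaced -o name |
  while read -r kind; do
    # stdin is redirected so the inner command cannot consume the kind list.
    "$KUBECTL" get "$kind" -A -l "$selector" \
      --ignore-not-found --show-kind --no-headers </dev/null
  done
}

# Usage:
#   list_labeled chaos.operatorchaos.io/managed=true
```

This issues one API request per resource type, so it can take noticeably longer than a single kubectl get on clusters with many CRDs installed.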

View Rollback Annotations

kubectl get networkpolicy operator-chaos-np-app-redis -o yaml | grep -A 5 annotations

Controller Restart Recovery

To test crash-safe cleanup, kill the controller mid-experiment and restart it:

# Kill controller
pkill operator-chaos

# Restart
./bin/operator-chaos controller start

The controller should detect in-progress experiments and clean them up via Revert().

Common Issues

"CRD not found"

Solution: Install CRDs:

kubectl apply -f config/crd/

"Permission denied" errors from controller

Solution: Apply RBAC manifests:

kubectl apply -f config/controller/rbac.yaml

Experiments stuck in "Pending"

Solution: Check controller logs for validation errors:

kubectl logs -n default deploy/chaos-controller

TTL cleanup not working

Solution: Ensure the cleanup controller is running:

kubectl get pod -n default -l app=chaos-cleanup-controller

Next Steps