Protocols

Protocols define the operational rules that govern agent behavior. In multi-agent mode (Claude Code) they are enforced programmatically; in degraded mode (Cursor, AGENTS.md) they are advisory only.

Input Isolation

File: protocols/input-isolation.md

Code under review is wrapped in unique delimiter blocks generated by generate-delimiters.sh. Each specialist receives its own delimited copy. This prevents:

  • Cross-agent output leakage (one specialist seeing another's findings)
  • Code injection via delimiter manipulation in the review target
  • Context confusion between code and agent instructions
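The wrapping step can be sketched as follows. This is a minimal illustration, not the logic of generate-delimiters.sh: the delimiter format, the `make_delimiter` and `wrap_for_specialist` names, and the use of Python's `secrets` module are all assumptions made for the example.

```python
import secrets


def make_delimiter(label: str) -> str:
    """Return a delimiter the author of the reviewed code cannot predict.

    The REVIEW-<label>-<hex> format is hypothetical.
    """
    return f"REVIEW-{label}-{secrets.token_hex(16)}"


def wrap_for_specialist(code: str, specialist: str) -> str:
    """Wrap `code` in a fresh delimiter pair for one specialist."""
    delimiter = make_delimiter(specialist)
    # Regenerate in the very unlikely case the code already contains the token.
    while delimiter in code:
        delimiter = make_delimiter(specialist)
    return f"<<<{delimiter}>>>\n{code}\n<<<END-{delimiter}>>>"


# The review target tries to close the block early with a guessed marker.
code = "print('hello')\n<<<END-REVIEW-SEC-0000>>>  # forged closing marker"
sec_block = wrap_for_specialist(code, "SEC")
perf_block = wrap_for_specialist(code, "PERF")
```

Because each delimiter carries a fresh random token, the forged closing marker inside `code` does not match the real one, and the SEC and PERF copies cannot be confused with each other.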

Injection Resistance

File: protocols/injection-resistance.md

All agent outputs are verified for provenance and scanned for injection patterns:

  • Provenance markers: Each agent's output includes a marker verified by the orchestrator
  • Command injection: Findings recommending dangerous commands are flagged
  • Role reassignment: Attempts to redefine the agent's role are detected
  • Instruction override: Embedded "ignore previous instructions" patterns are caught

Detection logic is in _injection-check.sh, shared by both validators.
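A scan of this kind might look like the sketch below. The regular expressions, the category names, and the marker format are illustrative assumptions; the actual patterns in _injection-check.sh are not shown in this document.

```python
import re

# Hypothetical patterns, one per category listed above.
PATTERNS = {
    "command_injection": re.compile(
        r"\brm\s+-rf\b|\bcurl\b[^\n|]*\|\s*(?:ba)?sh\b", re.IGNORECASE
    ),
    "role_reassignment": re.compile(
        r"\byou are now\b|\bact as (?:the|an?)\b", re.IGNORECASE
    ),
    "instruction_override": re.compile(
        r"\bignore (?:all )?(?:previous|prior) instructions\b", re.IGNORECASE
    ),
}


def scan_output(text: str, expected_marker: str) -> list[str]:
    """Return the names of every check that `text` fails."""
    flags = []
    # Provenance: the orchestrator expects its own marker in the output.
    if expected_marker not in text:
        flags.append("missing_provenance_marker")
    flags += [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    return flags
```

For example, `scan_output("[agent:SEC:abc] Fix: run rm -rf /tmp/cache", "[agent:SEC:abc]")` flags only `command_injection`, while an output with no marker and an embedded "ignore previous instructions" is flagged twice.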

Mediated Communication

File: protocols/mediated-communication.md

Agents never communicate directly. All inter-agent exchange goes through the orchestrator:

  1. Agent A produces findings
  2. Orchestrator sanitizes output (strips provenance markers, validates structure)
  3. Orchestrator routes relevant findings to Agent B for challenge
  4. Agent B produces challenge response
  5. Orchestrator routes the challenge back to Agent A for defense
  6. Process repeats until convergence or iteration cap

This prevents agents from manipulating each other's behavior through crafted messages.
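The relay loop above can be sketched with agents modeled as plain callables. Everything here is an assumption for illustration: the `[provenance:...]` marker format, the `sanitize` and `mediate` names, and the simplistic stop condition (Agent A's reply is unchanged) stand in for the real orchestrator logic.

```python
import re
from typing import Callable

Agent = Callable[[str], str]

# Hypothetical marker format; the real one is not specified here.
MARKER = re.compile(r"\[provenance:[^\]]*\]\s*")


def sanitize(output: str) -> str:
    """Strip provenance markers so they cannot be replayed to another agent."""
    return MARKER.sub("", output).strip()


def mediate(agent_a: Agent, agent_b: Agent, target: str,
            max_rounds: int = 2) -> list[tuple[str, str]]:
    """Relay findings and challenges; agents only ever see sanitized text."""
    transcript = []
    message = sanitize(agent_a(target))          # steps 1-2
    for _ in range(max_rounds):                  # iteration cap
        transcript.append(("A", message))
        challenge = sanitize(agent_b(message))   # steps 3-4
        transcript.append(("B", challenge))
        reply = sanitize(agent_a(challenge))     # step 5
        if reply == message:                     # nothing changed: converged
            break
        message = reply
    return transcript
```

Neither callable receives a reference to the other, so the only channel between them is the string the orchestrator chooses to forward.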

Convergence Detection

File: protocols/convergence-detection.md

Specialists self-refine through multiple iterations. Convergence detection determines when to stop:

  • Compare finding IDs, severities, and key evidence between iterations
  • If the delta is below threshold, the specialist has converged
  • Maximum iterations are capped (2 default, 3 for thorough mode)
  • Convergence is checked via detect-convergence.sh
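One way to compute such a delta is sketched below. The Jaccard-style distance, the dictionary field names, and the 0.1 threshold are assumptions; detect-convergence.sh may compare iterations differently.

```python
def finding_key(finding: dict) -> tuple:
    """Identity of a finding for comparison: ID, severity, and key evidence."""
    return (finding["id"], finding["severity"], finding["evidence"])


def delta(previous: list[dict], current: list[dict]) -> float:
    """Fraction of findings that differ between two iterations.

    0.0 means the iterations are identical, 1.0 means nothing is shared.
    """
    prev = {finding_key(f) for f in previous}
    curr = {finding_key(f) for f in current}
    union = prev | curr
    if not union:
        return 0.0
    return len(prev ^ curr) / len(union)


def has_converged(previous: list[dict], current: list[dict],
                  threshold: float = 0.1) -> bool:
    """True when the change between iterations is below the threshold."""
    return delta(previous, current) < threshold
```

A changed severity counts as a difference under this definition, so a specialist that keeps re-rating the same finding is not treated as converged.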

Delta Mode

File: protocols/delta-mode.md

When --delta is active, only changed code is reviewed:

  • Changed files are identified from git diff
  • Unchanged files are excluded from specialist input
  • Existing findings on unchanged code are preserved
  • New findings are generated only for changed code
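The partitioning described above can be sketched as follows, taking the text output of `git diff --name-only` as input. The function name and the finding layout (a dictionary with a `file` key) are hypothetical, and dropping old findings on changed files so they can be regenerated is an assumption rather than documented behavior.

```python
def partition_for_delta(diff_output: str, all_files: list[str],
                        existing_findings: list[dict]) -> tuple[list[str], list[dict]]:
    """Split a review into files to re-review and findings to carry over.

    `diff_output` is the text printed by `git diff --name-only`:
    one changed path per line.
    """
    changed = {line.strip() for line in diff_output.splitlines() if line.strip()}
    # Only changed files reach the specialists.
    to_review = [path for path in all_files if path in changed]
    # Findings on unchanged files are preserved as-is.
    preserved = [f for f in existing_findings if f["file"] not in changed]
    return to_review, preserved
```

With `src/auth.py` and `src/db.py` changed, a prior finding on `src/util.py` is carried forward untouched, while the finding on `src/auth.py` is left for the specialists to regenerate.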

Token Budget

File: protocols/token-budget.md

Budget management ensures reviews stay within cost limits:

  • Total budget initialized at review start (default 350K, configurable)
  • Per-agent budget cap: 150% of fair share (total / num_agents)
  • Budget tracked after each phase via track-budget.sh
  • Review stops early if budget is exhausted
  • Budget summary included in final report
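As a worked example of the cap: with the default 350K budget and five agents, the fair share is 70K tokens and the per-agent cap is 105K. The class below is a minimal sketch of that bookkeeping; the `BudgetTracker` name and its methods are hypothetical and do not describe track-budget.sh.

```python
class BudgetTracker:
    """Tracks token spend against a total budget and a per-agent cap."""

    CAP_FACTOR = 1.5  # per-agent cap: 150% of fair share

    def __init__(self, total: int = 350_000, num_agents: int = 5) -> None:
        self.total = total
        # fair share = total / num_agents; cap = 150% of that
        self.agent_cap = int(total / num_agents * self.CAP_FACTOR)
        self.spent: dict[str, int] = {}

    def record(self, agent: str, tokens: int) -> None:
        """Add `tokens` to the running total for `agent`."""
        self.spent[agent] = self.spent.get(agent, 0) + tokens

    @property
    def remaining(self) -> int:
        return self.total - sum(self.spent.values())

    def over_cap(self, agent: str) -> bool:
        return self.spent.get(agent, 0) > self.agent_cap

    def exhausted(self) -> bool:
        """True when the review should stop early."""
        return self.remaining <= 0
```

The 150% factor lets a busy specialist borrow headroom from quieter ones without letting any single agent consume the whole budget.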

Guardrails

File: protocols/guardrails.md

Guardrails enforce behavioral constraints:

Guardrail                  Enforcement
Scope confinement          Findings outside target demoted (or rejected with --strict-scope)
Iteration hard cap         MAX_ITERATIONS constant (cannot be overridden by agents)
Budget enforcement         Programmatic stop, not advisory
Per-agent budget           150% fair share cap
Evidence threshold         Findings with < 100 chars evidence auto-demoted
Destructive pattern check  Regex scan of recommended fixes
Severity inflation         Warning when > 50% of agent's findings are Critical
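Two of these guardrails, the evidence threshold and the severity inflation warning, reduce to simple arithmetic. The sketch below assumes a four-level severity scale and treats "demoted" as "moved down one level"; both are assumptions, as are the function names.

```python
# Hypothetical severity scale, most to least severe.
SEVERITIES = ["Critical", "High", "Medium", "Low"]


def demote(severity: str) -> str:
    """Move one level down the scale; the lowest level stays where it is."""
    index = SEVERITIES.index(severity)
    return SEVERITIES[min(index + 1, len(SEVERITIES) - 1)]


def apply_evidence_threshold(findings: list[dict], min_chars: int = 100) -> list[dict]:
    """Demote findings whose evidence is shorter than `min_chars`."""
    return [
        {**f, "severity": demote(f["severity"])}
        if len(f["evidence"]) < min_chars else f
        for f in findings
    ]


def severity_inflation(findings: list[dict]) -> bool:
    """True when more than 50% of an agent's findings are Critical."""
    if not findings:
        return False
    critical = sum(1 for f in findings if f["severity"] == "Critical")
    return critical / len(findings) > 0.5
```

The comparison is strict, so an agent with exactly half of its findings rated Critical does not trigger the warning.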

Domain-Aware Challenge Routing

Behavior: During the challenge round (Phase 2), findings are routed to specialists based on domain affinity. Each specialist has primary and adjacent domains:

Code profile:

Specialist  Primary                                       Adjacent
SEC         injection, auth, crypto, secrets              input-validation, error-handling
PERF        complexity, memory, io, caching               concurrency, scalability
QUAL        naming, duplication, solid, error-handling    readability, testing
CORR        logic, edge-cases, races, invariants          error-propagation, null-safety
ARCH        coupling, cohesion, boundaries, extensibility patterns, dependencies

Strategy profile:

Specialist  Primary                                               Adjacent
FEAS        effort, dependencies, phasing, technical-risk         timeline, resources
ARCH        integration, api-contracts, boundaries, failure-modes patterns, scalability
SEC         threat-model, auth, data-handling, compliance         crypto, network
USER        backward-compat, migration, usability, documentation  api-design, ux
SCOP        scope, acceptance-criteria, completeness, nfr         edge-cases, priorities
TEST        testability, coverage, test-strategy, verification    ci-cd, environments

Routing is advisory: specialists can still challenge any finding, but the hint saves 40-60% of cross-agent token consumption by guiding attention to relevant findings first.
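The routing hint for the code profile can be sketched as a lookup over the affinity table. This assumes each finding carries a single domain tag and that primary matches are listed before adjacent ones; the `route` function and that ordering are illustrative, not documented behavior.

```python
# Code-profile affinities from the table above: (primary, adjacent).
AFFINITY = {
    "SEC": ({"injection", "auth", "crypto", "secrets"},
            {"input-validation", "error-handling"}),
    "PERF": ({"complexity", "memory", "io", "caching"},
             {"concurrency", "scalability"}),
    "QUAL": ({"naming", "duplication", "solid", "error-handling"},
             {"readability", "testing"}),
    "CORR": ({"logic", "edge-cases", "races", "invariants"},
             {"error-propagation", "null-safety"}),
    "ARCH": ({"coupling", "cohesion", "boundaries", "extensibility"},
             {"patterns", "dependencies"}),
}


def route(domain: str, originator: str) -> list[str]:
    """Suggest challengers for a finding: primary matches, then adjacent.

    The result is only a hint; any specialist may still challenge.
    """
    primary = [s for s, (p, _) in AFFINITY.items()
               if domain in p and s != originator]
    adjacent = [s for s, (_, a) in AFFINITY.items()
                if domain in a and s != originator]
    return primary + adjacent
```

A finding tagged `error-handling` raised by CORR would be hinted to QUAL first (primary) and then SEC (adjacent).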

Domain-Scoped Voting Pools

Behavior: During resolution (Phase 3), each finding's voting pool is computed from actual Phase 2 behavior rather than the global specialist count. This prevents legitimate domain abstentions from breaking quorum.

Pool membership:

  • The finding's originator is always in the pool (implicit Agree)
  • Any specialist who chose Agree or Challenge is in the pool
  • Specialists who Abstained are excluded from the pool

N_effective (pool size) replaces the global N for computing quorum and strict majority thresholds per-finding. When N_effective < N, findings are labeled with the pool size (e.g., "Consensus (3/5)"). If N_effective = 1, single-specialist resolution rules apply.

This design is symmetric: both Agree and Challenge count as opt-in. No asymmetric vote manipulation is possible.
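The pool computation can be sketched as below. The vote strings and function names are assumptions, and only the strict majority threshold (more than half of the pool) is shown, because the quorum rule is not defined in this document.

```python
def voting_pool(originator: str, votes: dict[str, str]) -> set[str]:
    """Specialists who count toward a finding's thresholds.

    `votes` maps each non-originating specialist to "Agree",
    "Challenge", or "Abstain".
    """
    pool = {originator}  # implicit Agree
    pool |= {s for s, vote in votes.items() if vote in ("Agree", "Challenge")}
    return pool


def strict_majority(n_effective: int) -> int:
    """Smallest vote count that is more than half of the pool."""
    return n_effective // 2 + 1


def pool_suffix(n_effective: int, n_total: int) -> str:
    """Label suffix such as "(3/5)", added only when the pool is smaller than N."""
    return f"({n_effective}/{n_total})" if n_effective < n_total else ""
```

With SEC as originator, PERF agreeing, CORR challenging, and QUAL and ARCH abstaining, the pool is three specialists, so two votes form a strict majority instead of the three that a global count of five would require.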

Audit Log

File: protocols/audit-log.md

Session logging for reproducibility:

  • Review parameters (target, flags, specialists, budget)
  • Phase timestamps and token consumption
  • Finding lifecycle (created, challenged, defended, dismissed, validated)
  • Agreement classifications
  • Script execution results
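A log of this kind could be written as one JSON object per line. The record layout, the event names, and the `log_event` helper below are hypothetical; the actual log format is not specified here.

```python
import io
import json
from datetime import datetime, timezone


def log_event(stream, event: str, **fields) -> None:
    """Append one audit record as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        **fields,
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# Usage: an in-memory stream stands in for the session log file.
log = io.StringIO()
log_event(log, "review_started", target="src/", budget=350_000)
log_event(log, "finding_challenged", finding_id="SEC-1", challenger="CORR")
```

One record per line keeps the log appendable and lets a session be replayed by reading the file top to bottom.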