Domains¶
The domain analysis framework provides pluggable analysis capabilities on top of the code property graph.
Registered Domains¶
| Domain | Languages | Dependencies | Queries |
|---|---|---|---|
| security | Go, Python, TypeScript, Rust | None | 12 rules (CGA-003 to CGA-014) |
| testing | Go | security | 4 rules (CGA-T01 to CGA-T04) |
| upgrade | Go | None | 4 rules (CGA-U01 to CGA-U04) |
| architecture | Python | None | 4 rules (CGA-A01 to CGA-A04) |
| netpolicy | Go, Python | None | 2 rules (CGA-N01 to CGA-N02) |
Additionally, the base query engine provides 2 core queries (CGA-001, CGA-002) that run independently of domains.
Domain Architecture¶
Each domain consists of three components:
flowchart LR
CPG["Code Property\nGraph"] --> ANN["Annotator\n(marks nodes)"]
ANN --> QUERY["Queries\n(traverse graph)"]
QUERY --> FIND["Findings\n(with file:line)"]
classDef domain fill:#9b59b6,stroke:#8e44ad,color:#fff
class ANN,QUERY domain
- Annotator: Walks CPG nodes and adds domain-specific metadata (annotations). Each language has its own annotator (e.g., `go_annotator.go`, `python_annotator.go`).
- Queries: Traverse the annotated graph looking for patterns. Queries can use typed node fields (complexity, trust level), edge confidence, and annotations.
- Findings: Results with file:line references, severity, and optional architecture cross-references.
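The flow from annotator to queries to findings can be sketched in Go as follows. This is a minimal illustration rather than the project's code: the `Node` and `Finding` types, the `annotate` and `queryUnauthenticated` functions, the `demo:` annotation names, and the `DEMO-001` query ID are all hypothetical, and the name-based heuristic stands in for real graph analysis.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, simplified CPG node.
type Node struct {
	Name        string
	File        string
	Line        int
	Annotations map[string]bool
}

// Finding is a hypothetical result record with a file:line reference.
type Finding struct {
	QueryID  string
	Severity string
	Message  string
	Location string
}

// annotate plays the annotator role: it walks nodes and attaches metadata.
// The name-based rules are purely illustrative.
func annotate(nodes []*Node) {
	for _, n := range nodes {
		if n.Annotations == nil {
			n.Annotations = map[string]bool{}
		}
		if strings.HasPrefix(n.Name, "Handle") {
			n.Annotations["demo:handles_request"] = true
		}
		if strings.Contains(n.Name, "Auth") {
			n.Annotations["demo:checks_auth"] = true
		}
	}
}

// queryUnauthenticated plays the query role: it reads annotations and
// emits one finding per request handler that lacks an auth annotation.
func queryUnauthenticated(nodes []*Node) []Finding {
	var out []Finding
	for _, n := range nodes {
		if n.Annotations["demo:handles_request"] && !n.Annotations["demo:checks_auth"] {
			out = append(out, Finding{
				QueryID:  "DEMO-001",
				Severity: "High",
				Message:  fmt.Sprintf("handler %s has no auth annotation", n.Name),
				Location: fmt.Sprintf("%s:%d", n.File, n.Line),
			})
		}
	}
	return out
}

func main() {
	nodes := []*Node{
		{Name: "HandleUpload", File: "api/upload.go", Line: 42},
		{Name: "HandleAuthLogin", File: "api/login.go", Line: 10},
	}
	annotate(nodes)
	for _, f := range queryUnauthenticated(nodes) {
		fmt.Printf("%s [%s] %s (%s)\n", f.QueryID, f.Severity, f.Message, f.Location)
	}
}
```

Keeping annotation and querying as separate passes means a query only reads metadata, so several queries can share the work done by one annotator.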
Base Queries¶
These run independently of domains via the query engine (pkg/query/engine.go):
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-001 | Missing Auth | High | HTTP handlers or entrypoints without authentication middleware |
| CGA-002 | Cross-Storage Taint | High | Tainted data flowing from user input to storage sinks |
Security Domain¶
Annotator¶
Multi-language annotators (go_annotator.go, python_annotator.go, typescript_annotator.go, rust_annotator.go) mark CPG nodes with security-relevant metadata:
- Source annotations: `handles_user_input`, `sec:handles_request`, `sec:deserializes_input`
- Sink annotations: `sec:executes_sql`, `sec:subprocess_call`, `sec:command_execution`, `sec:renders_html`, `sec:template_render`, `sec:file_access`, `sec:eval_usage`
- Auth annotations: Authentication/authorization function calls
- Secret annotations: Potential hardcoded credentials
Queries¶
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-003 | webhook-missing-update | High | Webhooks intercepting CREATE but not UPDATE operations |
| CGA-004 | rbac-precedence-bug | High | Conflicting RBAC rules across bindings |
| CGA-005 | cert-as-ca | High | Certificate used as CA without proper validation |
| CGA-006 | cross-namespace-secret | High | Secret access crossing namespace boundaries |
| CGA-007 | unfiltered-cache | Medium | Watched types without cache filters (OOM risk) |
| CGA-008 | plaintext-secrets | Medium | Hardcoded secrets or credentials in source |
| CGA-009 | weak-serial-entropy | Medium | Weak randomness in security-sensitive contexts |
| CGA-010 | complexity-hotspot | Medium | High-complexity functions with security annotations |
| CGA-011 | untrusted-endpoint | Info | HTTP endpoints without recognized auth middleware |
| CGA-012 | unprotected-ingress | High | Ingress routes without TLS or auth configuration |
| CGA-013 | overprivileged-secret-access | Medium | Broad secret access beyond operational needs |
| CGA-014 | uncontrolled-egress | Medium | Outbound connections without network policy |
Testing Domain¶
Annotator¶
Marks nodes with testing metadata:
- Test function detection (`Test*`, `Benchmark*`)
- Mock/fake usage patterns (interface mocks, fake clients)
- Table-driven test detection
Queries¶
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-T01 | untested-security-func | Medium | Security-annotated functions without corresponding test functions |
| CGA-T02 | fake-only-integration | Low | Integration tests using only fakes/mocks, no real dependencies |
| CGA-T03 | missing-error-paths | Medium | Error return paths without test coverage |
| CGA-T04 | consolidation-opportunity | Low | Duplicate test patterns that could be consolidated |
Upgrade Domain¶
Annotator¶
Marks nodes with upgrade-relevant metadata:
- Deprecated API version usage (v1beta1, v1alpha1)
- Feature gate references
- Version-dependent code paths
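Detecting pre-release API versions such as v1beta1 or v1alpha1 comes down to a pattern match on the version string. The sketch below assumes the version arrives in Kubernetes `group/version` form; the `isPreRelease` helper and its regular expression are illustrative and not taken from the annotator.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// preRelease matches versions such as v1alpha1 or v2beta3, but not v1.
var preRelease = regexp.MustCompile(`^v[0-9]+(alpha|beta)[0-9]+$`)

// isPreRelease reports whether an apiVersion string ("group/version" or a
// bare "version") refers to an alpha or beta API version.
func isPreRelease(apiVersion string) bool {
	version := apiVersion
	if i := strings.LastIndex(apiVersion, "/"); i >= 0 {
		version = apiVersion[i+1:]
	}
	return preRelease.MatchString(version)
}

func main() {
	for _, v := range []string{"apiextensions.k8s.io/v1beta1", "apps/v1", "v1alpha1"} {
		fmt.Printf("%-30s pre-release=%v\n", v, isPreRelease(v))
	}
}
```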
Queries¶
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-U01 | unconverted-crd | Medium | CRDs still using v1beta1 when v1 is available |
| CGA-U02 | pre-release-api-usage | Low | Usage of alpha/beta Kubernetes APIs |
| CGA-U03 | ungated-feature | Medium | Features without feature gate protection |
| CGA-U04 | unchecked-version-access | Low | Version-dependent code without version checks |
Architecture Domain¶
Annotator¶
Python annotator marks CPG nodes with structural metadata:
- Abstract bases: Classes inheriting from `ABC` or with `Base`/`Abstract` name prefix
- Implementations: Classes with non-empty `BaseClasses` (inheritance chain)
- Factory methods: Functions containing 2+ class instantiations via data flow edges
- SDK clients: Call sites matching known SDK patterns (openai, boto3, chromadb, elasticsearch, etc.)
Queries¶
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-A01 | abstraction-layers | Info | Surfaces class hierarchies with abstract bases and their implementations |
| CGA-A02 | external-api-surface | Info | Functions using external SDK clients (openai, boto3, chromadb, etc.) |
| CGA-A03 | factory-dispatch | Info | Factory functions dispatching to multiple implementation types |
| CGA-A04 | unimplemented-interface | Low | Abstract bases with no implementations found in analyzed sources |
Example output (feast)¶
CGA-A01: Abstraction layer: OfflineStore has 16 implementations: BigQueryOfflineStore, ...
CGA-A01: Abstraction layer: ComputeEngine has 6 implementations: Lambda, K8s, Local, Ray, ...
CGA-A02: Function _get_client calls external SDK: Elasticsearch
CGA-A02: Function _connect calls external SDK: MilvusClient
CGA-A03: Factory function get_online_store dispatches to: RedisOnlineStore, DynamoDBOnlineStore
CGA-A04: Abstract base StreamProcessor has no implementations found in analyzed sources
Network Policy Domain¶
How it works¶
The netpolicy domain traces NetworkPolicy trust chains by combining NetworkPolicies extracted from YAML with those created programmatically in Go source. It identifies:
- Bare namespaceSelectors: NetworkPolicies that allow ingress from matching namespaces without podSelector or port restrictions
- Tenant namespace reach: namespaceSelector labels applied to namespaces running tenant workloads (notebooks, pipelines, model serving) that can reach control plane services
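The bare namespaceSelector check can be sketched as below. The `Peer` and `IngressRule` types are hypothetical, simplified stand-ins for the Kubernetes NetworkPolicy ingress rule (a `from` list of peers plus a `ports` list), and the sketch assumes that "bare" means neither a podSelector nor a port restriction is present.

```go
package main

import "fmt"

// Peer is a hypothetical, simplified stand-in for a NetworkPolicy ingress
// peer. A nil selector means the field is absent from the policy.
type Peer struct {
	NamespaceSelector map[string]string
	PodSelector       map[string]string
}

// IngressRule is a hypothetical, simplified stand-in for one ingress rule.
type IngressRule struct {
	From  []Peer
	Ports []int
}

// bareNamespacePeers returns the peers that select namespaces without a
// podSelector, in a rule that also carries no port restriction.
func bareNamespacePeers(rule IngressRule) []Peer {
	if len(rule.Ports) > 0 {
		return nil // the rule is at least port-restricted
	}
	var bare []Peer
	for _, p := range rule.From {
		if p.NamespaceSelector != nil && p.PodSelector == nil {
			bare = append(bare, p)
		}
	}
	return bare
}

func main() {
	// The "team: tenant" label is made up for the example.
	open := IngressRule{
		From: []Peer{{NamespaceSelector: map[string]string{"team": "tenant"}}},
	}
	restricted := IngressRule{
		From:  []Peer{{NamespaceSelector: map[string]string{"team": "tenant"}}},
		Ports: []int{8443},
	}
	fmt.Println("open rule, bare peers:", len(bareNamespacePeers(open)))
	fmt.Println("port-restricted rule, bare peers:", len(bareNamespacePeers(restricted)))
}
```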
Queries¶
| Query ID | Name | Severity | Description |
|---|---|---|---|
| CGA-N01 | netpol-bare-namespace-selector | High | NetworkPolicy allows ingress via namespaceSelector without podSelector or port restrictions |
| CGA-N02 | netpol-tenant-reach | High | Tenant workload namespaces can reach control plane services |
Example output (opendatahub-operator)¶
CGA-N01: NetworkPolicy "applications-namespace" allows ingress from namespaces matching
labels.ODH.OwnedNamespace=labels.True without podSelector or port restrictions.
(internal/controller/dscinitialization/utils.go:160)
CGA-N02: NetworkPolicy "security-dashboard" allows ingress from tenant namespace.
Restriction: ports. Tenant workloads can reach control plane services.
Using Domains¶
# List registered domains
arch-analyzer domains
# Run all domains
arch-analyzer scan /path/to/repo
# Run specific domains
arch-analyzer scan /path/to/repo --domains security,testing
# With architecture enrichment
arch-analyzer scan /path/to/repo --domains security --with-arch
# With external SARIF findings
arch-analyzer scan /path/to/repo --import-sarif gosec.sarif,semgrep.sarif
Domain Orchestrator¶
The orchestrator (pkg/domains/orchestrator.go) handles domain execution:
- Reads domain dependencies via `Dependencies()` method
- Performs topological sort to determine execution order (e.g., testing depends on security)
- Runs annotators in dependency order
- Runs taint analysis (if data flow edges are present)
- Runs queries after all annotators complete
- Collects findings grouped by domain
Domains without dependencies can run their annotators in parallel.
Adding a Custom Domain¶
A domain needs:
- `analyzer.go`: Implements `DomainAnalyzer` interface
- `annotations.go`: Defines annotation types
- Language-specific annotators (e.g., `go_annotator.go`)
- `queries.go`: Implements query functions returning `[]query.Finding`
- Registration in `pkg/domains/registry.go`
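A rough, self-contained sketch of how these pieces might fit together is shown below. Only the `DomainAnalyzer` name and the `Dependencies()` method come from this page; the other interface methods, the `Graph` and `Finding` types, the `Register` function, and the toy `logging` domain are assumptions made for illustration, so check the actual interface in the repository before copying anything.

```go
package main

import "fmt"

// Finding and Graph are hypothetical placeholders for the project's types.
type Finding struct {
	QueryID string
	Message string
}

type Graph struct {
	FuncNames []string
}

// DomainAnalyzer is an assumed shape. Only the interface name and the
// Dependencies method are mentioned in the documentation.
type DomainAnalyzer interface {
	Name() string
	Dependencies() []string
	Annotate(g *Graph)
	Queries(g *Graph) []Finding
}

// registry is a hypothetical stand-in for pkg/domains/registry.go.
var registry = map[string]DomainAnalyzer{}

// Register adds a domain to the hypothetical registry under its name.
func Register(d DomainAnalyzer) { registry[d.Name()] = d }

// loggingDomain is a toy domain that flags functions named "Print".
type loggingDomain struct{ hits []string }

func (d *loggingDomain) Name() string           { return "logging" }
func (d *loggingDomain) Dependencies() []string { return nil }

// Annotate records which functions match the toy rule.
func (d *loggingDomain) Annotate(g *Graph) {
	for _, n := range g.FuncNames {
		if n == "Print" {
			d.hits = append(d.hits, n)
		}
	}
}

// Queries turns the recorded matches into findings.
func (d *loggingDomain) Queries(g *Graph) []Finding {
	var out []Finding
	for _, h := range d.hits {
		out = append(out, Finding{QueryID: "DEMO-L01", Message: "direct call to " + h})
	}
	return out
}

func main() {
	Register(&loggingDomain{})
	g := &Graph{FuncNames: []string{"Print", "Save"}}
	d := registry["logging"]
	d.Annotate(g)
	fmt.Println(d.Queries(g))
}
```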