Adding Extractors¶
This guide covers adding a new architecture extractor to the analyzer.
Extractor types¶
There are three categories:
- YAML extractors: Parse Kubernetes manifests (most extractors)
- Go source extractors: Parse Go code with tree-sitter
- File extractors: Parse specific file formats (Dockerfiles, Helm, go.mod)
Step 1: Define the data type¶
Add your extracted data type to pkg/extractor/types.go:
type MyResource struct {
	Name       string   `json:"name"`
	Namespace  string   `json:"namespace,omitempty"`
	Properties []string `json:"properties"`
	SourceFile string   `json:"source_file"`
}
Add the field to ComponentArchitecture:
type ComponentArchitecture struct {
	// ... existing fields
	MyResources []MyResource `json:"my_resources,omitempty"`
}
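To see how the struct tags above shape the analyzer's JSON output, here is a minimal, standalone sketch. It copies the MyResource type from this guide; the toJSON helper and the sample values are assumptions made for illustration and are not part of the analyzer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyResource mirrors the example type from this guide.
type MyResource struct {
	Name       string   `json:"name"`
	Namespace  string   `json:"namespace,omitempty"`
	Properties []string `json:"properties"`
	SourceFile string   `json:"source_file"`
}

// toJSON is a hypothetical helper used only for this illustration.
func toJSON(r MyResource) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Namespace is left empty, so the omitempty tag drops the key entirely.
	fmt.Println(toJSON(MyResource{
		Name:       "example",
		Properties: []string{"a"},
		SourceFile: "config/my-resource.yaml",
	}))
	// prints {"name":"example","properties":["a"],"source_file":"config/my-resource.yaml"}
}
```

Fields without omitempty are always emitted, so a nil Properties slice would serialize as null rather than disappear.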
Step 2: Write the extractor¶
Create pkg/extractor/my_resource.go:
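package extractor

import (
	"log"
	"path/filepath"
)

// extractMyResources collects MyResource entries from matching YAML files
// under repoPath and records the repository-relative source path of each.
func extractMyResources(repoPath string) ([]MyResource, error) {
	var results []MyResource

	// Find relevant files
	files, err := findYAMLFiles(repoPath, "**/my-resource*.yaml")
	if err != nil {
		return nil, err
	}

	for _, file := range files {
		// Parse YAML
		resources, err := parseMyResourceYAML(file)
		if err != nil {
			// Log a warning and skip this file; don't fail the extraction
			log.Printf("WARNING: skipping %s: %v", file, err)
			continue
		}

		relPath, err := filepath.Rel(repoPath, file)
		if err != nil {
			relPath = file // fall back to the path as discovered
		}
		for _, r := range resources {
			r.SourceFile = relPath
			results = append(results, r)
		}
	}
	return results, nil
}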
Step 3: Register in ExtractAll¶
Add your extractor to pkg/extractor/extract.go:
func ExtractAll(repoPath string) (*ComponentArchitecture, error) {
	arch := &ComponentArchitecture{}

	// ... existing extractors

	// My resources
	myResources, err := extractMyResources(repoPath)
	if err != nil {
		log.Printf("WARNING: my-resource extraction failed: %v", err)
	}
	arch.MyResources = myResources

	return arch, nil
}
Resilience
An extractor failure should produce a logged warning, not a fatal error: ExtractAll logs the error and continues, so a failed extractor does not prevent other extractors from running.
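The log-and-continue behavior can be sketched as a small loop. This is only an illustration: the namedExtractor type, the runAll function, and the errExample value are hypothetical, and the real ExtractAll calls each extractor inline as shown in Step 3.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// namedExtractor pairs a label with an extraction step (hypothetical type).
type namedExtractor struct {
	name string
	run  func(repoPath string) error
}

// errExample is a stand-in failure used by the demonstration below.
var errExample = errors.New("example failure")

// runAll runs every extractor, logs a warning for each failure, and
// returns the names of the extractors that failed.
func runAll(repoPath string, extractors []namedExtractor) []string {
	var failed []string
	for _, e := range extractors {
		if err := e.run(repoPath); err != nil {
			log.Printf("WARNING: %s extraction failed: %v", e.name, err)
			failed = append(failed, e.name)
			continue // keep going: one failure must not stop the rest
		}
	}
	return failed
}

func main() {
	ran := 0
	failed := runAll("/path/to/repo", []namedExtractor{
		{"my-resource", func(string) error { return errExample }},
		{"other", func(string) error { ran++; return nil }},
	})
	fmt.Println(failed, ran) // prints [my-resource] 1
}
```

The second extractor still runs after the first one fails, which is the property the guideline asks for.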
Step 4: Add a renderer (optional)¶
If your data needs visualization, create pkg/renderer/my_resource.go:
func renderMyResources(arch *ComponentArchitecture) string {
	var out strings.Builder // requires the "strings" import
	// Build Mermaid diagram, markdown table, etc. from arch.MyResources
	return out.String()
}
Register in pkg/renderer/renderer.go.
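As one possible shape for such a renderer, the sketch below builds a markdown table from a slice of resources. The function name renderMyResourceTable, the column layout, and the "-" placeholder for an empty namespace are assumptions for illustration; the project's renderer interface is not shown in this guide.

```go
package main

import (
	"fmt"
	"strings"
)

// MyResource mirrors the example type from this guide (JSON tags omitted).
type MyResource struct {
	Name       string
	Namespace  string
	Properties []string
	SourceFile string
}

// renderMyResourceTable is a hypothetical renderer that emits one markdown
// table row per resource. It returns "" when there is nothing to render.
func renderMyResourceTable(resources []MyResource) string {
	if len(resources) == 0 {
		return ""
	}
	var b strings.Builder
	b.WriteString("| Name | Namespace | Properties | Source |\n")
	b.WriteString("|------|-----------|------------|--------|\n")
	for _, r := range resources {
		ns := r.Namespace
		if ns == "" {
			ns = "-" // placeholder for resources without a namespace
		}
		fmt.Fprintf(&b, "| %s | %s | %s | %s |\n",
			r.Name, ns, strings.Join(r.Properties, ", "), r.SourceFile)
	}
	return b.String()
}

func main() {
	fmt.Print(renderMyResourceTable([]MyResource{
		{Name: "example", Properties: []string{"x", "y"}, SourceFile: "config/my-resource.yaml"},
	}))
}
```

Including the source file as a column keeps the rendered output traceable back to the manifest it came from.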
Step 5: Add tests¶
Create pkg/extractor/my_resource_test.go with test fixtures in testdata/.
Test at minimum:
- Valid YAML parsing
- Missing/empty files (should not error)
- Malformed YAML (should skip, not crash)
- Relative path calculation
Guidelines¶
- Read-only: Never modify files in the repository
- Resilient: Skip bad files, don't crash
- Source-traceable: Every extracted fact must include source_file
- No secrets: Never extract secret values (only names and references)
- Minimal dependencies: Prefer parsing YAML/Go directly over importing heavy libraries
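The "No secrets" guideline can be illustrated with a short sketch that records which keys a secret defines while discarding the values. The SecretRef type and secretRefFromData function are hypothetical names chosen for this example, not types from the analyzer.

```go
package main

import (
	"fmt"
	"sort"
)

// SecretRef is a hypothetical record: it names a secret and its keys,
// and deliberately has no field that could hold a secret value.
type SecretRef struct {
	Name       string   `json:"name"`
	Keys       []string `json:"keys"`
	SourceFile string   `json:"source_file"`
}

// secretRefFromData keeps only the key names from data, sorted for stable
// output. The values are never copied into the result.
func secretRefFromData(name, sourceFile string, data map[string]string) SecretRef {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return SecretRef{Name: name, Keys: keys, SourceFile: sourceFile}
}

func main() {
	ref := secretRefFromData("db-credentials", "config/secret.yaml", map[string]string{
		"username": "admin",
		"password": "hunter2",
	})
	fmt.Println(ref.Name, ref.Keys) // prints db-credentials [password username]
}
```

Because the output type has nowhere to store a value, a later change cannot leak one by accident.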
YAML file discovery¶
Use the existing findYAMLFiles helper in pkg/extractor/yaml.go. It supports glob patterns and walks the repository filesystem.
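The helper's source is not reproduced in this guide. As a rough sketch of how glob-based discovery over a repository tree can work, the following stand-in walks a directory and matches file base names. The names findFilesByName and demo are hypothetical, and the handling of a leading "**/" is an assumption; the real findYAMLFiles may behave differently.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// findFilesByName is a hypothetical stand-in, not the real findYAMLFiles.
// It walks root and returns files whose base name matches pattern. A leading
// "**/" is stripped because filepath.Match has no "**" support.
func findFilesByName(root, pattern string) ([]string, error) {
	namePattern := strings.TrimPrefix(pattern, "**/")
	var matches []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // unreadable entry: skip it instead of aborting the walk
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir
			}
			return nil
		}
		ok, matchErr := filepath.Match(namePattern, d.Name())
		if matchErr != nil {
			return matchErr // malformed pattern
		}
		if ok {
			matches = append(matches, path)
		}
		return nil
	})
	return matches, err
}

// demo builds a throwaway fixture tree and returns the matching files as
// slash-separated paths relative to the fixture root.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "repo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "config"), 0o755); err != nil {
		return nil, err
	}
	fixtures := map[string]string{
		filepath.Join("config", "my-resource-a.yaml"): "kind: MyResource\n",
		"README.md": "docs\n",
	}
	for name, body := range fixtures {
		if err := os.WriteFile(filepath.Join(root, name), []byte(body), 0o644); err != nil {
			return nil, err
		}
	}
	files, err := findFilesByName(root, "**/my-resource*.yaml")
	if err != nil {
		return nil, err
	}
	rel := make([]string, 0, len(files))
	for _, f := range files {
		r, err := filepath.Rel(root, f)
		if err != nil {
			return nil, err
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	return rel, nil
}

func main() {
	paths, err := demo()
	fmt.Println(paths, err)
}
```

The sketch only reads the tree it walks, in line with the read-only guideline, and a missing root directory yields an empty result rather than an error.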
Go source extraction¶
For extracting information from Go source code, use the tree-sitter parser in pkg/parser/:
goParser := parser.NewGoParser() // named goParser to avoid shadowing the parser package
result, err := goParser.ParseFile(filePath)
// Check err before use, then read result.Functions, result.Calls, result.StructLiterals, etc.
See the controller watches extractor (controller_watches.go) for a complete example of Go source extraction.