Design Document: IaC Reverse Engineering
Overview
This design describes a CLI tool that reverse-engineers existing on-premises infrastructure into well-structured Terraform HCL code and state files. The tool connects to on-premises platform APIs (Docker Swarm, Kubernetes, Synology Disk Station, SUSE Harvester, Windows machines, and bare metal servers), discovers deployed resources, resolves inter-resource dependencies, generates idiomatic Terraform code organized by resource type, and produces a valid state file so Terraform recognizes existing resources without attempting recreation.
The tool is designed exclusively for on-premises environments — no cloud provider support is included. It handles the unique characteristics of different platform types: container orchestration (Docker Swarm, Kubernetes), storage appliances (Synology), HCI (SUSE Harvester), Windows machines, and bare metal servers. All resources are tracked with CPU architecture awareness (ARM, AMD64, AArch64) to support heterogeneous infrastructure environments.
The target infrastructure consists of:
- Raspberry Pi cluster (ARM/AArch64) — Kubernetes and Docker Swarm nodes for container orchestration
- Dell PowerEdge servers (AMD64) — SUSE Harvester HCI nodes providing virtualization and storage
- Synology NAS — Storage appliance for shared storage, backups, and media
- Standalone Windows machines — Running various services (IIS, scheduled tasks, Hyper-V VMs)
- Authentik — Identity provider for SSO across all managed infrastructure
Authentication and SSO are handled through Authentik, which serves as the identity provider for the tool itself and is also discoverable as managed infrastructure (Authentik configurations, flows, providers, and applications can be reverse-engineered into IaC).
The tool is implemented in Python for its rich ecosystem of infrastructure libraries (kubernetes-client, docker-sdk, pywinrm), rapid development cycle, and strong support for graph algorithms. It follows a pipeline architecture where each stage transforms data from the previous stage, enabling clear separation of concerns and independent testability.
Key Design Decisions
- Python as implementation language — Rich infrastructure SDK ecosystem (kubernetes-client, docker-sdk-python, pywinrm for Windows, python-synology), strong graph libraries (networkx), and Jinja2 for HCL templating.
- Pipeline architecture — Each component (Scanner → Dependency Resolver → Code Generator → State Builder) operates on well-defined data structures, enabling independent testing and extension.
- Provider plugin system — Each on-premises platform is implemented as a plugin conforming to a common interface, making it straightforward to add new platforms.
- Platform type categorization — Providers are categorized by platform type (container orchestration, storage appliance, HCI, windows, bare metal) to handle their distinct resource models and discovery patterns.
- CPU architecture tracking — Every discovered resource carries architecture metadata (ARM, AMD64, AArch64) enabling architecture-aware code generation and resource organization.
- Authentik as identity provider — The tool authenticates users via Authentik SSO, and Authentik itself is a discoverable infrastructure target whose configurations can be reverse-engineered into IaC.
- Terraform state format v4 — Direct JSON generation of state files rather than relying on terraform import for each resource, enabling bulk operations.
- Incremental scan via snapshot diffing — Store scan results as timestamped JSON snapshots and compute diffs for incremental updates.
- Windows discovery via WinRM/WMI — Uses pywinrm library to connect to Windows machines and discover services, scheduled tasks, IIS sites, network configuration, installed software, Windows features, and Hyper-V VMs.
Architecture
The system follows a staged pipeline architecture with clear data flow between components:
graph TD
A[Scan Profile Config] --> B[Scanner]
B --> C[Resource Inventory]
C --> D[Dependency Resolver]
D --> E[Dependency Graph]
E --> F[Code Generator]
F --> G[HCL Files]
E --> H[State Builder]
H --> I[State File]
G --> J[Validator]
I --> J
J --> K[Validation Report]
subgraph "Provider Plugins"
B --> P3[Docker Swarm Plugin]
B --> P4[Kubernetes Plugin]
B --> P5[Synology Plugin]
B --> P6[Harvester Plugin]
B --> P7[Bare Metal Plugin]
B --> P8[Windows Plugin]
end
subgraph "Authentication"
AU[Authentik SSO] --> B
AU --> AUD[Authentik Discovery Plugin]
end
subgraph "Incremental Scan"
L[Previous Snapshot] --> M[Diff Engine]
C --> M
M --> N[Change Set]
end
Component Interaction Flow
sequenceDiagram
participant User
participant Authentik
participant CLI
participant Scanner
participant DependencyResolver
participant CodeGenerator
participant StateBuilder
participant Validator
User->>Authentik: Authenticate via SSO
Authentik-->>CLI: OAuth2/OIDC token
User->>CLI: Provide Scan Profile
CLI->>Scanner: Start discovery
Scanner->>Scanner: Connect to platform API
Scanner->>Scanner: Enumerate resources
Scanner->>Scanner: Detect CPU architecture
Scanner-->>CLI: Progress updates
Scanner->>DependencyResolver: Resource Inventory
DependencyResolver->>DependencyResolver: Build dependency graph
DependencyResolver->>DependencyResolver: Detect cycles
DependencyResolver->>CodeGenerator: Dependency Graph
CodeGenerator->>CodeGenerator: Generate HCL files
CodeGenerator->>CodeGenerator: Extract variables
CodeGenerator->>CodeGenerator: Apply architecture tags
CodeGenerator->>StateBuilder: Resource mappings
StateBuilder->>StateBuilder: Build state entries
StateBuilder->>Validator: State file
CodeGenerator->>Validator: HCL files
Validator->>Validator: terraform init/validate/plan
Validator-->>User: Validation report
Components and Interfaces
1. Scanner
The Scanner is responsible for connecting to on-premises platform APIs and discovering resources. Each platform type has distinct discovery patterns.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class ProviderType(Enum):
    DOCKER_SWARM = "docker_swarm"
    KUBERNETES = "kubernetes"
    SYNOLOGY = "synology"
    HARVESTER = "harvester"
    BARE_METAL = "bare_metal"
    WINDOWS = "windows"


class PlatformCategory(Enum):
    """Categorizes providers by their infrastructure model."""
    CONTAINER_ORCHESTRATION = "container"  # Docker Swarm, Kubernetes
    STORAGE_APPLIANCE = "storage"          # Synology Disk Station
    HCI = "hci"                            # SUSE Harvester (Hyper-Converged Infrastructure)
    BARE_METAL = "bare_metal"              # Physical servers (Linux)
    WINDOWS = "windows"                    # Standalone Windows machines


PROVIDER_PLATFORM_MAP: dict[ProviderType, PlatformCategory] = {
    ProviderType.DOCKER_SWARM: PlatformCategory.CONTAINER_ORCHESTRATION,
    ProviderType.KUBERNETES: PlatformCategory.CONTAINER_ORCHESTRATION,
    ProviderType.SYNOLOGY: PlatformCategory.STORAGE_APPLIANCE,
    ProviderType.HARVESTER: PlatformCategory.HCI,
    ProviderType.BARE_METAL: PlatformCategory.BARE_METAL,
    ProviderType.WINDOWS: PlatformCategory.WINDOWS,
}


class CpuArchitecture(Enum):
    """CPU architecture of the host or resource."""
    AMD64 = "amd64"
    ARM = "arm"
    AARCH64 = "aarch64"


@dataclass
class ScanProfile:
    provider: ProviderType
    credentials: dict[str, str]  # Provider-specific auth (API tokens, usernames, etc.)
    endpoints: Optional[list[str]] = None  # API endpoints / host addresses
    resource_type_filters: Optional[list[str]] = None  # None means all types
    authentik_token: Optional[str] = None  # SSO token from Authentik

    def validate(self) -> list[str]:
        """Returns list of validation errors, empty if valid."""
        ...

    @property
    def platform_category(self) -> PlatformCategory:
        return PROVIDER_PLATFORM_MAP[self.provider]


@dataclass
class DiscoveredResource:
    resource_type: str  # e.g., "kubernetes_deployment", "windows_iis_site"
    unique_id: str  # Provider-assigned unique identifier
    name: str  # Human-readable name or tag
    provider: ProviderType
    platform_category: PlatformCategory
    architecture: CpuArchitecture  # CPU architecture of the resource/host
    endpoint: str  # Which API endpoint this was discovered from
    attributes: dict  # Full configuration attributes
    raw_references: list[str]  # IDs referenced by this resource (pre-resolution)


@dataclass
class ScanResult:
    resources: list[DiscoveredResource]
    warnings: list[str]
    errors: list[str]
    scan_timestamp: str
    profile_hash: str  # Hash of scan profile for matching incremental scans
    is_partial: bool = False  # True if scan was interrupted


@dataclass
class ScanProgress:
    current_resource_type: str
    resources_discovered: int
    resource_types_completed: int
    total_resource_types: int


class ProviderPlugin(ABC):
    """Interface that all provider plugins must implement."""

    @abstractmethod
    def authenticate(self, credentials: dict[str, str]) -> None:
        """Authenticate with the platform API. Raises AuthenticationError on failure."""
        ...

    @abstractmethod
    def get_platform_category(self) -> PlatformCategory:
        """Return the platform category for this provider."""
        ...

    @abstractmethod
    def list_endpoints(self) -> list[str]:
        """Return all reachable endpoints/hosts for this provider."""
        ...

    @abstractmethod
    def list_supported_resource_types(self) -> list[str]:
        """Return all resource types this plugin can discover."""
        ...

    @abstractmethod
    def detect_architecture(self, endpoint: str) -> CpuArchitecture:
        """Detect the CPU architecture of the target host/node."""
        ...

    @abstractmethod
    def discover_resources(
        self,
        endpoints: list[str],
        resource_types: list[str],
        progress_callback: Callable[[ScanProgress], None],
    ) -> ScanResult:
        """Discover resources. Calls progress_callback with ScanProgress updates."""
        ...
2. Windows Provider Plugin
The Windows plugin discovers infrastructure on standalone Windows machines via WinRM/WMI using the pywinrm library.
from typing import Callable


class WindowsDiscoveryPlugin(ProviderPlugin):
    """Discovers Windows machine configurations via WinRM/WMI.

    Discovers: Windows services, scheduled tasks, IIS sites/app pools,
    network configuration, installed software, Windows features,
    and Hyper-V VMs (if Hyper-V role is present).
    Uses pywinrm for connectivity and WMI/CIM queries for discovery.
    """

    def get_platform_category(self) -> PlatformCategory:
        return PlatformCategory.WINDOWS

    def list_supported_resource_types(self) -> list[str]:
        return [
            "windows_service",
            "windows_scheduled_task",
            "windows_iis_site",
            "windows_iis_app_pool",
            "windows_network_adapter",
            "windows_firewall_rule",
            "windows_installed_software",
            "windows_feature",
            "windows_hyperv_vm",
            "windows_hyperv_switch",
            "windows_dns_record",
            "windows_local_user",
            "windows_local_group",
        ]

    def authenticate(self, credentials: dict[str, str]) -> None:
        """Authenticate via WinRM using NTLM or Kerberos.

        Expected credentials:
        - host: Target Windows machine hostname/IP
        - username: Windows username (DOMAIN\\user or user@domain)
        - password: Windows password
        - transport: "ntlm" (default) or "kerberos"
        - port: WinRM port (default 5985 for HTTP, 5986 for HTTPS)
        - use_ssl: "true" or "false" (default "true")
        """
        ...

    def detect_architecture(self, endpoint: str) -> CpuArchitecture:
        """Detect architecture via WMI Win32_Processor query."""
        ...

    def discover_resources(
        self,
        endpoints: list[str],
        resource_types: list[str],
        progress_callback: Callable[[ScanProgress], None],
    ) -> ScanResult:
        """Discover Windows resources via WinRM/WMI queries.

        Uses CIM sessions for efficient bulk queries.
        Discovers Hyper-V resources only if the Hyper-V role is installed.
        """
        ...
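The architecture detection described for the Windows plugin can be illustrated with a minimal sketch. It assumes the numeric `Architecture` property of `Win32_Processor` has already been retrieved over WinRM; the helper name `architecture_from_wmi_code` is hypothetical and not part of the design. Codes that have no counterpart in `CpuArchitecture` (for example 0 for 32-bit x86) are rejected rather than guessed.

```python
from enum import Enum


class CpuArchitecture(Enum):
    AMD64 = "amd64"
    ARM = "arm"
    AARCH64 = "aarch64"


# Win32_Processor.Architecture codes relevant here:
# 5 = ARM, 9 = x64, 12 = ARM64 (0 = x86 is intentionally unsupported).
_WMI_ARCH_CODES = {
    5: CpuArchitecture.ARM,
    9: CpuArchitecture.AMD64,
    12: CpuArchitecture.AARCH64,
}


def architecture_from_wmi_code(code: int) -> CpuArchitecture:
    """Map a Win32_Processor.Architecture code to CpuArchitecture (hypothetical helper)."""
    try:
        return _WMI_ARCH_CODES[code]
    except KeyError:
        raise ValueError(f"unsupported Win32_Processor.Architecture code: {code}") from None
```

Raising on unknown codes keeps the plugin from silently tagging a host with the wrong architecture; a real implementation might instead record a warning in the scan result.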
3. Authentik Integration
Authentik serves dual roles: authenticating users of the tool via SSO, and being a discoverable infrastructure target.
@dataclass
class AuthentikConfig:
    base_url: str  # Authentik instance URL
    client_id: str  # OAuth2 client ID for this tool
    client_secret: str  # OAuth2 client secret


@dataclass
class AuthentikSession:
    access_token: str
    refresh_token: str
    user_id: str
    groups: list[str]


class AuthentikAuthProvider:
    """Handles SSO authentication for the tool itself."""

    def authenticate_user(self, config: AuthentikConfig) -> AuthentikSession:
        """Initiate OAuth2/OIDC flow with Authentik. Returns session on success."""
        ...

    def refresh_session(self, session: AuthentikSession) -> AuthentikSession:
        """Refresh an expired session token."""
        ...

    def validate_token(self, token: str) -> bool:
        """Validate an existing token is still valid."""
        ...


class AuthentikDiscoveryPlugin(ProviderPlugin):
    """Discovers Authentik configurations as infrastructure resources.

    Discovers: flows, stages, providers, applications, outposts,
    property mappings, certificates, and SSO integrations with
    other managed platforms.
    """

    def list_supported_resource_types(self) -> list[str]:
        return [
            "authentik_flow",
            "authentik_stage",
            "authentik_provider",
            "authentik_application",
            "authentik_outpost",
            "authentik_property_mapping",
            "authentik_certificate",
            "authentik_group",
            "authentik_source",
        ]

    ...
4. Dependency Resolver
Analyzes resource relationships and produces a topological ordering.
@dataclass
class ResourceRelationship:
    source_id: str  # Resource that holds the reference
    target_id: str  # Resource being referenced
    relationship_type: str  # "parent-child", "reference", "dependency"
    source_attribute: str  # Attribute in source that holds the reference


# Defined before DependencyGraph because that class refers to it in an annotation.
@dataclass
class UnresolvedReference:
    source_resource_id: str
    source_attribute: str
    referenced_id: str  # The ID that couldn't be resolved
    suggested_resolution: str  # "data_source" or "variable"


@dataclass
class DependencyGraph:
    resources: list[DiscoveredResource]
    relationships: list[ResourceRelationship]
    topological_order: list[str]  # Resource IDs in dependency order
    cycles: list[list[str]]  # Detected cycles (list of resource ID chains)
    unresolved_references: list[UnresolvedReference]


class DependencyResolverInterface:
    def resolve(self, inventory: ScanResult) -> DependencyGraph:
        """Analyze relationships and produce dependency graph."""
        ...

    # CycleReport is not defined in this section, so it is quoted as a forward reference.
    def detect_cycles(self, graph: DependencyGraph) -> list["CycleReport"]:
        """Detect and report circular dependencies with resolution suggestions."""
        ...
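As a rough illustration of how a topological order and cycle detection could fit together, the sketch below uses the standard library's `graphlib` instead of the networkx dependency named in the design decisions. The function name `order_resources` and the `(source, target)` edge tuples are assumptions made for this example only.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Optional


def order_resources(
    edges: list[tuple[str, str]], nodes: list[str]
) -> tuple[list[str], Optional[list[str]]]:
    """Return (topological_order, cycle).

    Each edge is (source, target), meaning source depends on target,
    so target must appear before source in the returned order.
    """
    # Register every node up front so isolated resources are not dropped.
    sorter = TopologicalSorter({node: set() for node in nodes})
    for source, target in edges:
        sorter.add(source, target)
    try:
        return list(sorter.static_order()), None
    except CycleError as exc:
        # exc.args[1] holds the nodes of one detected cycle, first node repeated at the end.
        return [], list(exc.args[1])
```

`graphlib` reports only one cycle per failure, so a resolver that must list every cycle would need a different algorithm (for example, strongly connected components).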
5. Code Generator
Produces Terraform HCL files from the dependency graph. Architecture-aware: generates architecture tags and organizes resources by platform category.
@dataclass
class GeneratedFile:
    filename: str  # e.g., "kubernetes_deployment.tf", "windows_service.tf"
    content: str  # HCL content
    resource_count: int


@dataclass
class ExtractedVariable:
    name: str  # Variable name
    type_expr: str  # Terraform type expression
    default_value: str  # Most common value
    description: str
    used_by: list[str]  # Resource IDs using this variable


@dataclass
class CodeGenerationResult:
    resource_files: list[GeneratedFile]
    variables_file: GeneratedFile
    provider_file: GeneratedFile
    outputs_file: Optional[GeneratedFile]
    skipped_resources: list[tuple[str, str]]  # (resource_id, reason)


class CodeGeneratorInterface:
    def generate(self, graph: DependencyGraph, profiles: list[ScanProfile]) -> CodeGenerationResult:
        """Generate Terraform HCL from dependency graph.

        Architecture-aware: includes architecture tags/labels on resources,
        organizes provider blocks by platform category.
        """
        ...

    def sanitize_identifier(self, name: str) -> str:
        """Convert resource name to valid Terraform identifier."""
        ...

    def extract_variables(self, resources: list[DiscoveredResource]) -> list[ExtractedVariable]:
        """Identify common values to extract as variables."""
        ...

    def generate_architecture_tags(self, resource: DiscoveredResource) -> dict[str, str]:
        """Generate architecture-specific tags/labels for a resource."""
        ...
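One possible shape for `sanitize_identifier` is sketched below as a free function. It is an assumption about the behavior, not the tool's actual implementation: every character outside `[a-zA-Z0-9_]` becomes an underscore, and an underscore is prefixed when the result would be empty or start with a digit.

```python
import re


def sanitize_identifier(name: str) -> str:
    """Convert an arbitrary resource name into a valid Terraform identifier (sketch)."""
    # Replace anything that is not an ASCII letter, digit, or underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    # Identifiers must be non-empty and must not start with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```

With this approach `"nginx-svc"` becomes `nginx_svc` and `"Default Web Site"` becomes `Default_Web_Site`. Distinct names can collapse to the same identifier (for example `a-b` and `a_b`), so a full implementation would also need a de-duplication step.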
6. State Builder
Generates Terraform state file (format version 4).
@dataclass
class StateEntry:
    resource_type: str
    resource_name: str  # Terraform identifier name
    provider_id: str  # Provider-assigned unique ID
    attributes: dict  # Full attribute set
    sensitive_attributes: list[str]
    schema_version: int
    dependencies: list[str]  # Terraform resource addresses of dependencies


@dataclass
class StateFile:
    version: int = 4
    terraform_version: str = ""
    serial: int = 1
    lineage: str = ""  # UUID
    resources: list[StateEntry] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize to Terraform state JSON format."""
        ...


class StateBuilderInterface:
    def build(self, code_result: CodeGenerationResult, graph: DependencyGraph, provider_version: str) -> StateFile:
        """Build state file from generated code and dependency graph."""
        ...
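To make the serialization step more concrete, here is a minimal sketch of assembling a version 4 state document with the top-level fields used in this design. The helper names `state_resource` and `build_state_document` are hypothetical, and the sketch does not handle sensitive attributes or multiple instances per resource.

```python
import json
import uuid


def state_resource(resource_type, name, provider_source, attributes,
                   schema_version=0, dependencies=()):
    """Build one entry of the state file's "resources" array (hypothetical helper)."""
    return {
        "mode": "managed",
        "type": resource_type,
        "name": name,
        "provider": f'provider["{provider_source}"]',
        "instances": [{
            "schema_version": schema_version,
            "attributes": attributes,
            "sensitive_attributes": [],
            "dependencies": list(dependencies),
        }],
    }


def build_state_document(resources, terraform_version):
    """Serialize a state document with a fresh UUID lineage."""
    document = {
        "version": 4,
        "terraform_version": terraform_version,
        "serial": 1,
        "lineage": str(uuid.uuid4()),
        "outputs": {},
        "resources": resources,
    }
    return json.dumps(document, indent=2)
```

A new lineage is appropriate only for the first generated state; incremental updates would presumably keep the existing lineage and increment `serial`.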
7. Validator
Runs Terraform commands to validate generated output.
# PlannedChange and ValidationError are defined first because
# ValidationResult refers to them in its field annotations.
@dataclass
class PlannedChange:
    resource_address: str
    change_type: str  # "add", "modify", "destroy"
    details: str


@dataclass
class ValidationError:
    file: str
    message: str
    line: Optional[int] = None


@dataclass
class ValidationResult:
    init_success: bool
    validate_success: bool
    plan_success: bool
    planned_changes: list[PlannedChange]
    errors: list[ValidationError]
    correction_attempts: int


class ValidatorInterface:
    def validate(self, output_dir: str, max_correction_attempts: int = 3) -> ValidationResult:
        """Run terraform init, validate, and plan. Attempt corrections if needed."""
        ...
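The planned changes could be derived from Terraform's machine-readable plan (the JSON produced by `terraform show -json` on a saved plan), whose `resource_changes` entries carry an `address` and a `change.actions` list. The sketch below assumes that JSON has already been loaded into a dict; the function name and the decision to report a replacement as a single "destroy" entry are assumptions for illustration, not part of the design.

```python
from dataclasses import dataclass


@dataclass
class PlannedChange:
    resource_address: str
    change_type: str  # "add", "modify", "destroy"
    details: str


_ACTION_TO_CHANGE = {"create": "add", "update": "modify", "delete": "destroy"}


def planned_changes_from_plan(plan: dict) -> list[PlannedChange]:
    """Extract drift entries from a parsed Terraform plan JSON document (sketch)."""
    changes = []
    for entry in plan.get("resource_changes", []):
        actions = entry["change"]["actions"]
        if actions in (["no-op"], ["read"]):
            continue  # Nothing will change for this resource.
        if len(actions) == 1:
            change_type = _ACTION_TO_CHANGE[actions[0]]
        else:
            # Two actions mean a replacement; reported here as "destroy" (assumed convention).
            change_type = "destroy"
        changes.append(PlannedChange(entry["address"], change_type, "+".join(actions)))
    return changes
```

For a correctly reverse-engineered environment the expected result is an empty list, since every resource should plan as a no-op.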
8. Incremental Scan Engine
Compares current scan results against previous snapshots.
class ChangeType(Enum):
    ADDED = "added"
    REMOVED = "removed"
    MODIFIED = "modified"


@dataclass
class ResourceChange:
    resource_id: str
    resource_type: str
    resource_name: str
    change_type: ChangeType
    changed_attributes: Optional[dict] = None  # For MODIFIED, old->new values


@dataclass
class ChangeSummary:
    added_count: int
    removed_count: int
    modified_count: int
    changes: list[ResourceChange]


class IncrementalScanEngine:
    def compare(self, current: ScanResult, previous: ScanResult) -> ChangeSummary:
        """Compare two scan results and classify changes."""
        ...

    def store_snapshot(self, result: ScanResult, profile_hash: str) -> None:
        """Persist scan result for future comparison."""
        ...

    def load_previous(self, profile_hash: str) -> Optional[ScanResult]:
        """Load most recent previous scan for this profile."""
        ...
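The comparison itself reduces to set operations on resource IDs. The following sketch assumes each scan has been flattened to a mapping from `unique_id` to its attributes dict; `classify_changes` is a hypothetical helper, not the engine's real `compare` method.

```python
def classify_changes(
    previous: dict[str, dict], current: dict[str, dict]
) -> dict[str, list[str]]:
    """Classify resource IDs as added, removed, or modified between two scans."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    # A resource is modified when it exists in both scans with different attributes.
    modified = sorted(
        rid for rid in current.keys() & previous.keys()
        if current[rid] != previous[rid]
    )
    return {"added": added, "removed": removed, "modified": modified}
```

Resources that appear in both scans with identical attributes fall into none of the three lists, which keeps the change set limited to actual differences.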
Data Models
Platform Type Differentiation
Each provider type maps to a platform category that determines discovery patterns:
| Platform Category | Providers | Resource Model | Discovery Pattern |
|---|---|---|---|
| Container Orchestration | Docker Swarm, Kubernetes | Services, deployments, pods, volumes, networks, configs | Docker/K8s API listing of workloads, services, and cluster resources |
| Storage Appliance | Synology Disk Station | Volumes, shares, pools, replication tasks, users | Synology DSM API for storage pools, shared folders, packages |
| HCI | SUSE Harvester | VMs, volumes, images, networks (combines hypervisor + storage) | Harvester/K8s-based API for VM and storage resources |
| Bare Metal | Physical servers (Linux) | Hardware inventory, IPMI/BMC configs, network interfaces, RAID | IPMI/Redfish API for hardware discovery, network config |
| Windows | Standalone Windows machines | Services, scheduled tasks, IIS sites, network config, software, features, Hyper-V VMs | WinRM/WMI queries via pywinrm for system configuration discovery |
CPU Architecture Model
Architecture is tracked at the host/node level and inherited by resources running on that host:
| Architecture | Description | Common Platforms |
|---|---|---|
| AMD64 | x86-64 / Intel 64 | Dell PowerEdge servers (Harvester nodes), Windows machines |
| ARM | 32-bit ARM | Older embedded devices, some Synology NAS models |
| AArch64 | 64-bit ARM (ARMv8+) | Raspberry Pi cluster nodes (K8s/Docker Swarm), some Synology models |
Scan Profile Configuration (YAML)
# scan_profile.yaml - Kubernetes example (Raspberry Pi cluster)
provider: kubernetes
credentials:
  kubeconfig_path: "${HOME}/.kube/config"
  context: "pi-cluster"
endpoints:
  - "https://k8s-api.internal.lab:6443"
resource_type_filters:
  - kubernetes_deployment
  - kubernetes_service
  - kubernetes_ingress
  - kubernetes_config_map
  - kubernetes_persistent_volume
authentik:
  base_url: "https://auth.internal.lab"
  client_id: "iac-reverse-tool"

# scan_profile.yaml - Synology NAS example
provider: synology
credentials:
  host: "nas01.internal.lab"
  port: "5001"  # quoted: credentials are typed as dict[str, str]
  username: "${SYNOLOGY_USER}"
  password: "${SYNOLOGY_PASSWORD}"
endpoints:
  - "nas01.internal.lab:5001"
resource_type_filters:
  - synology_shared_folder
  - synology_volume
  - synology_storage_pool

# scan_profile.yaml - Windows machine example
provider: windows
credentials:
  host: "win-server-01.internal.lab"
  username: "${WINDOWS_USER}"
  password: "${WINDOWS_PASSWORD}"
  transport: "ntlm"
  use_ssl: "true"
  port: "5986"
endpoints:
  - "win-server-01.internal.lab"
resource_type_filters:
  - windows_service
  - windows_scheduled_task
  - windows_iis_site
  - windows_iis_app_pool
  - windows_feature
  - windows_hyperv_vm

# scan_profile.yaml - SUSE Harvester example (Dell PowerEdge)
provider: harvester
credentials:
  kubeconfig_path: "${HOME}/.kube/harvester-config"
  context: "harvester-cluster"
endpoints:
  - "https://harvester.internal.lab:6443"
resource_type_filters:
  - harvester_virtualmachine
  - harvester_volume
  - harvester_image
  - harvester_network
Resource Inventory (Internal JSON)
{
  "scan_timestamp": "2024-01-15T10:30:00Z",
  "profile_hash": "a1b2c3d4",
  "is_partial": false,
  "resources": [
    {
      "resource_type": "kubernetes_deployment",
      "unique_id": "apps/v1/deployments/default/nginx",
      "name": "nginx",
      "provider": "kubernetes",
      "platform_category": "container",
      "architecture": "aarch64",
      "endpoint": "https://k8s-api.internal.lab:6443",
      "attributes": {
        "namespace": "default",
        "replicas": 3,
        "image": "nginx:1.25",
        "node_selector": {"kubernetes.io/arch": "arm64"},
        "labels": {"app": "nginx", "arch": "aarch64"}
      },
      "raw_references": ["default/services/nginx-svc"]
    },
    {
      "resource_type": "windows_iis_site",
      "unique_id": "win-server-01/iis/sites/Default Web Site",
      "name": "Default Web Site",
      "provider": "windows",
      "platform_category": "windows",
      "architecture": "amd64",
      "endpoint": "win-server-01.internal.lab",
      "attributes": {
        "site_name": "Default Web Site",
        "physical_path": "C:\\inetpub\\wwwroot",
        "bindings": [
          {"protocol": "https", "port": 443, "hostname": "app.internal.lab"}
        ],
        "app_pool": "DefaultAppPool",
        "state": "Started"
      },
      "raw_references": ["win-server-01/iis/app_pools/DefaultAppPool"]
    },
    {
      "resource_type": "harvester_virtualmachine",
      "unique_id": "harvester/vms/default/ubuntu-dev-01",
      "name": "ubuntu-dev-01",
      "provider": "harvester",
      "platform_category": "hci",
      "architecture": "amd64",
      "endpoint": "https://harvester.internal.lab:6443",
      "attributes": {
        "namespace": "default",
        "cpu": 4,
        "memory": "8Gi",
        "disk_size": "100Gi",
        "network": "vlan-100",
        "image": "ubuntu-22.04-server"
      },
      "raw_references": ["harvester/images/ubuntu-22.04-server", "harvester/networks/vlan-100"]
    }
  ],
  "warnings": [],
  "errors": []
}
Terraform State File (Output Format v4)
{
  "version": 4,
  "terraform_version": "1.7.0",
  "serial": 1,
  "lineage": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "kubernetes_deployment",
      "name": "nginx",
      "provider": "provider[\"registry.terraform.io/hashicorp/kubernetes\"]",
      "instances": [
        {
          "schema_version": 1,
          "attributes": {
            "id": "apps/v1/deployments/default/nginx",
            "metadata": {
              "name": "nginx",
              "namespace": "default",
              "labels": {"app": "nginx", "arch": "aarch64"}
            },
            "spec": {
              "replicas": 3,
              "template": {
                "spec": {
                  "container": [{"image": "nginx:1.25"}],
                  "node_selector": {"kubernetes.io/arch": "arm64"}
                }
              }
            }
          },
          "sensitive_attributes": [],
          "dependencies": [
            "kubernetes_service.nginx_svc"
          ]
        }
      ]
    }
  ]
}
Dependency Graph (Internal)
{
  "nodes": [
    "apps/v1/deployments/default/nginx",
    "default/services/nginx-svc",
    "win-server-01/iis/sites/Default Web Site",
    "win-server-01/iis/app_pools/DefaultAppPool"
  ],
  "edges": [
    {"source": "apps/v1/deployments/default/nginx", "target": "default/services/nginx-svc", "type": "reference", "attribute": "service_name"},
    {"source": "win-server-01/iis/sites/Default Web Site", "target": "win-server-01/iis/app_pools/DefaultAppPool", "type": "dependency", "attribute": "app_pool"}
  ],
  "topological_order": [
    "default/services/nginx-svc",
    "apps/v1/deployments/default/nginx",
    "win-server-01/iis/app_pools/DefaultAppPool",
    "win-server-01/iis/sites/Default Web Site"
  ],
  "cycles": [],
  "unresolved_references": []
}
Scan Snapshot Storage
Snapshots are stored as JSON files in a .iac-reverse/snapshots/ directory:
.iac-reverse/
├── snapshots/
│   ├── a1b2c3d4_2024-01-15T10-30-00Z.json
│   └── a1b2c3d4_2024-01-14T09-00-00Z.json
└── config/
    └── scan_profiles/
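The file names in the listing above suggest a `<profile_hash>_<timestamp>.json` pattern with colons replaced by hyphens. Under that assumption, snapshot naming and lookup might be sketched as follows; both function names are hypothetical.

```python
from pathlib import Path


def snapshot_filename(profile_hash: str, scan_timestamp: str) -> str:
    """Build a snapshot file name; colons are replaced so the name is valid on Windows."""
    return f"{profile_hash}_{scan_timestamp.replace(':', '-')}.json"


def most_recent_snapshots(directory: Path, profile_hash: str, count: int = 2) -> list[Path]:
    """Return the newest snapshots for a profile, newest first.

    Relies on the UTC ISO 8601 timestamps sorting lexicographically.
    """
    candidates = sorted(directory.glob(f"{profile_hash}_*.json"), reverse=True)
    return candidates[:count]
```

Lexicographic ordering only holds while every timestamp uses the same fixed-width UTC format, which is an assumption of this sketch.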
Correctness Properties
A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.
Property 1: Resource inventory completeness
For any discovered resource from any on-premises provider (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows), the resulting inventory entry SHALL contain non-empty values for resource_type, unique_id, name, provider, platform_category, architecture, and attributes fields.
Validates: Requirements 1.2
Property 2: Authentication error descriptiveness
For any provider type and any authentication failure reason, the error returned by the Scanner SHALL contain both the provider name string and the failure reason string.
Validates: Requirements 1.3
Property 3: Graceful degradation on unsupported resource types
For any scan request containing a mix of supported and unsupported resource types, the Scanner SHALL produce warnings for each unsupported type AND return a complete inventory for all supported types (the presence of unsupported types does not reduce the discovered set of supported resources).
Validates: Requirements 1.4
Property 4: Progress reporting frequency
For any scan across N resource types, the progress callback SHALL be invoked at least N times, once per resource type completion, with monotonically non-decreasing discovered resource counts (a resource type may legitimately contain zero resources).
Validates: Requirements 1.5
Property 5: Partial inventory preservation on failure
For any scan that is interrupted at an arbitrary point, the partial inventory SHALL contain exactly the set of resources that were successfully discovered before the failure point, with no duplicates and no resources from after the failure.
Validates: Requirements 1.7
Property 6: Dependency relationship identification
For any resource inventory where resource A's attributes contain resource B's unique identifier, the Dependency Resolver SHALL produce a relationship edge from A to B with the correct relationship type and source attribute.
Validates: Requirements 2.1
Property 7: Cycle detection correctness
For any dependency graph containing a cycle, the Dependency Resolver SHALL report the cycle listing all resources involved. For any acyclic dependency graph, the Dependency Resolver SHALL report zero cycles.
Validates: Requirements 2.3
Property 8: Topological order validity
For any acyclic dependency graph, the topological order produced by the Dependency Resolver SHALL satisfy the constraint that for every edge (A depends on B), B appears before A in the ordering.
Validates: Requirements 2.4
Property 9: Unresolved references become data sources or variables
For any resource that references an identifier not present in the current inventory, the generated output SHALL represent that reference as a data source lookup or variable — never as a hardcoded literal identifier string.
Validates: Requirements 2.5
Property 10: References in generated output use Terraform syntax
For any resource that references another resource present in the inventory, the generated HCL SHALL use Terraform resource reference expressions (e.g., kubernetes_service.name.id) rather than hardcoded identifier strings.
Validates: Requirements 2.2, 3.5
Property 11: Generated HCL syntactic validity
For any valid resource inventory and dependency graph, the Code Generator SHALL produce output that parses as syntactically valid HCL (no syntax errors when parsed by an HCL parser).
Validates: Requirements 3.1
Property 12: File organization by resource type
For any resource inventory containing resources of N distinct types, the Code Generator SHALL produce exactly N resource files, where each file contains only resource blocks of its designated type and every resource appears in exactly one file.
Validates: Requirements 3.2
Property 13: Variable extraction for shared values
For any attribute value that appears in 2 or more resources in the inventory, the Code Generator SHALL extract that value into a Terraform variable with a default set to the most commonly occurring value.
Validates: Requirements 3.3
Property 14: Identifier sanitization validity
For any input string (including strings with special characters, unicode, leading digits, or spaces), the sanitize_identifier function SHALL produce a non-empty string matching the regex ^[a-zA-Z_][a-zA-Z0-9_]*$.
Validates: Requirements 3.4
Property 15: Traceability comments in generated code
For any generated resource block, the output SHALL contain a comment including the original provider-assigned unique resource identifier for traceability.
Validates: Requirements 3.6
Property 16: State file structural validity
For any set of generated resources, the State Builder SHALL produce a JSON document with version=4, a valid UUID lineage, and a resources array where each entry has mode, type, name, provider, and instances fields conforming to Terraform state v4 schema.
Validates: Requirements 4.1
Property 17: State entry completeness and schema correctness
For any resource with a known provider schema version and known sensitive attributes, the state entry SHALL have schema_version matching the provider's schema version for that resource type, contain all discovered attributes, and mark exactly the sensitive attributes as sensitive.
Validates: Requirements 4.4, 4.5
Property 18: Multi-provider merge with naming conflict resolution
For any two or more resource inventories from different on-premises providers where resource names collide, the merged inventory SHALL contain all resources from all providers, with conflicting names prefixed by the provider identifier, and no resources lost.
Validates: Requirements 5.3
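A minimal sketch of the conflict rule in Property 18 follows. It assumes resources are reduced to `(provider, name)` pairs and that the prefix is joined with an underscore; both choices, and the function name, are illustrative assumptions.

```python
from collections import defaultdict


def resolve_name_conflicts(resources: list[tuple[str, str]]) -> list[str]:
    """Prefix a name with its provider when the same name occurs under several providers."""
    providers_by_name: dict[str, set[str]] = defaultdict(set)
    for provider, name in resources:
        providers_by_name[name].add(provider)
    # Output keeps the input order and length, so no resource is lost.
    return [
        f"{provider}_{name}" if len(providers_by_name[name]) > 1 else name
        for provider, name in resources
    ]
```

Names that are unique across providers are left untouched, so only the colliding resources change identifiers.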
Property 19: Provider block generation
For any resource set spanning N distinct on-premises providers, the generated provider configuration SHALL contain exactly N provider blocks, one per distinct provider.
Validates: Requirements 5.4
Property 20: Scan profile validation completeness
For any scan profile with K invalid fields (missing provider, empty credentials, unreachable endpoints, filters exceeding 200 entries, or unsupported resource types), the validation error SHALL list all K invalid fields in a single response.
Validates: Requirements 6.1, 6.6, 6.7
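The "report everything at once" behavior in Property 20 can be sketched as an accumulate-then-return validator. The example below works on a plain dict rather than the `ScanProfile` dataclass, skips the endpoint reachability check (which needs network access), and uses hypothetical error strings.

```python
MAX_FILTERS = 200  # Limit stated in Property 20.


def validate_profile(profile: dict, supported_types: set[str]) -> list[str]:
    """Collect every validation error instead of stopping at the first one."""
    errors: list[str] = []
    if not profile.get("provider"):
        errors.append("provider: missing")
    if not profile.get("credentials"):
        errors.append("credentials: empty")
    filters = profile.get("resource_type_filters") or []
    if len(filters) > MAX_FILTERS:
        errors.append(f"resource_type_filters: {len(filters)} entries exceeds {MAX_FILTERS}")
    unsupported = sorted(set(filters) - supported_types)
    if unsupported:
        errors.append(f"resource_type_filters: unsupported types {unsupported}")
    return errors
```

Returning a list rather than raising on the first problem lets the CLI show the user all invalid fields in one pass.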
Property 21: Filtering correctness
For any scan profile with resource type filters and/or endpoint filters, the discovered resources SHALL be a subset where every resource's type is in the filter list (if specified) AND every resource's endpoint is in the endpoint list (if specified). No resource outside the filter criteria shall appear.
Validates: Requirements 6.2, 6.4
Property 22: Drift report correctness
For any terraform plan output containing N planned changes, the drift report SHALL list exactly N entries, each with the correct resource address and change type (add, modify, or destroy).
Validates: Requirements 7.3
Property 23: Change classification correctness
For any pair of scan results (previous and current), every resource that differs between them SHALL be classified exactly once as: added (in current but not previous), removed (in previous but not current), or modified (in both but with differing attributes). Resources present in both with identical attributes SHALL not appear in the change set. The summary counts SHALL equal the actual number of resources in each category.
Validates: Requirements 8.1, 8.5
Property 24: Incremental update scope
For any change set applied to existing IaC files, only files containing added, modified, or removed resources SHALL be modified. Files containing only unchanged resources SHALL remain identical.
Validates: Requirements 8.2
Property 25: Removed resource exclusion
For any resource classified as removed, the updated IaC output SHALL not contain a resource block for that resource, AND the updated state file SHALL not contain a state entry for that resource.
Validates: Requirements 8.3
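For the state half of Property 25, removed resources can be filtered out of the state's top-level `resources` list. The sketch below assumes root-module managed resources addressed as `type.name`; bumping `serial` on each rewrite is also an assumption about how the tool maintains the file, and `prune_state` is a hypothetical name.

```python
def prune_state(state: dict, removed_addresses: set) -> dict:
    """Return a copy of the state without entries for removed resources."""
    kept = [
        resource
        for resource in state.get("resources", [])
        if f'{resource["type"]}.{resource["name"]}' not in removed_addresses
    ]
    # Assumption: the serial is incremented whenever the state content changes.
    return {**state, "resources": kept, "serial": state.get("serial", 0) + 1}


# Illustrative state fragment with two resources, one of which was removed.
state = {
    "version": 4,
    "serial": 7,
    "resources": [
        {"mode": "managed", "type": "docker_service", "name": "api", "instances": []},
        {"mode": "managed", "type": "docker_service", "name": "old", "instances": []},
    ],
}
pruned = prune_state(state, {"docker_service.old"})
```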
Property 26: Snapshot retention
For any sequence of N scans (N ≥ 2) for the same Scan_Profile, at least the two most recent scan results SHALL be retained in storage after each scan completes.
Validates: Requirements 8.6
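Retention can be enforced by pruning after each scan while never dropping below two snapshots. The sketch below models snapshots as `(name, timestamp)` pairs held in memory; the real storage backend and the `prune_snapshots` name are assumptions.

```python
MIN_RETAINED = 2  # floor required by Property 26


def prune_snapshots(snapshots: list[tuple[str, float]], keep: int = MIN_RETAINED) -> list[tuple[str, float]]:
    """Return the snapshots to retain, newest first, never fewer than MIN_RETAINED."""
    keep = max(keep, MIN_RETAINED)
    newest_first = sorted(snapshots, key=lambda snapshot: snapshot[1], reverse=True)
    return newest_first[:keep]


# Illustrative history of three scans for one profile.
history = [("scan-a", 100.0), ("scan-c", 300.0), ("scan-b", 200.0)]
retained = prune_snapshots(history)
```

Clamping `keep` to the floor means a misconfigured retention value cannot violate the property.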
Error Handling
Error Categories
| Category | Examples | Handling Strategy |
|---|---|---|
| Authentication Failure | Invalid API tokens, expired credentials, Authentik SSO token expired, WinRM auth failure, insufficient permissions | Return descriptive error with provider name and reason. Do not retry. |
| Transient API Error | Rate limiting, timeout, temporary platform unavailability, WinRM connection timeout | Retry up to 3 times with exponential backoff. Log warning if all retries fail. |
| Connection Loss | Network partition, platform host unreachable, API endpoint down, WinRM session dropped | Return partial results with error indicating failure point. |
| Validation Error | Invalid scan profile, unsupported resource type, unreachable endpoint | Return all validation errors in a single response before discovery begins. |
| Generation Error | Unconvertible resource, missing attributes, unsupported architecture | Skip affected resource, log warning, continue with remaining resources. |
| External Tool Error | Terraform binary not found, terraform command failure | Report error with command name and failure details. |
| Authentik Error | SSO flow failure, token refresh failure, Authentik instance unreachable | Report authentication error, prompt re-authentication. |
| Windows-Specific Error | WinRM not enabled, WMI query failure, insufficient privileges, Hyper-V role not installed | Log warning for missing features, skip unavailable resource types, continue discovery. |
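The "skip, log, continue" strategy for generation errors might look like the following. `GenerationError`, `render_resource`, and the logger name are hypothetical stand-ins for the tool's own types, and the rendered block is deliberately trivial.

```python
import logging

logger = logging.getLogger("iac_reverse")  # hypothetical logger name


class GenerationError(Exception):
    """Raised when a resource cannot be converted to HCL."""


def render_resource(resource: dict) -> str:
    """Render a trivial resource block, failing if a required attribute is missing."""
    if "name" not in resource:
        raise GenerationError("missing attribute 'name'")
    return f'resource "{resource["type"]}" "{resource["name"]}" {{}}'


def generate_all(resources: list[dict]) -> tuple[list[str], list]:
    """Render every convertible resource; record and skip the rest."""
    rendered, skipped = [], []
    for resource in resources:
        try:
            rendered.append(render_resource(resource))
        except GenerationError as exc:
            logger.warning("Skipping resource %s: %s", resource.get("id"), exc)
            skipped.append(resource.get("id"))
    return rendered, skipped


# The second resource lacks a name, so it is skipped and generation continues.
rendered, skipped = generate_all([
    {"id": "r1", "type": "docker_service", "name": "api"},
    {"id": "r2", "type": "docker_service"},
    {"id": "r3", "type": "docker_service", "name": "web"},
])
```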
Error Propagation
graph TD
A[Platform API Error] -->|Transient| B[Retry up to 3x]
A -->|Permanent| C[Log warning, skip resource]
B -->|All retries fail| C
A -->|Connection lost| D[Return partial inventory]
E[Validation Error] --> F[Collect all errors]
F --> G[Return before execution]
H[Generation Error] --> I[Skip resource]
I --> J[Log warning with resource ID and reason]
J --> K[Continue generation]
L[Terraform Error] --> M{Correctable?}
M -->|Yes| N[Attempt correction, up to 3x]
M -->|No| O[Report to user]
N -->|Still failing| O
P[Authentik Error] --> Q{Token expired?}
Q -->|Yes| R[Attempt token refresh]
Q -->|No| S[Report auth failure]
R -->|Refresh fails| S
T[Windows Error] --> U{Feature missing?}
U -->|Yes| V[Skip resource type, log warning]
U -->|No| W[Retry or report]
On-Premises Connectivity Patterns
On-premises platforms have distinct connectivity characteristics compared to cloud APIs:
- Direct network access required — No public internet endpoints; the tool must have network connectivity to each platform's management interface (K8s API server, Synology DSM, Harvester dashboard, IPMI/BMC interfaces, WinRM endpoints).
- Self-signed certificates — Many on-prem platforms use self-signed TLS certificates. The tool must support configurable certificate verification (trust custom CA bundles or skip verification for known internal hosts).
- Varied authentication mechanisms — Each platform uses different auth: Kubernetes uses kubeconfig/service accounts, Synology uses session-based auth, Harvester uses K8s-style auth, bare metal uses IPMI credentials, Windows uses NTLM/Kerberos via WinRM.
- No rate limiting (typically) — On-prem APIs generally don't rate-limit, but may have connection limits or session caps.
- WinRM considerations — Windows machines require WinRM to be enabled and configured. The tool supports both HTTP (5985) and HTTPS (5986) transports, with NTLM or Kerberos authentication.
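Configurable certificate verification could be mapped onto Python's standard `ssl` module roughly as follows. The mode names `verify`, `custom_ca`, and `skip` are assumptions chosen to mirror the three options described in this document, and `build_ssl_context` is a hypothetical helper.

```python
import ssl
from typing import Optional


def build_ssl_context(mode: str, ca_path: Optional[str] = None) -> ssl.SSLContext:
    """Build an SSL context for one endpoint according to its certificate mode."""
    if mode == "verify":
        # Verify against the system trust store.
        return ssl.create_default_context()
    if mode == "custom_ca":
        if not ca_path:
            raise ValueError("custom_ca mode requires ca_path")
        # Trust only the supplied CA bundle (for example, an internal CA).
        return ssl.create_default_context(cafile=ca_path)
    if mode == "skip":
        context = ssl.create_default_context()
        # check_hostname must be disabled before verify_mode can be relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    raise ValueError(f"unknown certificate mode: {mode!r}")


skip_context = build_ssl_context("skip")
strict_context = build_ssl_context("verify")
```

Skipping verification removes protection against impersonation, so restricting it to explicitly listed internal hosts, as the bullet above suggests, is the safer default.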
Retry Strategy
- Backoff: Exponential with jitter, delay = min(base * 2^attempt + random_jitter, max_delay)
- Base delay: 1 second
- Max delay: 30 seconds
- Max attempts: 3 per resource
- Idempotency: All discovery operations are read-only, safe to retry
- Connection timeout: 30 seconds per endpoint (configurable per platform)
- Certificate handling: Configurable per scan profile (verify, skip, or custom CA path)
- WinRM timeout: 60 seconds per operation (WMI queries can be slow on large systems)
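The backoff formula and attempt limit above can be sketched as follows. Drawing jitter uniformly from [0, base) and counting "max attempts" as three total calls are assumptions; `TransientError` and `call_with_retry` are hypothetical names.

```python
import random
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Stand-in for timeouts and other retryable platform errors."""


def backoff_delay(attempt: int, base: float = 1.0, max_delay: float = 30.0,
                  jitter: Optional[float] = None) -> float:
    """delay = min(base * 2^attempt + random_jitter, max_delay)."""
    if jitter is None:
        jitter = random.uniform(0, base)  # assumption: jitter range is [0, base)
    return min(base * 2 ** attempt + jitter, max_delay)


def call_with_retry(operation: Callable, max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep):
    """Run a read-only operation, retrying transient failures with backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # all attempts exhausted; caller logs the warning
            sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the helper testable without real delays; retrying is safe here only because discovery operations are read-only.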
Logging Levels
- ERROR: Authentication failures, connection loss, terraform binary missing, Authentik SSO failure, WinRM connection refused
- WARNING: Unsupported resource types, skipped resources, unmapped state entries, unresolved references, self-signed certificate warnings, Hyper-V role not installed
- INFO: Scan progress, resource counts, file generation, validation results, architecture detection, Windows feature availability
- DEBUG: Individual API calls, attribute mapping details, reference resolution steps, Authentik token lifecycle, WMI query details
Testing Strategy
Unit Tests
Unit tests cover specific examples, edge cases, and error conditions:
- Identifier sanitization: Specific edge cases (empty string, all-digits, unicode, reserved words)
- HCL template rendering: Specific resource types with known expected output (K8s deployments, Synology shares, Windows services, Harvester VMs)
- State file JSON structure: Specific entries with known expected format
- Error message formatting: Specific error scenarios with expected message content
- Configuration validation: Specific invalid profiles with expected error lists
- Architecture detection: Specific platform responses mapped to correct CpuArchitecture values
- Platform category mapping: Verify each provider maps to correct PlatformCategory
- Windows resource parsing: Specific WMI query results mapped to correct resource structures
- WinRM credential validation: Specific credential formats (NTLM, Kerberos) validated correctly
Property-Based Tests
Property-based tests verify universal properties across randomly generated inputs. This feature is well-suited to PBT because it involves:
- Pure data transformations (resource → HCL, resource → state entry)
- Graph algorithms (topological sort, cycle detection)
- String sanitization (arbitrary input → valid identifier)
- Set operations (filtering, diffing, merging)
Library: Hypothesis (Python PBT framework)
Configuration:
- Minimum 100 iterations per property test
- Each test tagged with: Feature: iac-reverse-engineering, Property {number}: {property_text}
- Custom strategies for generating:
- DiscoveredResource instances with valid and edge-case attributes across all platform types
- Resources with varying CpuArchitecture values (AMD64, ARM, AArch64)
- Dependency graphs (both acyclic and cyclic)
- Scan profiles for all on-premises providers (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows)
- Pairs of scan results for diff testing
- Authentik configuration resources
- Windows-specific resources (services, IIS sites, scheduled tasks, Hyper-V VMs)
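A custom strategy and a tagged property test might be shaped like the sketch below, which requires the Hypothesis package. The `DiscoveredResource` and `CpuArchitecture` definitions here are simplified stand-ins for the real models, and the partition check is only a placeholder property.

```python
from dataclasses import dataclass
from enum import Enum

from hypothesis import given, settings, strategies as st


class CpuArchitecture(Enum):  # simplified stand-in
    AMD64 = "amd64"
    ARM = "arm"
    AARCH64 = "aarch64"


@dataclass(frozen=True)
class DiscoveredResource:  # simplified stand-in with hypothetical fields
    resource_id: str
    resource_type: str
    architecture: CpuArchitecture


# Custom strategy: resources with arbitrary IDs across types and architectures.
discovered_resources = st.builds(
    DiscoveredResource,
    resource_id=st.text(min_size=1, max_size=20),
    resource_type=st.sampled_from(["kubernetes_deployment", "docker_service"]),
    architecture=st.sampled_from(CpuArchitecture),
)


@settings(max_examples=100)  # minimum iteration count from the configuration above
@given(st.lists(discovered_resources, max_size=10))
def test_partition_by_type(resources):
    # Feature: iac-reverse-engineering, Property 21: Filtering correctness
    kept = [r for r in resources if r.resource_type == "docker_service"]
    dropped = [r for r in resources if r.resource_type != "docker_service"]
    assert all(r.resource_type == "docker_service" for r in kept)
    assert len(kept) + len(dropped) == len(resources)
```

Keeping strategies such as `discovered_resources` in one shared module lets every property test draw from the same generators.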
Property test coverage (referencing design properties):
- Properties 1–5: Scanner behavior properties
- Properties 6–10: Dependency resolution and reference properties
- Properties 11–15: Code generation properties
- Properties 16–17: State building properties
- Properties 18–21: Multi-provider, configuration, and filtering properties
- Properties 22–26: Incremental scan and validation properties
Integration Tests
Integration tests verify end-to-end behavior with mocked platform APIs:
- Full pipeline: scan → resolve → generate → build state → validate
- Multi-provider merge with realistic resource structures from different platform types
- Terraform validation (requires terraform binary)
- Incremental scan with stored snapshots
- Error recovery: connection loss mid-scan, terraform validation failures
- Authentik SSO flow (mocked Authentik instance)
- Architecture-aware code generation (mixed AMD64/AArch64 environments)
- Platform-specific discovery patterns (container vs storage vs HCI vs Windows)
- Windows discovery via mocked WinRM (services, IIS, scheduled tasks, Hyper-V)
Test Organization
tests/
├── unit/
│   ├── test_identifier_sanitization.py
│   ├── test_hcl_templates.py
│   ├── test_state_format.py
│   ├── test_config_validation.py
│   ├── test_architecture_detection.py
│   ├── test_platform_category.py
│   └── test_windows_resource_parsing.py
├── property/
│   ├── test_scanner_properties.py
│   ├── test_dependency_resolver_properties.py
│   ├── test_code_generator_properties.py
│   ├── test_state_builder_properties.py
│   ├── test_incremental_scan_properties.py
│   └── strategies.py    # Custom Hypothesis strategies
└── integration/
    ├── test_full_pipeline.py
    ├── test_multi_provider.py
    ├── test_terraform_validation.py
    ├── test_authentik_sso.py
    └── mocks/
        ├── docker_swarm_mock.py
        ├── kubernetes_mock.py
        ├── synology_mock.py
        ├── harvester_mock.py
        ├── bare_metal_mock.py
        ├── windows_mock.py
        └── authentik_mock.py