diff --git a/.kiro/specs/iac-reverse-engineering/design.md b/.kiro/specs/iac-reverse-engineering/design.md new file mode 100644 index 0000000..acf0c04 --- /dev/null +++ b/.kiro/specs/iac-reverse-engineering/design.md @@ -0,0 +1,1122 @@ +# Design Document: IaC Reverse Engineering + +## Overview + +This design describes a CLI tool that reverse-engineers existing on-premises infrastructure into well-structured Terraform HCL code and state files. The tool connects to on-premises platform APIs (Docker Swarm, Kubernetes, Synology Disk Station, SUSE Harvester, Windows machines, and bare metal servers), discovers deployed resources, resolves inter-resource dependencies, generates idiomatic Terraform code organized by resource type, and produces a valid state file so Terraform recognizes existing resources without attempting recreation. + +The tool is designed exclusively for on-premises environments — no cloud provider support is included. It handles the unique characteristics of different platform types: container orchestration (Docker Swarm, Kubernetes), storage appliances (Synology), HCI (SUSE Harvester), Windows machines, and bare metal servers. All resources are tracked with CPU architecture awareness (ARM, AMD64, AArch64) to support heterogeneous infrastructure environments. 
+ +The target infrastructure consists of: +- **Raspberry Pi cluster (ARM/AArch64)** — Kubernetes and Docker Swarm nodes for container orchestration +- **Dell PowerEdge servers (AMD64)** — SUSE Harvester HCI nodes providing virtualization and storage +- **Synology NAS** — Storage appliance for shared storage, backups, and media +- **Standalone Windows machines** — Running various services (IIS, scheduled tasks, Hyper-V VMs) +- **Authentik** — Identity provider for SSO across all managed infrastructure + +Authentication and SSO are handled through Authentik, which serves as the identity provider for the tool itself and is also discoverable as managed infrastructure (Authentik configurations, flows, providers, and applications can be reverse-engineered into IaC). + +The tool is implemented in Python for its rich ecosystem of infrastructure libraries (kubernetes-client, docker-sdk, pywinrm), rapid development cycle, and strong support for graph algorithms. It follows a pipeline architecture where each stage transforms data from the previous stage, enabling clear separation of concerns and independent testability. + +### Key Design Decisions + +1. **Python as implementation language** — Rich infrastructure SDK ecosystem (kubernetes-client, docker-sdk-python, pywinrm for Windows, python-synology), strong graph libraries (networkx), and Jinja2 for HCL templating. +2. **Pipeline architecture** — Each component (Scanner → Dependency Resolver → Code Generator → State Builder) operates on well-defined data structures, enabling independent testing and extension. +3. **Provider plugin system** — Each on-premises platform is implemented as a plugin conforming to a common interface, making it straightforward to add new platforms. +4. **Platform type categorization** — Providers are categorized by platform type (container orchestration, storage appliance, HCI, windows, bare metal) to handle their distinct resource models and discovery patterns. +5. 
**CPU architecture tracking** — Every discovered resource carries architecture metadata (ARM, AMD64, AArch64) enabling architecture-aware code generation and resource organization. +6. **Authentik as identity provider** — The tool authenticates users via Authentik SSO, and Authentik itself is a discoverable infrastructure target whose configurations can be reverse-engineered into IaC. +7. **Terraform state format v4** — Direct JSON generation of state files rather than relying on `terraform import` for each resource, enabling bulk operations. +8. **Incremental scan via snapshot diffing** — Store scan results as timestamped JSON snapshots and compute diffs for incremental updates. +9. **Windows discovery via WinRM/WMI** — Uses pywinrm library to connect to Windows machines and discover services, scheduled tasks, IIS sites, network configuration, installed software, Windows features, and Hyper-V VMs. + +## Architecture + +The system follows a staged pipeline architecture with clear data flow between components: + +```mermaid +graph TD + A[Scan Profile Config] --> B[Scanner] + B --> C[Resource Inventory] + C --> D[Dependency Resolver] + D --> E[Dependency Graph] + E --> F[Code Generator] + F --> G[HCL Files] + E --> H[State Builder] + H --> I[State File] + G --> J[Validator] + I --> J + J --> K[Validation Report] + + subgraph "Provider Plugins" + B --> P3[Docker Swarm Plugin] + B --> P4[Kubernetes Plugin] + B --> P5[Synology Plugin] + B --> P6[Harvester Plugin] + B --> P7[Bare Metal Plugin] + B --> P8[Windows Plugin] + end + + subgraph "Authentication" + AU[Authentik SSO] --> B + AU --> AUD[Authentik Discovery Plugin] + end + + subgraph "Incremental Scan" + L[Previous Snapshot] --> M[Diff Engine] + C --> M + M --> N[Change Set] + end +``` + +### Component Interaction Flow + +```mermaid +sequenceDiagram + participant User + participant Authentik + participant CLI + participant Scanner + participant DependencyResolver + participant CodeGenerator + participant 
StateBuilder + participant Validator + + User->>Authentik: Authenticate via SSO + Authentik-->>CLI: OAuth2/OIDC token + User->>CLI: Provide Scan Profile + CLI->>Scanner: Start discovery + Scanner->>Scanner: Connect to platform API + Scanner->>Scanner: Enumerate resources + Scanner->>Scanner: Detect CPU architecture + Scanner-->>CLI: Progress updates + Scanner->>DependencyResolver: Resource Inventory + DependencyResolver->>DependencyResolver: Build dependency graph + DependencyResolver->>DependencyResolver: Detect cycles + DependencyResolver->>CodeGenerator: Dependency Graph + CodeGenerator->>CodeGenerator: Generate HCL files + CodeGenerator->>CodeGenerator: Extract variables + CodeGenerator->>CodeGenerator: Apply architecture tags + CodeGenerator->>StateBuilder: Resource mappings + StateBuilder->>StateBuilder: Build state entries + StateBuilder->>Validator: State file + CodeGenerator->>Validator: HCL files + Validator->>Validator: terraform init/validate/plan + Validator-->>User: Validation report +``` + +## Components and Interfaces + +### 1. Scanner + +The Scanner is responsible for connecting to on-premises platform APIs and discovering resources. Each platform type has distinct discovery patterns. 
+ +```python +from abc import ABC, abstractmethod +from dataclasses import dataclass, field +from typing import Optional +from enum import Enum + + +class ProviderType(Enum): + DOCKER_SWARM = "docker_swarm" + KUBERNETES = "kubernetes" + SYNOLOGY = "synology" + HARVESTER = "harvester" + BARE_METAL = "bare_metal" + WINDOWS = "windows" + + +class PlatformCategory(Enum): + """Categorizes providers by their infrastructure model.""" + CONTAINER_ORCHESTRATION = "container" # Docker Swarm, Kubernetes + STORAGE_APPLIANCE = "storage" # Synology Disk Station + HCI = "hci" # SUSE Harvester (Hyper-Converged Infrastructure) + BARE_METAL = "bare_metal" # Physical servers (Linux) + WINDOWS = "windows" # Standalone Windows machines + + +PROVIDER_PLATFORM_MAP: dict[ProviderType, PlatformCategory] = { + ProviderType.DOCKER_SWARM: PlatformCategory.CONTAINER_ORCHESTRATION, + ProviderType.KUBERNETES: PlatformCategory.CONTAINER_ORCHESTRATION, + ProviderType.SYNOLOGY: PlatformCategory.STORAGE_APPLIANCE, + ProviderType.HARVESTER: PlatformCategory.HCI, + ProviderType.BARE_METAL: PlatformCategory.BARE_METAL, + ProviderType.WINDOWS: PlatformCategory.WINDOWS, +} + + +class CpuArchitecture(Enum): + """CPU architecture of the host or resource.""" + AMD64 = "amd64" + ARM = "arm" + AARCH64 = "aarch64" + + +@dataclass +class ScanProfile: + provider: ProviderType + credentials: dict[str, str] # Provider-specific auth (API tokens, usernames, etc.) + endpoints: Optional[list[str]] = None # API endpoints / host addresses + resource_type_filters: Optional[list[str]] = None # None means all types + authentik_token: Optional[str] = None # SSO token from Authentik + + def validate(self) -> list[str]: + """Returns list of validation errors, empty if valid.""" + ... 
+ + @property + def platform_category(self) -> PlatformCategory: + return PROVIDER_PLATFORM_MAP[self.provider] + + +@dataclass +class DiscoveredResource: + resource_type: str # e.g., "kubernetes_deployment", "windows_iis_site" + unique_id: str # Provider-assigned unique identifier + name: str # Human-readable name or tag + provider: ProviderType + platform_category: PlatformCategory + architecture: CpuArchitecture # CPU architecture of the resource/host + endpoint: str # Which API endpoint this was discovered from + attributes: dict # Full configuration attributes + raw_references: list[str] # IDs referenced by this resource (pre-resolution) + + +@dataclass +class ScanResult: + resources: list[DiscoveredResource] + warnings: list[str] + errors: list[str] + scan_timestamp: str + profile_hash: str # Hash of scan profile for matching incremental scans + is_partial: bool = False # True if scan was interrupted + + +@dataclass +class ScanProgress: + current_resource_type: str + resources_discovered: int + resource_types_completed: int + total_resource_types: int + + +class ProviderPlugin(ABC): + """Interface that all provider plugins must implement.""" + + @abstractmethod + def authenticate(self, credentials: dict[str, str]) -> None: + """Authenticate with the platform API. Raises AuthenticationError on failure.""" + ... + + @abstractmethod + def get_platform_category(self) -> PlatformCategory: + """Return the platform category for this provider.""" + ... + + @abstractmethod + def list_endpoints(self) -> list[str]: + """Return all reachable endpoints/hosts for this provider.""" + ... + + @abstractmethod + def list_supported_resource_types(self) -> list[str]: + """Return all resource types this plugin can discover.""" + ... + + @abstractmethod + def detect_architecture(self, endpoint: str) -> CpuArchitecture: + """Detect the CPU architecture of the target host/node.""" + ... 
+ + @abstractmethod + def discover_resources( + self, + endpoints: list[str], + resource_types: list[str], + progress_callback: callable + ) -> ScanResult: + """Discover resources. Calls progress_callback with ScanProgress updates.""" + ... +``` + +### 2. Windows Provider Plugin + +The Windows plugin discovers infrastructure on standalone Windows machines via WinRM/WMI using the pywinrm library. + +```python +class WindowsDiscoveryPlugin(ProviderPlugin): + """Discovers Windows machine configurations via WinRM/WMI. + + Discovers: Windows services, scheduled tasks, IIS sites/app pools, + network configuration, installed software, Windows features, + and Hyper-V VMs (if Hyper-V role is present). + + Uses pywinrm for connectivity and WMI/CIM queries for discovery. + """ + + def get_platform_category(self) -> PlatformCategory: + return PlatformCategory.WINDOWS + + def list_supported_resource_types(self) -> list[str]: + return [ + "windows_service", + "windows_scheduled_task", + "windows_iis_site", + "windows_iis_app_pool", + "windows_network_adapter", + "windows_firewall_rule", + "windows_installed_software", + "windows_feature", + "windows_hyperv_vm", + "windows_hyperv_switch", + "windows_dns_record", + "windows_local_user", + "windows_local_group", + ] + + def authenticate(self, credentials: dict[str, str]) -> None: + """Authenticate via WinRM using NTLM or Kerberos. + + Expected credentials: + - host: Target Windows machine hostname/IP + - username: Windows username (DOMAIN\\user or user@domain) + - password: Windows password + - transport: "ntlm" (default) or "kerberos" + - port: WinRM port (default 5985 for HTTP, 5986 for HTTPS) + - use_ssl: "true" or "false" (default "true") + """ + ... + + def detect_architecture(self, endpoint: str) -> CpuArchitecture: + """Detect architecture via WMI Win32_Processor query.""" + ... 
+ + def discover_resources( + self, + endpoints: list[str], + resource_types: list[str], + progress_callback: callable + ) -> ScanResult: + """Discover Windows resources via WinRM/WMI queries. + + Uses CIM sessions for efficient bulk queries. + Discovers Hyper-V resources only if the Hyper-V role is installed. + """ + ... +``` + +### 3. Authentik Integration + +Authentik serves dual roles: authenticating users of the tool via SSO, and being a discoverable infrastructure target. + +```python +@dataclass +class AuthentikConfig: + base_url: str # Authentik instance URL + client_id: str # OAuth2 client ID for this tool + client_secret: str # OAuth2 client secret + +@dataclass +class AuthentikSession: + access_token: str + refresh_token: str + user_id: str + groups: list[str] + +class AuthentikAuthProvider: + """Handles SSO authentication for the tool itself.""" + + def authenticate_user(self, config: AuthentikConfig) -> AuthentikSession: + """Initiate OAuth2/OIDC flow with Authentik. Returns session on success.""" + ... + + def refresh_session(self, session: AuthentikSession) -> AuthentikSession: + """Refresh an expired session token.""" + ... + + def validate_token(self, token: str) -> bool: + """Validate an existing token is still valid.""" + ... + + +class AuthentikDiscoveryPlugin(ProviderPlugin): + """Discovers Authentik configurations as infrastructure resources. + + Discovers: flows, stages, providers, applications, outposts, + property mappings, certificates, and SSO integrations with + other managed platforms. + """ + + def list_supported_resource_types(self) -> list[str]: + return [ + "authentik_flow", + "authentik_stage", + "authentik_provider", + "authentik_application", + "authentik_outpost", + "authentik_property_mapping", + "authentik_certificate", + "authentik_group", + "authentik_source", + ] + ... +``` + +### 4. Dependency Resolver + +Analyzes resource relationships and produces a topological ordering. 
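The ordering and cycle detection this component performs can be sketched as follows. This is a minimal illustration, not the resolver itself: it uses the standard-library `graphlib` module instead of networkx (which the design names) only so that it runs without third-party packages. The helper name `order_resources` and the `depends_on` mapping are hypothetical; the resource IDs are borrowed from this document's examples.

```python
from graphlib import CycleError, TopologicalSorter


def order_resources(depends_on: dict[str, set[str]]) -> tuple[list[str], list[list[str]]]:
    """Return (topological_order, cycles) for a map of resource ID -> IDs it depends on.

    Hypothetical helper for illustration; not part of the design's interfaces.
    """
    sorter = TopologicalSorter(depends_on)
    try:
        # static_order() yields dependencies before the resources that need them.
        return list(sorter.static_order()), []
    except CycleError as exc:
        # graphlib reports one detected cycle in args[1], with the first node
        # repeated at the end; drop the repeat for reporting.
        cycle = list(exc.args[1])
        return [], [cycle[:-1]]


edges = {
    "apps/v1/deployments/default/nginx": {"default/services/nginx-svc"},
    "default/services/nginx-svc": set(),
}
order, cycles = order_resources(edges)
```

Note that `graphlib` reports only one cycle per failure, so a resolver that must list every cycle would need a different algorithm (for example, enumerating strongly connected components). The component's data structures and interface follow.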
+ +```python +@dataclass +class ResourceRelationship: + source_id: str # Resource that holds the reference + target_id: str # Resource being referenced + relationship_type: str # "parent-child", "reference", "dependency" + source_attribute: str # Attribute in source that holds the reference + +# Defined before DependencyGraph so the annotation below resolves at class-creation time +@dataclass +class UnresolvedReference: + source_resource_id: str + source_attribute: str + referenced_id: str # The ID that couldn't be resolved + suggested_resolution: str # "data_source" or "variable" + +@dataclass +class DependencyGraph: + resources: list[DiscoveredResource] + relationships: list[ResourceRelationship] + topological_order: list[str] # Resource IDs in dependency order + cycles: list[list[str]] # Detected cycles (list of resource ID chains) + unresolved_references: list[UnresolvedReference] + +@dataclass +class CycleReport: + # Assumed shape; this design does not otherwise specify CycleReport's fields + cycle: list[str] # Resource IDs forming the cycle + suggested_resolution: str # Suggested way to break the cycle + +class DependencyResolverInterface: + def resolve(self, inventory: ScanResult) -> DependencyGraph: + """Analyze relationships and produce dependency graph.""" + ... + + def detect_cycles(self, graph: DependencyGraph) -> list[CycleReport]: + """Detect and report circular dependencies with resolution suggestions.""" + ... +``` + +### 5. Code Generator + +Produces Terraform HCL files from the dependency graph. Architecture-aware: generates architecture tags and organizes resources by platform category. 
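Generated resource names must be valid Terraform identifiers, which this design pins down with the regex `^[a-zA-Z_][a-zA-Z0-9_]*$`. One possible way to meet that constraint is sketched below. The fallback name `resource` and the leading-underscore prefix for names that start with a digit are assumptions of this sketch, not decisions recorded in the design.

```python
import re
import unicodedata


def sanitize_identifier(name: str) -> str:
    """Convert an arbitrary resource name into a Terraform-safe identifier."""
    # Fold accented characters to ASCII where possible; drop everything else.
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Replace each run of disallowed characters with a single underscore.
    ident = re.sub(r"[^a-zA-Z0-9_]+", "_", ascii_name).strip("_")
    # Assumed fallback for names with no usable characters.
    if not ident:
        return "resource"
    # Identifiers may not start with a digit; prefix with an underscore (assumption).
    if ident[0].isdigit():
        ident = "_" + ident
    return ident
```

For example, `sanitize_identifier("Default Web Site")` gives `Default_Web_Site`. Distinct names can collapse to the same identifier (`nginx-svc` and `nginx svc` both become `nginx_svc`), so a real generator would also need a de-duplication step. The component's data structures and interface follow.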
+ +```python +@dataclass +class GeneratedFile: + filename: str # e.g., "kubernetes_deployment.tf", "windows_service.tf" + content: str # HCL content + resource_count: int + +@dataclass +class ExtractedVariable: + name: str # Variable name + type_expr: str # Terraform type expression + default_value: str # Most common value + description: str + used_by: list[str] # Resource IDs using this variable + +@dataclass +class CodeGenerationResult: + resource_files: list[GeneratedFile] + variables_file: GeneratedFile + provider_file: GeneratedFile + outputs_file: Optional[GeneratedFile] + skipped_resources: list[tuple[str, str]] # (resource_id, reason) + +class CodeGeneratorInterface: + def generate(self, graph: DependencyGraph, profiles: list[ScanProfile]) -> CodeGenerationResult: + """Generate Terraform HCL from dependency graph. + + Architecture-aware: includes architecture tags/labels on resources, + organizes provider blocks by platform category. + """ + ... + + def sanitize_identifier(self, name: str) -> str: + """Convert resource name to valid Terraform identifier.""" + ... + + def extract_variables(self, resources: list[DiscoveredResource]) -> list[ExtractedVariable]: + """Identify common values to extract as variables.""" + ... + + def generate_architecture_tags(self, resource: DiscoveredResource) -> dict[str, str]: + """Generate architecture-specific tags/labels for a resource.""" + ... +``` + +### 6. State Builder + +Generates Terraform state file (format version 4). 
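To make the target format concrete, the sketch below assembles a state-v4-shaped JSON document from plain dictionaries. The top-level and per-resource field names mirror the state example in this document; the function name `build_state` and the input keys (`resource_type`, `resource_name`, `provider_address`, and so on) are hypothetical. Whether Terraform accepts such a file still has to be confirmed by running `terraform plan` against it.

```python
import json
import uuid


def build_state(entries: list[dict], terraform_version: str) -> str:
    """Serialize resource entries into a state-v4-shaped JSON document.

    Illustrative only; input keys are assumptions of this sketch.
    """
    resources = []
    for entry in entries:
        resources.append({
            "mode": "managed",
            "type": entry["resource_type"],
            "name": entry["resource_name"],
            "provider": entry["provider_address"],
            "instances": [{
                "schema_version": entry.get("schema_version", 0),
                "attributes": entry["attributes"],
                "sensitive_attributes": entry.get("sensitive_attributes", []),
                "dependencies": entry.get("dependencies", []),
            }],
        })
    state = {
        "version": 4,
        "terraform_version": terraform_version,
        "serial": 1,
        # A fresh lineage UUID identifies this state's history.
        "lineage": str(uuid.uuid4()),
        "outputs": {},
        "resources": resources,
    }
    return json.dumps(state, indent=2)


state_json = build_state(
    [{
        "resource_type": "kubernetes_deployment",
        "resource_name": "nginx",
        "provider_address": 'provider["registry.terraform.io/hashicorp/kubernetes"]',
        "schema_version": 1,
        "attributes": {"id": "apps/v1/deployments/default/nginx"},
        "dependencies": ["kubernetes_service.nginx_svc"],
    }],
    terraform_version="1.7.0",
)
```

The component's data structures and interface follow.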
+ +```python +@dataclass +class StateEntry: + resource_type: str + resource_name: str # Terraform identifier name + provider_id: str # Provider-assigned unique ID + attributes: dict # Full attribute set + sensitive_attributes: list[str] + schema_version: int + dependencies: list[str] # Terraform resource addresses of dependencies + +@dataclass +class StateFile: + version: int = 4 + terraform_version: str = "" + serial: int = 1 + lineage: str = "" # UUID + resources: list[StateEntry] = field(default_factory=list) + + def to_json(self) -> str: + """Serialize to Terraform state JSON format.""" + ... + +class StateBuilderInterface: + def build(self, code_result: CodeGenerationResult, graph: DependencyGraph, provider_version: str) -> StateFile: + """Build state file from generated code and dependency graph.""" + ... +``` + +### 7. Validator + +Runs Terraform commands to validate generated output. + +```python +@dataclass +class ValidationResult: + init_success: bool + validate_success: bool + plan_success: bool + planned_changes: list[PlannedChange] + errors: list[ValidationError] + correction_attempts: int + +@dataclass +class PlannedChange: + resource_address: str + change_type: str # "add", "modify", "destroy" + details: str + +@dataclass +class ValidationError: + file: str + message: str + line: Optional[int] = None + +class ValidatorInterface: + def validate(self, output_dir: str, max_correction_attempts: int = 3) -> ValidationResult: + """Run terraform init, validate, and plan. Attempt corrections if needed.""" + ... +``` + +### 8. Incremental Scan Engine + +Compares current scan results against previous snapshots. 
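The comparison can be reduced to set operations on resource IDs, as in the following sketch. It simplifies each scan to a mapping of `unique_id` to attributes; the function name `classify_changes` and that simplified input shape are assumptions made for illustration, not the engine's actual signature.

```python
def classify_changes(
    previous: dict[str, dict], current: dict[str, dict]
) -> dict[str, list[str]]:
    """Classify resource IDs as added, removed, or modified.

    Both arguments map unique_id -> attributes (a simplification of ScanResult).
    """
    # IDs only in the current scan are additions; IDs only in the previous scan are removals.
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    # IDs in both scans count as modified only if their attributes differ.
    modified = sorted(
        rid for rid in current.keys() & previous.keys()
        if current[rid] != previous[rid]
    )
    return {"added": added, "removed": removed, "modified": modified}


previous_scan = {
    "apps/v1/deployments/default/nginx": {"replicas": 2},
    "default/services/nginx-svc": {"port": 80},
    "default/services/old-svc": {"port": 8080},
}
current_scan = {
    "apps/v1/deployments/default/nginx": {"replicas": 3},
    "default/services/nginx-svc": {"port": 80},
    "default/configmaps/app-config": {"keys": 4},
}
changes = classify_changes(previous_scan, current_scan)
```

Resources that appear in both scans with identical attributes fall into none of the three lists. The component's data structures and interface follow.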
+ +```python +class ChangeType(Enum): + ADDED = "added" + REMOVED = "removed" + MODIFIED = "modified" + +@dataclass +class ResourceChange: + resource_id: str + resource_type: str + resource_name: str + change_type: ChangeType + changed_attributes: Optional[dict] = None # For MODIFIED, old->new values + +@dataclass +class ChangeSummary: + added_count: int + removed_count: int + modified_count: int + changes: list[ResourceChange] + +class IncrementalScanEngine: + def compare(self, current: ScanResult, previous: ScanResult) -> ChangeSummary: + """Compare two scan results and classify changes.""" + ... + + def store_snapshot(self, result: ScanResult, profile_hash: str) -> None: + """Persist scan result for future comparison.""" + ... + + def load_previous(self, profile_hash: str) -> Optional[ScanResult]: + """Load most recent previous scan for this profile.""" + ... +``` + +## Data Models + +### Platform Type Differentiation + +Each provider type maps to a platform category that determines discovery patterns: + +| Platform Category | Providers | Resource Model | Discovery Pattern | +|---|---|---|---| +| Container Orchestration | Docker Swarm, Kubernetes | Services, deployments, pods, volumes, networks, configs | Docker/K8s API listing of workloads, services, and cluster resources | +| Storage Appliance | Synology Disk Station | Volumes, shares, pools, replication tasks, users | Synology DSM API for storage pools, shared folders, packages | +| HCI | SUSE Harvester | VMs, volumes, images, networks (combines hypervisor + storage) | Harvester/K8s-based API for VM and storage resources | +| Bare Metal | Physical servers (Linux) | Hardware inventory, IPMI/BMC configs, network interfaces, RAID | IPMI/Redfish API for hardware discovery, network config | +| Windows | Standalone Windows machines | Services, scheduled tasks, IIS sites, network config, software, features, Hyper-V VMs | WinRM/WMI queries via pywinrm for system configuration discovery | + +### CPU Architecture Model 
+ +Architecture is tracked at the host/node level and inherited by resources running on that host: + +| Architecture | Description | Common Platforms | +|---|---|---| +| AMD64 | x86-64 / Intel 64 | Dell PowerEdge servers (Harvester nodes), Windows machines | +| ARM | 32-bit ARM | Older embedded devices, some Synology NAS models | +| AArch64 | 64-bit ARM (ARMv8+) | Raspberry Pi cluster nodes (K8s/Docker Swarm), some Synology models | + +### Scan Profile Configuration (YAML) + +```yaml +# scan_profile.yaml - Kubernetes example (Raspberry Pi cluster) +provider: kubernetes +credentials: + kubeconfig_path: "${HOME}/.kube/config" + context: "pi-cluster" +endpoints: + - "https://k8s-api.internal.lab:6443" +resource_type_filters: + - kubernetes_deployment + - kubernetes_service + - kubernetes_ingress + - kubernetes_config_map + - kubernetes_persistent_volume +authentik: + base_url: "https://auth.internal.lab" + client_id: "iac-reverse-tool" +``` + +```yaml +# scan_profile.yaml - Synology NAS example +provider: synology +credentials: + host: "nas01.internal.lab" + port: 5001 + username: "${SYNOLOGY_USER}" + password: "${SYNOLOGY_PASSWORD}" +endpoints: + - "nas01.internal.lab:5001" +resource_type_filters: + - synology_shared_folder + - synology_volume + - synology_storage_pool +``` + +```yaml +# scan_profile.yaml - Windows machine example +provider: windows +credentials: + host: "win-server-01.internal.lab" + username: "${WINDOWS_USER}" + password: "${WINDOWS_PASSWORD}" + transport: "ntlm" + use_ssl: "true" + port: "5986" +endpoints: + - "win-server-01.internal.lab" +resource_type_filters: + - windows_service + - windows_scheduled_task + - windows_iis_site + - windows_iis_app_pool + - windows_feature + - windows_hyperv_vm +``` + +```yaml +# scan_profile.yaml - SUSE Harvester example (Dell PowerEdge) +provider: harvester +credentials: + kubeconfig_path: "${HOME}/.kube/harvester-config" + context: "harvester-cluster" +endpoints: + - "https://harvester.internal.lab:6443" 
+resource_type_filters: + - harvester_virtualmachine + - harvester_volume + - harvester_image + - harvester_network +``` + +### Resource Inventory (Internal JSON) + +```json +{ + "scan_timestamp": "2024-01-15T10:30:00Z", + "profile_hash": "a1b2c3d4", + "is_partial": false, + "resources": [ + { + "resource_type": "kubernetes_deployment", + "unique_id": "apps/v1/deployments/default/nginx", + "name": "nginx", + "provider": "kubernetes", + "platform_category": "container", + "architecture": "aarch64", + "endpoint": "https://k8s-api.internal.lab:6443", + "attributes": { + "namespace": "default", + "replicas": 3, + "image": "nginx:1.25", + "node_selector": {"kubernetes.io/arch": "arm64"}, + "labels": {"app": "nginx", "arch": "aarch64"} + }, + "raw_references": ["default/services/nginx-svc"] + }, + { + "resource_type": "windows_iis_site", + "unique_id": "win-server-01/iis/sites/Default Web Site", + "name": "Default Web Site", + "provider": "windows", + "platform_category": "windows", + "architecture": "amd64", + "endpoint": "win-server-01.internal.lab", + "attributes": { + "site_name": "Default Web Site", + "physical_path": "C:\\inetpub\\wwwroot", + "bindings": [ + {"protocol": "https", "port": 443, "hostname": "app.internal.lab"} + ], + "app_pool": "DefaultAppPool", + "state": "Started" + }, + "raw_references": ["win-server-01/iis/app_pools/DefaultAppPool"] + }, + { + "resource_type": "harvester_virtualmachine", + "unique_id": "harvester/vms/default/ubuntu-dev-01", + "name": "ubuntu-dev-01", + "provider": "harvester", + "platform_category": "hci", + "architecture": "amd64", + "endpoint": "https://harvester.internal.lab:6443", + "attributes": { + "namespace": "default", + "cpu": 4, + "memory": "8Gi", + "disk_size": "100Gi", + "network": "vlan-100", + "image": "ubuntu-22.04-server" + }, + "raw_references": ["harvester/images/ubuntu-22.04-server", "harvester/networks/vlan-100"] + } + ], + "warnings": [], + "errors": [] +} +``` + +### Terraform State File (Output Format v4) 
+ +```json +{ + "version": 4, + "terraform_version": "1.7.0", + "serial": 1, + "lineage": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", + "outputs": {}, + "resources": [ + { + "mode": "managed", + "type": "kubernetes_deployment", + "name": "nginx", + "provider": "provider[\"registry.terraform.io/hashicorp/kubernetes\"]", + "instances": [ + { + "schema_version": 1, + "attributes": { + "id": "apps/v1/deployments/default/nginx", + "metadata": { + "name": "nginx", + "namespace": "default", + "labels": {"app": "nginx", "arch": "aarch64"} + }, + "spec": { + "replicas": 3, + "template": { + "spec": { + "container": [{"image": "nginx:1.25"}], + "node_selector": {"kubernetes.io/arch": "arm64"} + } + } + } + }, + "sensitive_attributes": [], + "dependencies": [ + "kubernetes_service.nginx_svc" + ] + } + ] + } + ] +} +``` + +### Dependency Graph (Internal) + +```json +{ + "nodes": ["apps/v1/deployments/default/nginx", "default/services/nginx-svc", "win-server-01/iis/sites/Default Web Site", "win-server-01/iis/app_pools/DefaultAppPool"], + "edges": [ + {"source": "apps/v1/deployments/default/nginx", "target": "default/services/nginx-svc", "type": "reference", "attribute": "service_name"}, + {"source": "win-server-01/iis/sites/Default Web Site", "target": "win-server-01/iis/app_pools/DefaultAppPool", "type": "dependency", "attribute": "app_pool"} + ], + "topological_order": ["default/services/nginx-svc", "apps/v1/deployments/default/nginx", "win-server-01/iis/app_pools/DefaultAppPool", "win-server-01/iis/sites/Default Web Site"], + "cycles": [], + "unresolved_references": [] +} +``` + +### Scan Snapshot Storage + +Snapshots are stored as JSON files in a `.iac-reverse/snapshots/` directory: + +``` +.iac-reverse/ +├── snapshots/ +│ ├── a1b2c3d4_2024-01-15T10-30-00Z.json +│ └── a1b2c3d4_2024-01-14T09-00-00Z.json +└── config/ + └── scan_profiles/ +``` + + + +## Correctness Properties + +*A property is a characteristic or behavior that should hold true across all valid executions of a 
system: in essence, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.* + +### Property 1: Resource inventory completeness + +*For any* discovered resource from any on-premises provider (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows), the resulting inventory entry SHALL contain non-empty values for the resource_type, unique_id, name, provider, platform_category, architecture, and attributes fields. + +**Validates: Requirements 1.2** + +### Property 2: Authentication error descriptiveness + +*For any* provider type and any authentication failure reason, the error returned by the Scanner SHALL contain both the provider name string and the failure reason string. + +**Validates: Requirements 1.3** + +### Property 3: Graceful degradation on unsupported resource types + +*For any* scan request containing a mix of supported and unsupported resource types, the Scanner SHALL produce a warning for each unsupported type AND return a complete inventory for all supported types (the presence of unsupported types does not reduce the discovered set of supported resources). + +**Validates: Requirements 1.4** + +### Property 4: Progress reporting frequency + +*For any* scan across N resource types, the progress callback SHALL be invoked at least N times, once per resource type completion, with monotonically non-decreasing discovered resource counts (a resource type that yields no resources leaves the count unchanged). + +**Validates: Requirements 1.5** + +### Property 5: Partial inventory preservation on failure + +*For any* scan that is interrupted at an arbitrary point, the partial inventory SHALL contain exactly the set of resources that were successfully discovered before the failure point, with no duplicates and no resources from after the failure. 
+ +**Validates: Requirements 1.7** + +### Property 6: Dependency relationship identification + +*For any* resource inventory where resource A's attributes contain resource B's unique identifier, the Dependency Resolver SHALL produce a relationship edge from A to B with the correct relationship type and source attribute. + +**Validates: Requirements 2.1** + +### Property 7: Cycle detection correctness + +*For any* dependency graph containing a cycle, the Dependency Resolver SHALL report the cycle listing all resources involved. *For any* acyclic dependency graph, the Dependency Resolver SHALL report zero cycles. + +**Validates: Requirements 2.3** + +### Property 8: Topological order validity + +*For any* acyclic dependency graph, the topological order produced by the Dependency Resolver SHALL satisfy the constraint that for every edge (A depends on B), B appears before A in the ordering. + +**Validates: Requirements 2.4** + +### Property 9: Unresolved references become data sources or variables + +*For any* resource that references an identifier not present in the current inventory, the generated output SHALL represent that reference as a data source lookup or variable — never as a hardcoded literal identifier string. + +**Validates: Requirements 2.5** + +### Property 10: References in generated output use Terraform syntax + +*For any* resource that references another resource present in the inventory, the generated HCL SHALL use Terraform resource reference expressions (e.g., `kubernetes_service.name.id`) rather than hardcoded identifier strings. + +**Validates: Requirements 2.2, 3.5** + +### Property 11: Generated HCL syntactic validity + +*For any* valid resource inventory and dependency graph, the Code Generator SHALL produce output that parses as syntactically valid HCL (no syntax errors when parsed by an HCL parser). 
+ +**Validates: Requirements 3.1** + +### Property 12: File organization by resource type + +*For any* resource inventory containing resources of N distinct types, the Code Generator SHALL produce exactly N resource files, where each file contains only resource blocks of its designated type and every resource appears in exactly one file. + +**Validates: Requirements 3.2** + +### Property 13: Variable extraction for shared values + +*For any* attribute value that appears in 2 or more resources in the inventory, the Code Generator SHALL extract that value into a Terraform variable with a default set to the most commonly occurring value. + +**Validates: Requirements 3.3** + +### Property 14: Identifier sanitization validity + +*For any* input string (including strings with special characters, Unicode, leading digits, or spaces), the sanitize_identifier function SHALL produce a non-empty string matching the regex `^[a-zA-Z_][a-zA-Z0-9_]*$`. + +**Validates: Requirements 3.4** + +### Property 15: Traceability comments in generated code + +*For any* generated resource block, the output SHALL contain a comment including the original provider-assigned unique resource identifier for traceability. + +**Validates: Requirements 3.6** + +### Property 16: State file structural validity + +*For any* set of generated resources, the State Builder SHALL produce a JSON document with version=4, a valid UUID lineage, and a resources array where each entry has mode, type, name, provider, and instances fields conforming to the Terraform state v4 schema. + +**Validates: Requirements 4.1** + +### Property 17: State entry completeness and schema correctness + +*For any* resource with a known provider schema version and known sensitive attributes, the state entry SHALL have a schema_version equal to that provider schema version, contain all discovered attributes, and mark exactly the known sensitive attributes as sensitive. 
+ +**Validates: Requirements 4.4, 4.5** + +### Property 18: Multi-provider merge with naming conflict resolution + +*For any* two or more resource inventories from different on-premises providers where resource names collide, the merged inventory SHALL contain all resources from all providers, with conflicting names prefixed by the provider identifier, and no resources lost. + +**Validates: Requirements 5.3** + +### Property 19: Provider block generation + +*For any* resource set spanning N distinct on-premises providers, the generated provider configuration SHALL contain exactly N provider blocks, one per distinct provider. + +**Validates: Requirements 5.4** + +### Property 20: Scan profile validation completeness + +*For any* scan profile with K invalid fields (missing provider, empty credentials, unreachable endpoints, filters exceeding 200 entries, or unsupported resource types), the validation error SHALL list all K invalid fields in a single response. + +**Validates: Requirements 6.1, 6.6, 6.7** + +### Property 21: Filtering correctness + +*For any* scan profile with resource type filters and/or endpoint filters, the discovered resources SHALL be a subset where every resource's type is in the filter list (if specified) AND every resource's endpoint is in the endpoint list (if specified). No resource outside the filter criteria SHALL appear. + +**Validates: Requirements 6.2, 6.4** + +### Property 22: Drift report correctness + +*For any* `terraform plan` output containing N planned changes, the drift report SHALL list exactly N entries, each with the correct resource address and change type (add, modify, or destroy). + +**Validates: Requirements 7.3** + +### Property 23: Change classification correctness + +*For any* pair of scan results (previous and current), every resource that differs between the two SHALL be classified exactly once as: added (in current but not previous), removed (in previous but not current), or modified (in both but with differing attributes). Resources present in both with identical attributes SHALL NOT be classified as changes. 
The summary counts SHALL equal the actual number of resources in each category. + +**Validates: Requirements 8.1, 8.5** + +### Property 24: Incremental update scope + +*For any* change set applied to existing IaC files, only files containing added, modified, or removed resources SHALL be modified. Files containing only unchanged resources SHALL remain identical. + +**Validates: Requirements 8.2** + +### Property 25: Removed resource exclusion + +*For any* resource classified as removed, the updated IaC output SHALL not contain a resource block for that resource, AND the updated state file SHALL not contain a state entry for that resource. + +**Validates: Requirements 8.3** + +### Property 26: Snapshot retention + +*For any* sequence of N scans (N ≥ 2) for the same Scan_Profile, at least the two most recent scan results SHALL be retained in storage after each scan completes. + +**Validates: Requirements 8.6** + +## Error Handling + +### Error Categories + +| Category | Examples | Handling Strategy | +|----------|----------|-------------------| +| Authentication Failure | Invalid API tokens, expired credentials, Authentik SSO token expired, WinRM auth failure, insufficient permissions | Return descriptive error with provider name and reason. Do not retry. | +| Transient API Error | Rate limiting, timeout, temporary platform unavailability, WinRM connection timeout | Retry up to 3 times with exponential backoff. Log warning if all retries fail. | +| Connection Loss | Network partition, platform host unreachable, API endpoint down, WinRM session dropped | Return partial results with error indicating failure point. | +| Validation Error | Invalid scan profile, unsupported resource type, unreachable endpoint | Return all validation errors in a single response before attempting connection. | +| Generation Error | Unconvertible resource, missing attributes, unsupported architecture | Skip affected resource, log warning, continue with remaining resources. 
| +| External Tool Error | Terraform binary not found, terraform command failure | Report error with command name and failure details. | +| Authentik Error | SSO flow failure, token refresh failure, Authentik instance unreachable | Report authentication error, prompt re-authentication. | +| Windows-Specific Error | WinRM not enabled, WMI query failure, insufficient privileges, Hyper-V role not installed | Log warning for missing features, skip unavailable resource types, continue discovery. | + +### Error Propagation + +```mermaid +graph TD + A[Platform API Error] -->|Transient| B[Retry up to 3x] + A -->|Permanent| C[Log warning, skip resource] + B -->|All retries fail| C + A -->|Connection lost| D[Return partial inventory] + + E[Validation Error] --> F[Collect all errors] + F --> G[Return before execution] + + H[Generation Error] --> I[Skip resource] + I --> J[Log warning with resource ID and reason] + J --> K[Continue generation] + + L[Terraform Error] --> M{Correctable?} + M -->|Yes| N[Attempt correction, up to 3x] + M -->|No| O[Report to user] + N -->|Still failing| O + + P[Authentik Error] --> Q{Token expired?} + Q -->|Yes| R[Attempt token refresh] + Q -->|No| S[Report auth failure] + R -->|Refresh fails| S + + T[Windows Error] --> U{Feature missing?} + U -->|Yes| V[Skip resource type, log warning] + U -->|No| W[Retry or report] +``` + +### On-Premises Connectivity Patterns + +On-premises platforms have distinct connectivity characteristics compared to cloud APIs: + +- **Direct network access required** — No public internet endpoints; the tool must have network connectivity to each platform's management interface (K8s API server, Synology DSM, Harvester dashboard, IPMI/BMC interfaces, WinRM endpoints). +- **Self-signed certificates** — Many on-prem platforms use self-signed TLS certificates. The tool must support configurable certificate verification (trust custom CA bundles or skip verification for known internal hosts). 
+- **Varied authentication mechanisms** — Each platform uses different auth: Kubernetes uses kubeconfig/service accounts, Synology uses session-based auth, Harvester uses K8s-style auth, bare metal uses IPMI credentials, Windows uses NTLM/Kerberos via WinRM.
+- **No rate limiting (typically)** — On-prem APIs generally don't rate-limit, but may have connection limits or session caps.
+- **WinRM considerations** — Windows machines require WinRM to be enabled and configured. The tool supports both HTTP (5985) and HTTPS (5986) transports, with NTLM or Kerberos authentication.
+
+### Retry Strategy
+
+- **Backoff**: Exponential with jitter — `delay = min(base * 2^attempt + random_jitter, max_delay)`
+- **Base delay**: 1 second
+- **Max delay**: 30 seconds
+- **Max retries**: 3 per resource (after the initial attempt)
+- **Idempotency**: All discovery operations are read-only, safe to retry
+- **Connection timeout**: 30 seconds per endpoint (configurable per platform)
+- **Certificate handling**: Configurable per scan profile (verify, skip, or custom CA path)
+- **WinRM timeout**: 60 seconds per operation (WMI queries can be slow on large systems)
+
+### Logging Levels
+
+- **ERROR**: Authentication failures, connection loss, terraform binary missing, Authentik SSO failure, WinRM connection refused
+- **WARNING**: Unsupported resource types, skipped resources, unmapped state entries, unresolved references, self-signed certificate warnings, Hyper-V role not installed
+- **INFO**: Scan progress, resource counts, file generation, validation results, architecture detection, Windows feature availability
+- **DEBUG**: Individual API calls, attribute mapping details, reference resolution steps, Authentik token lifecycle, WMI query details
+
+## Testing Strategy
+
+### Unit Tests
+
+Unit tests cover specific examples, edge cases, and error conditions:
+
+- **Identifier sanitization**: Specific edge cases (empty string, all-digits, unicode, reserved words)
+- **HCL template rendering**: Specific resource types with known expected output (K8s deployments, Synology shares, Windows services, Harvester VMs)
+- **State file JSON structure**: Specific entries with known expected format
+- **Error message formatting**: Specific error scenarios with expected message content
+- **Configuration validation**: Specific invalid profiles with expected error lists
+- **Architecture detection**: Specific platform responses mapped to correct CpuArchitecture values
+- **Platform category mapping**: Verify each provider maps to correct PlatformCategory
+- **Windows resource parsing**: Specific WMI query results mapped to correct resource structures
+- **WinRM credential validation**: Specific credential formats (NTLM, Kerberos) validated correctly
+
+### Property-Based Tests
+
+Property-based tests verify universal properties across randomly generated inputs. This feature is well-suited to PBT because it involves:
+- Pure data transformations (resource → HCL, resource → state entry)
+- Graph algorithms (topological sort, cycle detection)
+- String sanitization (arbitrary input → valid identifier)
+- Set operations (filtering, diffing, merging)
+
+**Library**: [Hypothesis](https://hypothesis.readthedocs.io/) (Python PBT framework)
+
+**Configuration**:
+- Minimum 100 iterations per property test
+- Each test tagged with: `Feature: iac-reverse-engineering, Property {number}: {property_text}`
+- Custom strategies for generating:
+  - `DiscoveredResource` instances with valid and edge-case attributes across all platform types
+  - Resources with varying `CpuArchitecture` values (AMD64, ARM, AArch64)
+  - Dependency graphs (both acyclic and cyclic)
+  - Scan profiles for all on-premises providers (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows)
+  - Pairs of scan results for diff testing
+  - Authentik configuration resources
+  - Windows-specific resources (services, IIS sites, scheduled tasks, Hyper-V VMs)
+
+**Property test coverage** (referencing design properties):
+- Properties 1–5: Scanner behavior properties
+- Properties 6–10: Dependency resolution and reference properties
+- Properties 11–15: Code generation properties
+- Properties 16–17: State building properties
+- Properties 18–21: Multi-provider, configuration, and filtering properties
+- Properties 22–26: Incremental scan and validation properties
+
+### Integration Tests
+
+Integration tests verify end-to-end behavior with mocked platform APIs:
+
+- Full pipeline: scan → resolve → generate → build state → validate
+- Multi-provider merge with realistic resource structures from different platform types
+- Terraform validation (requires terraform binary)
+- Incremental scan with stored snapshots
+- Error recovery: connection loss mid-scan, terraform validation failures
+- Authentik SSO flow (mocked Authentik instance)
+- Architecture-aware code generation (mixed AMD64/AArch64 environments)
+- Platform-specific discovery patterns (container vs storage vs HCI vs Windows)
+- Windows discovery via mocked WinRM (services, IIS, scheduled tasks, Hyper-V)
+
+### Test Organization
+
+```
+tests/
+├── unit/
+│   ├── test_identifier_sanitization.py
+│   ├── test_hcl_templates.py
+│   ├── test_state_format.py
+│   ├── test_config_validation.py
+│   ├── test_architecture_detection.py
+│   ├── test_platform_category.py
+│   └── test_windows_resource_parsing.py
+├── property/
+│   ├── test_scanner_properties.py
+│   ├── test_dependency_resolver_properties.py
+│   ├── test_code_generator_properties.py
+│   ├── test_state_builder_properties.py
+│   ├── test_incremental_scan_properties.py
+│   └── strategies.py  # Custom Hypothesis strategies
+└── integration/
+    ├── test_full_pipeline.py
+    ├── test_multi_provider.py
+    ├── test_terraform_validation.py
+    ├── test_authentik_sso.py
+    └── mocks/
+        ├── docker_swarm_mock.py
+        ├── kubernetes_mock.py
+        ├── synology_mock.py
+        ├── harvester_mock.py
+        ├── bare_metal_mock.py
+        ├── windows_mock.py
+        └── authentik_mock.py
+```
diff --git a/.kiro/specs/iac-reverse-engineering/tasks.md b/.kiro/specs/iac-reverse-engineering/tasks.md
new file mode 100644
index 0000000..5382e49
--- /dev/null
+++ b/.kiro/specs/iac-reverse-engineering/tasks.md
@@ -0,0 +1,335 @@
+# Implementation Plan: IaC Reverse Engineering
+
+## Overview
+
+Build a Python CLI tool that reverse-engineers existing on-premises infrastructure into Terraform HCL code and state files. The tool follows a pipeline architecture (Scanner → Dependency Resolver → Code Generator → State Builder → Validator) with a provider plugin system for each on-premises platform (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik).
+
+## Tasks
+
+- [ ] 1. Set up project structure and core data models
+  - [ ] 1.1 Create project directory structure, pyproject.toml, and install dependencies
+    - Create `src/iac_reverse/` package with `__init__.py`
+    - Create subdirectories: `scanner/`, `resolver/`, `generator/`, `state_builder/`, `validator/`, `incremental/`, `auth/`, `cli/`
+    - Set up `pyproject.toml` with dependencies: kubernetes, docker, pywinrm, hypothesis, pytest, click, jinja2, networkx, pyyaml, python-synology
+    - Create `tests/` directory with `unit/`, `property/`, `integration/` subdirectories
+    - _Requirements: 1.1, 5.1, 5.2_
+
+  - [ ] 1.2 Define core enums, data classes, and interfaces
+    - Implement `ProviderType` enum (docker_swarm, kubernetes, synology, harvester, bare_metal, windows)
+    - Implement `PlatformCategory` enum (container_orchestration, storage_appliance, hci, bare_metal, windows) and `PROVIDER_PLATFORM_MAP`
+    - Implement `CpuArchitecture` enum (amd64, arm, aarch64)
+    - Implement `ScanProfile`, `DiscoveredResource`, `ScanResult`, `ScanProgress` dataclasses
+    - Implement `ResourceRelationship`, `DependencyGraph`, `UnresolvedReference` dataclasses
+    - Implement `GeneratedFile`, `ExtractedVariable`, `CodeGenerationResult` dataclasses
+    - Implement `StateEntry`, `StateFile` dataclasses
+    - Implement `ValidationResult`, `PlannedChange`, `ValidationError` dataclasses
+    - Implement `ChangeType` enum and `ResourceChange`, `ChangeSummary` dataclasses
+    - Define `ProviderPlugin` abstract base class with all abstract methods
+    - _Requirements: 1.1, 1.2, 2.1, 3.1, 4.1, 5.1, 5.2, 8.1_
+
+  - [ ] 1.3 Implement ScanProfile validation logic
+    - Validate mandatory fields: provider type and non-empty credentials
+    - Validate optional fields: resource_type_filters max 200 entries, endpoints list
+    - Validate resource types against provider's supported types
+    - Return all validation errors in a single response
+    - _Requirements: 6.1, 6.6, 6.7_
+
+  - [ ]* 1.4 Write property test for scan profile validation (Property 20)
+    - **Property 20: Scan profile validation completeness**
+    - **Validates: Requirements 6.1, 6.6, 6.7**
+
+- [ ] 2. Implement Scanner core and provider plugin system
+  - [ ] 2.1 Implement Scanner orchestrator with progress reporting and error handling
+    - Create `Scanner` class that accepts a `ScanProfile` and orchestrates discovery
+    - Implement connection timeout (30 seconds) and authentication error handling with descriptive messages
+    - Implement progress callback invocation per resource type completion
+    - Implement retry logic: up to 3 retries with exponential backoff for transient errors
+    - Implement partial inventory return on connection loss
+    - Implement warning logging for unsupported resource types while continuing scan
+    - _Requirements: 1.1, 1.3, 1.4, 1.5, 1.6, 1.7_
+
+  - [ ]* 2.2 Write property tests for Scanner behavior (Properties 2, 3, 4, 5)
+    - **Property 2: Authentication error descriptiveness**
+    - **Property 3: Graceful degradation on unsupported resource types**
+    - **Property 4: Progress reporting frequency**
+    - **Property 5: Partial inventory preservation on failure**
+    - **Validates: Requirements 1.3, 1.4, 1.5, 1.7**
+
+  - [ ] 2.3 Implement Docker Swarm provider plugin
+    - Implement `DockerSwarmPlugin` using docker-sdk-python
+    - Discover services, networks, volumes, configs, secrets (metadata only)
+    - Detect architecture from node info
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.4 Implement Kubernetes provider plugin
+    - Implement `KubernetesPlugin` using kubernetes-client
+    - Discover deployments, services, ingresses, config maps, persistent volumes, namespaces
+    - Detect architecture from node labels
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.5 Implement Synology provider plugin
+    - Implement `SynologyPlugin` using Synology DSM API
+    - Discover shared folders, volumes, storage pools, replication tasks, users
+    - Detect architecture from system info (ARM vs AMD64)
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.6 Implement Harvester provider plugin
+    - Implement `HarvesterPlugin` using Harvester/K8s-based API
+    - Discover VMs, volumes, images, networks (HCI combined resources)
+    - Detect architecture from node info
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.7 Implement Bare Metal provider plugin
+    - Implement `BareMetalPlugin` using IPMI/Redfish API
+    - Discover hardware inventory, BMC configs, network interfaces, RAID configurations
+    - Detect architecture from system hardware info
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.8 Implement Windows provider plugin
+    - Implement `WindowsDiscoveryPlugin` using pywinrm library
+    - Authenticate via WinRM using NTLM or Kerberos (configurable transport, port, SSL)
+    - Discover Windows services, scheduled tasks, IIS sites, IIS app pools, network adapters, firewall rules, installed software, Windows features, Hyper-V VMs, Hyper-V switches, DNS records, local users, local groups
+    - Detect CPU architecture via WMI Win32_Processor query
+    - Discover Hyper-V resources only if the Hyper-V role is installed; skip gracefully otherwise
+    - Handle WinRM-specific errors: WinRM not enabled, WMI query failure, insufficient privileges
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ] 2.9 Implement Authentik integration (SSO + discovery plugin)
+    - Implement `AuthentikAuthProvider` for OAuth2/OIDC SSO flow (authenticate, refresh, validate)
+    - Implement `AuthentikDiscoveryPlugin` conforming to `ProviderPlugin`
+    - Discover flows, stages, providers, applications, outposts, property mappings, certificates, groups, sources
+    - _Requirements: 1.1, 1.2, 5.2_
+
+  - [ ]* 2.10 Write property test for resource inventory completeness (Property 1)
+    - **Property 1: Resource inventory completeness**
+    - **Validates: Requirements 1.2**
+
+- [ ] 3. Checkpoint - Ensure all tests pass
+  - Ensure all tests pass, ask the user if questions arise.
+
+- [ ] 4. Implement Dependency Resolver
+  - [ ] 4.1 Implement dependency resolution and graph building
+    - Create `DependencyResolver` class
+    - Analyze resource `raw_references` to identify parent-child, reference, and dependency relationships
+    - Build dependency graph using networkx
+    - Produce topological ordering of resources
+    - Represent relationships as explicit Terraform references (not hardcoded IDs)
+    - _Requirements: 2.1, 2.2, 2.4_
+
+  - [ ] 4.2 Implement cycle detection and resolution suggestions
+    - Detect circular dependencies in the graph
+    - Report cycles listing all involved resources
+    - Suggest resolution strategies (which relationship to break, data source lookup alternatives)
+    - _Requirements: 2.3_
+
+  - [ ] 4.3 Implement unresolved reference handling
+    - Identify references to IDs not in the current inventory
+    - Log warnings for unresolved references
+    - Represent unresolved references as data source lookups or variables in output
+    - _Requirements: 2.5_
+
+  - [ ]* 4.4 Write property tests for Dependency Resolver (Properties 6, 7, 8, 9)
+    - **Property 6: Dependency relationship identification**
+    - **Property 7: Cycle detection correctness**
+    - **Property 8: Topological order validity**
+    - **Property 9: Unresolved references become data sources or variables**
+    - **Validates: Requirements 2.1, 2.3, 2.4, 2.5**
+
+- [ ] 5. Implement Code Generator
+  - [ ] 5.1 Implement HCL code generation with Jinja2 templates
+    - Create `CodeGenerator` class
+    - Create Jinja2 templates for Terraform resource blocks per provider/resource type
+    - Generate syntactically valid HCL files from dependency graph
+    - Organize output: one `.tf` file per resource type
+    - Include traceability comments with original resource unique_id
+    - Use Terraform resource references for inter-resource dependencies (not hardcoded IDs)
+    - Generate architecture-specific tags/labels on resources
+    - _Requirements: 3.1, 3.2, 3.5, 3.6_
+
+  - [ ] 5.2 Implement identifier sanitization
+    - Create `sanitize_identifier()` function
+    - Convert resource names to valid Terraform identifiers: `^[a-zA-Z_][a-zA-Z0-9_]*$`
+    - Handle special characters, unicode, leading digits, spaces by replacing with underscores
+    - Ensure non-empty output for any input
+    - _Requirements: 3.4_
+
+  - [ ] 5.3 Implement variable extraction logic
+    - Identify attribute values appearing in 2+ resources
+    - Extract shared values into `variables.tf` with defaults set to most common value
+    - Generate variable declarations with type expressions and descriptions
+    - _Requirements: 3.3_
+
+  - [ ] 5.4 Implement provider configuration block generation
+    - Generate separate provider blocks for each distinct provider used
+    - Include platform-specific configuration (endpoints, certificate settings)
+    - _Requirements: 5.4_
+
+  - [ ] 5.5 Implement multi-provider resource merging with conflict resolution
+    - Merge resources from multiple scan profiles into unified inventory
+    - Resolve naming conflicts by prefixing with provider identifier
+    - Preserve provider-specific attributes
+    - _Requirements: 5.3_
+
+  - [ ]* 5.6 Write property tests for Code Generator (Properties 10, 11, 12, 13, 14, 15)
+    - **Property 10: References in generated output use Terraform syntax**
+    - **Property 11: Generated HCL syntactic validity**
+    - **Property 12: File organization by resource type**
+    - **Property 13: Variable extraction for shared values**
+    - **Property 14: Identifier sanitization validity**
+    - **Property 15: Traceability comments in generated code**
+    - **Validates: Requirements 2.2, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6**
+
+- [ ] 6. Implement State Builder
+  - [ ] 6.1 Implement Terraform state file generation (format v4)
+    - Create `StateBuilder` class
+    - Generate state JSON with version=4, unique UUID lineage, serial number
+    - Create state entries binding each resource block to its live infrastructure ID
+    - Populate full attribute sets from discovery data
+    - Set schema_version matching provider version from scan profile
+    - Mark sensitive attributes per provider schema
+    - Include dependency references in state entries
+    - _Requirements: 4.1, 4.2, 4.4, 4.5_
+
+  - [ ] 6.2 Implement unmapped resource handling in state builder
+    - Log warnings for resources that cannot be mapped to state entries
+    - Handle missing provider-assigned resource identifiers
+    - Exclude unmapped resources from state file
+    - _Requirements: 4.3, 4.6_
+
+  - [ ]* 6.3 Write property tests for State Builder (Properties 16, 17)
+    - **Property 16: State file structural validity**
+    - **Property 17: State entry completeness and schema correctness**
+    - **Validates: Requirements 4.1, 4.2, 4.4, 4.5**
+
+- [ ] 7. Checkpoint - Ensure all tests pass
+  - Ensure all tests pass, ask the user if questions arise.
+
+- [ ] 8. Implement Validator
+  - [ ] 8.1 Implement Terraform validation runner
+    - Create `Validator` class
+    - Run `terraform init` and `terraform validate` against generated output
+    - Run `terraform plan` and check for zero planned changes
+    - Report validation errors with file name and error description
+    - Report drift: list each resource with planned change type (add, modify, destroy)
+    - Handle missing Terraform binary with descriptive error
+    - _Requirements: 7.1, 7.2, 7.3, 7.5_
+
+  - [ ] 8.2 Implement auto-correction loop for validation errors
+    - Attempt to correct validation errors (up to 3 attempts)
+    - Re-validate after each correction
+    - Report failure with remaining error details if corrections exhausted
+    - _Requirements: 7.4_
+
+  - [ ]* 8.3 Write property test for drift report correctness (Property 22)
+    - **Property 22: Drift report correctness**
+    - **Validates: Requirements 7.3**
+
+- [ ] 9. Implement Incremental Scan Engine
+  - [ ] 9.1 Implement scan snapshot storage and retrieval
+    - Store scan results as timestamped JSON in `.iac-reverse/snapshots/`
+    - Use profile_hash for matching scans to profiles
+    - Retain at least 2 most recent snapshots per profile
+    - Load previous snapshot for comparison
+    - _Requirements: 8.4, 8.6_
+
+  - [ ] 9.2 Implement change detection and classification
+    - Compare current scan against previous snapshot
+    - Classify resources as added, removed, or modified
+    - Produce change summary with counts and resource details
+    - Handle first scan (no previous) as full initial scan
+    - _Requirements: 8.1, 8.4, 8.5_
+
+  - [ ] 9.3 Implement incremental code and state updates
+    - Update only IaC files containing changed resources (not full regeneration)
+    - Remove resource blocks and state entries for removed resources
+    - Add/update blocks for added/modified resources
+    - _Requirements: 8.2, 8.3_
+
+  - [ ]* 9.4 Write property tests for Incremental Scan (Properties 23, 24, 25, 26)
+    - **Property 23: Change classification correctness**
+    - **Property 24: Incremental update scope**
+    - **Property 25: Removed resource exclusion**
+    - **Property 26: Snapshot retention**
+    - **Validates: Requirements 8.1, 8.2, 8.3, 8.5, 8.6**
+
+- [ ] 10. Implement CLI and wire pipeline together
+  - [ ] 10.1 Implement CLI entry point with Click
+    - Create `cli.py` with Click command group
+    - Implement `scan` command accepting scan profile YAML path
+    - Implement `generate` command to run full pipeline (scan → resolve → generate → state → validate)
+    - Implement `diff` command for incremental scanning
+    - Implement `validate` command for standalone validation
+    - Implement `login` command for Authentik SSO authentication
+    - Wire all pipeline components together in correct order
+    - Add progress bars and formatted output for scan progress
+    - _Requirements: 1.1, 1.5, 6.1, 6.2, 6.3, 6.4, 6.5_
+
+  - [ ] 10.2 Implement scan profile YAML loading and environment variable expansion
+    - Parse YAML scan profiles
+    - Expand `${ENV_VAR}` references in credential fields
+    - Support multi-profile YAML for multi-provider scans
+    - _Requirements: 6.1, 5.3_
+
+  - [ ]* 10.3 Write property tests for multi-provider and filtering (Properties 18, 19, 20, 21)
+    - **Property 18: Multi-provider merge with naming conflict resolution**
+    - **Property 19: Provider block generation**
+    - **Property 20: Scan profile validation completeness** (additional coverage)
+    - **Property 21: Filtering correctness**
+    - **Validates: Requirements 5.3, 5.4, 6.1, 6.2, 6.4, 6.6, 6.7**
+
+- [ ] 11. Implement resource type filter and multi-provider failure handling
+  - [ ] 11.1 Implement resource type filtering in scanner
+    - When filters specified, discover only listed resource types
+    - When no filters specified, discover all supported types for provider
+    - _Requirements: 6.2, 6.3_
+
+  - [ ] 11.2 Implement multi-provider partial failure handling
+    - Complete scanning for all remaining providers when one fails
+    - Include successfully discovered resources in inventory
+    - Report which providers failed with error details
+    - _Requirements: 5.5_
+
+- [ ] 12. Final checkpoint - Ensure all tests pass
+  - Ensure all tests pass, ask the user if questions arise.
+
+## Notes
+
+- Tasks marked with `*` are optional and can be skipped for a faster MVP
+- Each task references specific requirements for traceability
+- Checkpoints ensure incremental validation
+- Property tests validate universal correctness properties from the design document
+- Unit tests validate specific examples and edge cases
+- The tool is Python-based using Hypothesis for property-based testing
+- All provider plugins conform to the `ProviderPlugin` abstract interface
+- Pipeline architecture ensures each component is independently testable
+- Providers: Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik
+- Platform categories: Container Orchestration, Storage Appliance, HCI, Bare Metal, Windows (no Hypervisor category)
+- Windows discovery uses pywinrm/WMI for services, IIS, scheduled tasks, Hyper-V, and more
+
+## Task Dependency Graph
+
+```json
+{
+  "waves": [
+    { "id": 0, "tasks": ["1.1"] },
+    { "id": 1, "tasks": ["1.2"] },
+    { "id": 2, "tasks": ["1.3", "2.1"] },
+    { "id": 3, "tasks": ["1.4", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8", "2.9"] },
+    { "id": 4, "tasks": ["2.2", "2.10"] },
+    { "id": 5, "tasks": ["4.1"] },
+    { "id": 6, "tasks": ["4.2", "4.3"] },
+    { "id": 7, "tasks": ["4.4", "5.1", "5.2"] },
+    { "id": 8, "tasks": ["5.3", "5.4", "5.5"] },
+    { "id": 9, "tasks": ["5.6", "6.1"] },
+    { "id": 10, "tasks": ["6.2"] },
+    { "id": 11, "tasks": ["6.3", "8.1"] },
+    { "id": 12, "tasks": ["8.2"] },
+    { "id": 13, "tasks": ["8.3", "9.1"] },
+    { "id": 14, "tasks": ["9.2"] },
+    { "id": 15, "tasks": ["9.3"] },
+    { "id": 16, "tasks": ["9.4", "10.1", "11.1", "11.2"] },
+    { "id": 17, "tasks": ["10.2"] },
+    { "id": 18, "tasks": ["10.3"] }
+  ]
+}
+```