2026-05-22 00:19:30 -04:00

Implementation Plan: IaC Reverse Engineering

Overview

Build a Python CLI tool that reverse-engineers existing on-premises infrastructure into Terraform HCL code and state files. The tool follows a pipeline architecture (Scanner → Dependency Resolver → Code Generator → State Builder → Validator) with a provider plugin system for each on-premises platform (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik).

Tasks

  • 1. Set up project structure and core data models

    • 1.1 Create project directory structure, pyproject.toml, and install dependencies

      • Create src/iac_reverse/ package with __init__.py
      • Create subdirectories: scanner/, resolver/, generator/, state_builder/, validator/, incremental/, auth/, cli/
      • Set up pyproject.toml with dependencies: kubernetes, docker, pywinrm, hypothesis, pytest, click, jinja2, networkx, pyyaml, python-synology
      • Create tests/ directory with unit/, property/, integration/ subdirectories
      • Requirements: 1.1, 5.1, 5.2
    • 1.2 Define core enums, data classes, and interfaces

      • Implement ProviderType enum (docker_swarm, kubernetes, synology, harvester, bare_metal, windows)
      • Implement PlatformCategory enum (container_orchestration, storage_appliance, hci, bare_metal, windows) and PROVIDER_PLATFORM_MAP
      • Implement CpuArchitecture enum (amd64, arm, aarch64)
      • Implement ScanProfile, DiscoveredResource, ScanResult, ScanProgress dataclasses
      • Implement ResourceRelationship, DependencyGraph, UnresolvedReference dataclasses
      • Implement GeneratedFile, ExtractedVariable, CodeGenerationResult dataclasses
      • Implement StateEntry, StateFile dataclasses
      • Implement ValidationResult, PlannedChange, ValidationError dataclasses
      • Implement ChangeType enum and ResourceChange, ChangeSummary dataclasses
      • Define ProviderPlugin abstract base class with all abstract methods
      • Requirements: 1.1, 1.2, 2.1, 3.1, 4.1, 5.1, 5.2, 8.1
    • 1.3 Implement ScanProfile validation logic

      • Validate mandatory fields: provider type and non-empty credentials
      • Validate optional fields: resource_type_filters max 200 entries, endpoints list
      • Validate resource types against provider's supported types
      • Return all validation errors in a single response
      • Requirements: 6.1, 6.6, 6.7
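The validation rules in task 1.3 could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the `SUPPORTED_TYPES` table, the field layout of `ScanProfile`, and the error wording are all assumptions made for the example. The point it shows is that every check appends to one list, so the caller receives all errors in a single response.

```python
from dataclasses import dataclass, field

# Hypothetical supported-type table; in the real tool each ProviderPlugin
# would report its own supported resource types.
SUPPORTED_TYPES = {
    "kubernetes": {"deployment", "service", "ingress"},
    "docker_swarm": {"service", "network", "volume"},
}
MAX_FILTERS = 200  # limit stated in task 1.3


@dataclass
class ScanProfile:
    # Field names are assumptions for this sketch.
    provider: str
    credentials: dict
    resource_type_filters: list = field(default_factory=list)
    endpoints: list = field(default_factory=list)


def validate_profile(profile: ScanProfile) -> list[str]:
    """Collect every validation error instead of stopping at the first one."""
    errors = []
    supported = SUPPORTED_TYPES.get(profile.provider)
    if supported is None:
        errors.append(f"unknown provider type: {profile.provider!r}")
    if not profile.credentials:
        errors.append("credentials must not be empty")
    count = len(profile.resource_type_filters)
    if count > MAX_FILTERS:
        errors.append(f"resource_type_filters has {count} entries; maximum is {MAX_FILTERS}")
    if supported is not None:
        for rtype in profile.resource_type_filters:
            if rtype not in supported:
                errors.append(f"resource type {rtype!r} is not supported by {profile.provider}")
    return errors
```

A profile with empty credentials and one unsupported filter would come back with two errors at once rather than failing on the first.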
    • 1.4 Write property test for scan profile validation (Property 20)

      • Property 20: Scan profile validation completeness
      • Validates: Requirements 6.1, 6.6, 6.7
  • 2. Implement Scanner core and provider plugin system

    • 2.1 Implement Scanner orchestrator with progress reporting and error handling

      • Create Scanner class that accepts a ScanProfile and orchestrates discovery
      • Implement connection timeout (30 seconds) and authentication error handling with descriptive messages
      • Implement progress callback invocation per resource type completion
      • Implement retry logic: up to 3 retries with exponential backoff for transient errors
      • Implement partial inventory return on connection loss
      • Implement warning logging for unsupported resource types while continuing scan
      • Requirements: 1.1, 1.3, 1.4, 1.5, 1.6, 1.7
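The retry rule in task 2.1 (up to 3 retries with exponential backoff) might look like the sketch below. `TransientError`, `call_with_retry`, and the 1-second base delay are hypothetical names and values chosen for illustration; the plan does not specify them. The `sleep` parameter is injectable so the behavior can be tested without waiting.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying, e.g. a dropped connection."""


def call_with_retry(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on TransientError retry up to max_retries times.

    The delay doubles after each failed attempt:
    base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller handle it
            sleep(base_delay * (2 ** attempt))
```

With the defaults, a call that keeps failing is attempted four times in total (one initial call plus three retries) before the error propagates, which is where the Scanner would fall back to returning its partial inventory.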
    • 2.2 Write property tests for Scanner behavior (Properties 2, 3, 4, 5)

      • Property 2: Authentication error descriptiveness
      • Property 3: Graceful degradation on unsupported resource types
      • Property 4: Progress reporting frequency
      • Property 5: Partial inventory preservation on failure
      • Validates: Requirements 1.3, 1.4, 1.5, 1.7
    • 2.3 Implement Docker Swarm provider plugin

      • Implement DockerSwarmPlugin using the Docker SDK for Python (the docker package)
      • Discover services, networks, volumes, configs, secrets (metadata only)
      • Detect architecture from node info
      • Requirements: 1.1, 1.2, 5.2
    • 2.4 Implement Kubernetes provider plugin

      • Implement KubernetesPlugin using the official Kubernetes Python client (the kubernetes package)
      • Discover deployments, services, ingresses, config maps, persistent volumes, namespaces
      • Detect architecture from node labels
      • Requirements: 1.1, 1.2, 5.2
    • 2.5 Implement Synology provider plugin

      • Implement SynologyPlugin using Synology DSM API
      • Discover shared folders, volumes, storage pools, replication tasks, users
      • Detect architecture from system info (ARM vs AMD64)
      • Requirements: 1.1, 1.2, 5.2
    • 2.6 Implement Harvester provider plugin

      • Implement HarvesterPlugin using Harvester/K8s-based API
      • Discover VMs, volumes, images, networks (HCI combined resources)
      • Detect architecture from node info
      • Requirements: 1.1, 1.2, 5.2
    • 2.7 Implement Bare Metal provider plugin

      • Implement BareMetalPlugin using IPMI/Redfish API
      • Discover hardware inventory, BMC configs, network interfaces, RAID configurations
      • Detect architecture from system hardware info
      • Requirements: 1.1, 1.2, 5.2
    • 2.8 Implement Windows provider plugin

      • Implement WindowsDiscoveryPlugin using pywinrm library
      • Authenticate via WinRM using NTLM or Kerberos (configurable transport, port, SSL)
      • Discover Windows services, scheduled tasks, IIS sites, IIS app pools, network adapters, firewall rules, installed software, Windows features, Hyper-V VMs, Hyper-V switches, DNS records, local users, local groups
      • Detect CPU architecture via WMI Win32_Processor query
      • Discover Hyper-V resources only if the Hyper-V role is installed; skip gracefully otherwise
      • Handle WinRM-specific errors: WinRM not enabled, WMI query failure, insufficient privileges
      • Requirements: 1.1, 1.2, 5.2
    • 2.9 Implement Authentik integration (SSO + discovery plugin)

      • Implement AuthentikAuthProvider for OAuth2/OIDC SSO flow (authenticate, refresh, validate)
      • Implement AuthentikDiscoveryPlugin conforming to ProviderPlugin
      • Discover flows, stages, providers, applications, outposts, property mappings, certificates, groups, sources
      • Requirements: 1.1, 1.2, 5.2
    • 2.10 Write property test for resource inventory completeness (Property 1)

      • Property 1: Resource inventory completeness
      • Validates: Requirements 1.2
  • 3. Checkpoint - Ensure all tests pass

    • Ensure all tests pass; ask the user if questions arise.
  • 4. Implement Dependency Resolver

    • 4.1 Implement dependency resolution and graph building

      • Create DependencyResolver class
      • Analyze resource raw_references to identify parent-child, reference, and dependency relationships
      • Build dependency graph using networkx
      • Produce topological ordering of resources
      • Represent relationships as explicit Terraform references (not hardcoded IDs)
      • Requirements: 2.1, 2.2, 2.4
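The topological ordering step in task 4.1 can be illustrated with a small sketch. The plan calls for networkx; this example swaps in the standard library's `graphlib.TopologicalSorter` purely so that it runs without third-party packages. The resource IDs and the shape of `raw_references` (resource ID mapped to the IDs it references) are assumptions for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: resource id -> ids it references.
raw_references = {
    "service.web": ["network.frontend", "volume.data"],
    "network.frontend": [],
    "volume.data": [],
    "ingress.web": ["service.web"],
}


def creation_order(refs):
    """Return resource ids so that every dependency precedes its dependents."""
    # TopologicalSorter expects a mapping of node -> predecessors,
    # which matches "resource -> things it depends on".
    sorter = TopologicalSorter(refs)
    return list(sorter.static_order())


order = creation_order(raw_references)
```

With networkx, the equivalent would be a `DiGraph` with one edge per dependency (pointing from the dependency to the dependent) passed to `networkx.topological_sort`.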
    • 4.2 Implement cycle detection and resolution suggestions

      • Detect circular dependencies in the graph
      • Report cycles listing all involved resources
      • Suggest resolution strategies (which relationship to break, data source lookup alternatives)
      • Requirements: 2.3
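Cycle detection for task 4.2 could be approached with a depth-first search that tracks the current path. This is a sketch under assumptions: `find_cycle` is a hypothetical helper, the input is the same "resource ID to referenced IDs" mapping used for ordering, and it reports only the first cycle it finds. It is recursive, so a very deep graph would need an iterative variant.

```python
def find_cycle(refs):
    """Return one dependency cycle as a list of ids (first == last), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in refs}
    stack = []  # ids on the current DFS path, in visit order

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in refs.get(node, []):
            if dep not in refs:
                continue  # unresolved reference; handled elsewhere (task 4.3)
            if color[dep] == GRAY:
                # dep is already on the current path, so the path from dep
                # to node plus the edge back to dep forms a cycle.
                return stack[stack.index(dep):] + [dep]
            if color[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in refs:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

The returned list names every resource involved in the cycle, which is the information the report and the resolution suggestions need.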
    • 4.3 Implement unresolved reference handling

      • Identify references to IDs not in the current inventory
      • Log warnings for unresolved references
      • Represent unresolved references as data source lookups or variables in output
      • Requirements: 2.5
    • 4.4 Write property tests for Dependency Resolver (Properties 6, 7, 8, 9)

      • Property 6: Dependency relationship identification
      • Property 7: Cycle detection correctness
      • Property 8: Topological order validity
      • Property 9: Unresolved references become data sources or variables
      • Validates: Requirements 2.1, 2.3, 2.4, 2.5
  • 5. Implement Code Generator

    • 5.1 Implement HCL code generation with Jinja2 templates

      • Create CodeGenerator class
      • Create Jinja2 templates for Terraform resource blocks per provider/resource type
      • Generate syntactically valid HCL files from dependency graph
      • Organize output: one .tf file per resource type
      • Include traceability comments with original resource unique_id
      • Use Terraform resource references for inter-resource dependencies (not hardcoded IDs)
      • Generate architecture-specific tags/labels on resources
      • Requirements: 3.1, 3.2, 3.5, 3.6
    • 5.2 Implement identifier sanitization

      • Create sanitize_identifier() function
      • Convert resource names to valid Terraform identifiers: ^[a-zA-Z_][a-zA-Z0-9_]*$
      • Handle special characters, Unicode, leading digits, and spaces by replacing them with underscores
      • Ensure non-empty output for any input
      • Requirements: 3.4
    • 5.3 Implement variable extraction logic

      • Identify attribute values appearing in 2+ resources
      • Extract shared values into variables.tf with defaults set to most common value
      • Generate variable declarations with type expressions and descriptions
      • Requirements: 3.3
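The shared-value rule in task 5.3 might be sketched as follows. The input shape (a list of attribute dicts) and the helper name `extract_variables` are assumptions, and the sketch only considers string attributes to keep the counting simple. An attribute becomes a variable when one of its values appears in at least two resources, and the most common value becomes the default.

```python
from collections import Counter, defaultdict


def extract_variables(resources):
    """Return {attribute_name: default_value} for values shared by 2+ resources."""
    values_by_attr = defaultdict(Counter)
    for attrs in resources:
        for key, value in attrs.items():
            if isinstance(value, str):  # sketch limitation: strings only
                values_by_attr[key][value] += 1

    variables = {}
    for key, counts in values_by_attr.items():
        value, count = counts.most_common(1)[0]
        if count >= 2:
            variables[key] = value  # most common value becomes the default
    return variables
```

Given three resources where two share `region = "dc1"` and all three have different images, only `region` would be extracted, with `"dc1"` as its default.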
    • 5.4 Implement provider configuration block generation

      • Generate separate provider blocks for each distinct provider used
      • Include platform-specific configuration (endpoints, certificate settings)
      • Requirements: 5.4
    • 5.5 Implement multi-provider resource merging with conflict resolution

      • Merge resources from multiple scan profiles into unified inventory
      • Resolve naming conflicts by prefixing with provider identifier
      • Preserve provider-specific attributes
      • Requirements: 5.3
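One possible reading of the merge rule in task 5.5, shown as a sketch: a name is prefixed with its provider identifier only when more than one provider uses it. The dict-based resource shape and the `provider` key added to each merged entry are assumptions for the example.

```python
from collections import defaultdict


def merge_inventories(inventories):
    """Merge {provider_id: [resource dicts with a 'name']} into one list."""
    # First pass: record which providers use each name.
    owners = defaultdict(set)
    for provider, resources in inventories.items():
        for res in resources:
            owners[res["name"]].add(provider)

    # Second pass: prefix only the names that collide across providers.
    merged = []
    for provider, resources in inventories.items():
        for res in resources:
            name = res["name"]
            if len(owners[name]) > 1:
                name = f"{provider}_{name}"
            # Copy all original attributes so provider-specific ones survive.
            merged.append({**res, "name": name, "provider": provider})
    return merged
```

Names that are unique across providers are left unchanged, which keeps the generated identifiers short in the common case.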
    • 5.6 Write property tests for Code Generator (Properties 10, 11, 12, 13, 14, 15)

      • Property 10: References in generated output use Terraform syntax
      • Property 11: Generated HCL syntactic validity
      • Property 12: File organization by resource type
      • Property 13: Variable extraction for shared values
      • Property 14: Identifier sanitization validity
      • Property 15: Traceability comments in generated code
      • Validates: Requirements 2.2, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6
  • 6. Implement State Builder

    • 6.1 Implement Terraform state file generation (format v4)

      • Create StateBuilder class
      • Generate state JSON with version=4, unique UUID lineage, serial number
      • Create state entries binding each resource block to its live infrastructure ID
      • Populate full attribute sets from discovery data
      • Set schema_version matching provider version from scan profile
      • Mark sensitive attributes per provider schema
      • Include dependency references in state entries
      • Requirements: 4.1, 4.2, 4.4, 4.5
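The shape of the state document in task 6.1 could look roughly like this sketch. The field names follow the layout commonly seen in version 4 `terraform.tfstate` files, but that format is internal to Terraform, so it should be checked against a state file written by the Terraform version in use. The `build_state` helper, the entry dict keys, and the default `terraform_version` string are assumptions for illustration.

```python
import uuid


def build_state(entries, terraform_version="1.7.0", serial=1):
    """Build a state dict from entries; each entry describes one resource."""
    return {
        "version": 4,
        "terraform_version": terraform_version,  # placeholder value
        "serial": serial,
        "lineage": str(uuid.uuid4()),  # unique per state history
        "outputs": {},
        "resources": [
            {
                "mode": "managed",
                "type": entry["type"],
                "name": entry["name"],
                "provider": entry["provider"],
                "instances": [
                    {
                        "schema_version": entry.get("schema_version", 0),
                        "attributes": entry["attributes"],
                        "sensitive_attributes": entry.get("sensitive_attributes", []),
                        "dependencies": entry.get("dependencies", []),
                    }
                ],
            }
            for entry in entries
        ],
    }
```

The result can be written with `json.dumps(state, indent=2)`. The `provider` value is expected to be a full provider address string such as `provider["registry.terraform.io/hashicorp/kubernetes"]`.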
    • 6.2 Implement unmapped resource handling in state builder

      • Log warnings for resources that cannot be mapped to state entries
      • Handle missing provider-assigned resource identifiers
      • Exclude unmapped resources from state file
      • Requirements: 4.3, 4.6
    • 6.3 Write property tests for State Builder (Properties 16, 17)

      • Property 16: State file structural validity
      • Property 17: State entry completeness and schema correctness
      • Validates: Requirements 4.1, 4.2, 4.4, 4.5
  • 7. Checkpoint - Ensure all tests pass

    • Ensure all tests pass; ask the user if questions arise.
  • 8. Implement Validator

    • 8.1 Implement Terraform validation runner

      • Create Validator class
      • Run terraform init and terraform validate against generated output
      • Run terraform plan and check for zero planned changes
      • Report validation errors with file name and error description
      • Report drift: list each resource with planned change type (add, modify, destroy)
      • Handle missing Terraform binary with descriptive error
      • Requirements: 7.1, 7.2, 7.3, 7.5
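For the drift report in task 8.1, one option is to save the plan to a file and read its JSON form via `terraform show -json`, which lists each resource under `resource_changes` with a `change.actions` array. The sketch below maps those actions onto the add/modify/destroy vocabulary used in this plan. The `drift_report` helper is hypothetical, and treating a replacement (a two-action delete and create pair) as "modify" is an assumption.

```python
# Terraform plan JSON action -> change type used in the drift report.
_ACTION_TO_CHANGE = {"create": "add", "update": "modify", "delete": "destroy"}


def drift_report(plan: dict) -> list[tuple[str, str]]:
    """Return (resource address, change type) for every planned change."""
    report = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions in (["no-op"], ["read"]):
            continue  # nothing would change for this resource
        # Two actions mean delete + create in either order (a replacement);
        # reporting that as "modify" is an assumption of this sketch.
        kind = "modify" if len(actions) == 2 else _ACTION_TO_CHANGE[actions[0]]
        report.append((rc["address"], kind))
    return report
```

An empty report corresponds to the "zero planned changes" success condition; anything else is drift between the generated code and the live infrastructure.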
    • 8.2 Implement auto-correction loop for validation errors

      • Attempt to correct validation errors (up to 3 attempts)
      • Re-validate after each correction
      • Report failure with remaining error details if corrections exhausted
      • Requirements: 7.4
    • 8.3 Write property test for drift report correctness (Property 22)

      • Property 22: Drift report correctness
      • Validates: Requirements 7.3
  • 9. Implement Incremental Scan Engine

    • 9.1 Implement scan snapshot storage and retrieval

      • Store scan results as timestamped JSON in .iac-reverse/snapshots/
      • Use profile_hash for matching scans to profiles
      • Retain at least 2 most recent snapshots per profile
      • Load previous snapshot for comparison
      • Requirements: 8.4, 8.6
    • 9.2 Implement change detection and classification

      • Compare current scan against previous snapshot
      • Classify resources as added, removed, or modified
      • Produce change summary with counts and resource details
      • Handle first scan (no previous) as full initial scan
      • Requirements: 8.1, 8.4, 8.5
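Change classification in task 9.2 reduces to set operations on resource IDs plus an attribute comparison. In this sketch the snapshot shape (`unique_id` mapped to an attribute dict) and the name `classify_changes` are assumptions. A missing previous snapshot is treated as empty, so a first scan reports every resource as added.

```python
def classify_changes(previous, current):
    """Compare two snapshots of the form {unique_id: attributes dict}."""
    previous = previous or {}  # first scan: no earlier snapshot
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    modified = sorted(
        uid for uid in current.keys() & previous.keys()
        if current[uid] != previous[uid]
    )
    return {"added": added, "removed": removed, "modified": modified}
```

The counts for the change summary follow directly from the lengths of the three lists.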
    • 9.3 Implement incremental code and state updates

      • Update only IaC files containing changed resources (not full regeneration)
      • Remove resource blocks and state entries for removed resources
      • Add/update blocks for added/modified resources
      • Requirements: 8.2, 8.3
    • 9.4 Write property tests for Incremental Scan (Properties 23, 24, 25, 26)

      • Property 23: Change classification correctness
      • Property 24: Incremental update scope
      • Property 25: Removed resource exclusion
      • Property 26: Snapshot retention
      • Validates: Requirements 8.1, 8.2, 8.3, 8.5, 8.6
  • 10. Implement CLI and wire pipeline together

    • 10.1 Implement CLI entry point with Click

      • Create cli.py with Click command group
      • Implement scan command accepting scan profile YAML path
      • Implement generate command to run full pipeline (scan → resolve → generate → state → validate)
      • Implement diff command for incremental scanning
      • Implement validate command for standalone validation
      • Implement login command for Authentik SSO authentication
      • Wire all pipeline components together in correct order
      • Add progress bars and formatted output for scan progress
      • Requirements: 1.1, 1.5, 6.1, 6.2, 6.3, 6.4, 6.5
    • 10.2 Implement scan profile YAML loading and environment variable expansion

      • Parse YAML scan profiles
      • Expand ${ENV_VAR} references in credential fields
      • Support multi-profile YAML for multi-provider scans
      • Requirements: 6.1, 5.3
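Expansion of `${ENV_VAR}` references from task 10.2 might be handled as in this sketch. The helper names are hypothetical, and raising an error for an unset variable is an assumption (the plan does not say what should happen). The standard library's `os.path.expandvars` is a possible alternative, but it leaves unknown variables in place silently, which could hide a missing credential.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${NAME} in value with env[NAME]; fail if NAME is unset."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_REF.sub(replace, value)


def expand_credentials(credentials, env=None):
    """Expand references in string fields; leave other value types untouched."""
    return {
        key: expand_env(val, env) if isinstance(val, str) else val
        for key, val in credentials.items()
    }
```

Passing `env` explicitly keeps the function easy to test; in normal use it falls back to the process environment.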
    • 10.3 Write property tests for multi-provider and filtering (Properties 18, 19, 20, 21)

      • Property 18: Multi-provider merge with naming conflict resolution
      • Property 19: Provider block generation
      • Property 20: Scan profile validation completeness (additional coverage)
      • Property 21: Filtering correctness
      • Validates: Requirements 5.3, 5.4, 6.1, 6.2, 6.4, 6.6, 6.7
  • 11. Implement resource type filter and multi-provider failure handling

    • 11.1 Implement resource type filtering in scanner

      • When filters specified, discover only listed resource types
      • When no filters specified, discover all supported types for provider
      • Requirements: 6.2, 6.3
    • 11.2 Implement multi-provider partial failure handling

      • Complete scanning for all remaining providers when one fails
      • Include successfully discovered resources in inventory
      • Report which providers failed with error details
      • Requirements: 5.5
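The partial-failure behavior in task 11.2 can be sketched as a loop that isolates each provider's scan. The `scan_all` helper and the representation of a scanner as a zero-argument callable are assumptions for illustration; catching `Exception` broadly is deliberate here so that one provider's failure never stops the others.

```python
def scan_all(scanners):
    """Run every provider scan; return (inventory, failures).

    scanners: {provider_id: zero-argument callable returning a list of resources}
    failures: {provider_id: error description} for providers that raised.
    """
    inventory = []
    failures = {}
    for provider, scan in scanners.items():
        try:
            inventory.extend(scan())
        except Exception as exc:  # broad on purpose: keep scanning the rest
            failures[provider] = f"{type(exc).__name__}: {exc}"
    return inventory, failures
```

The caller can then generate code for the resources that were discovered and report the contents of `failures` alongside the results.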
  • 12. Final checkpoint - Ensure all tests pass

    • Ensure all tests pass; ask the user if questions arise.

Notes

  • Tasks marked with * are optional and can be skipped for faster MVP
  • Each task references specific requirements for traceability
  • Checkpoints ensure incremental validation
  • Property tests validate universal correctness properties from the design document
  • Unit tests validate specific examples and edge cases
  • The tool is written in Python and uses Hypothesis for property-based testing
  • All provider plugins conform to the ProviderPlugin abstract interface
  • Pipeline architecture ensures each component is independently testable
  • Providers: Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik
  • Platform categories: Container Orchestration, Storage Appliance, HCI, Bare Metal, Windows (no Hypervisor category)
  • Windows discovery uses pywinrm and WMI to cover services, IIS, scheduled tasks, Hyper-V, and the other resource types listed in task 2.8

Task Dependency Graph

{
  "waves": [
    { "id": 0, "tasks": ["1.1"] },
    { "id": 1, "tasks": ["1.2"] },
    { "id": 2, "tasks": ["1.3", "2.1"] },
    { "id": 3, "tasks": ["1.4", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8", "2.9"] },
    { "id": 4, "tasks": ["2.2", "2.10"] },
    { "id": 5, "tasks": ["4.1"] },
    { "id": 6, "tasks": ["4.2", "4.3"] },
    { "id": 7, "tasks": ["4.4", "5.1", "5.2"] },
    { "id": 8, "tasks": ["5.3", "5.4", "5.5"] },
    { "id": 9, "tasks": ["5.6", "6.1"] },
    { "id": 10, "tasks": ["6.2"] },
    { "id": 11, "tasks": ["6.3", "8.1"] },
    { "id": 12, "tasks": ["8.2"] },
    { "id": 13, "tasks": ["8.3", "9.1"] },
    { "id": 14, "tasks": ["9.2"] },
    { "id": 15, "tasks": ["9.3"] },
    { "id": 16, "tasks": ["9.4", "10.1", "11.1", "11.2"] },
    { "id": 17, "tasks": ["10.2"] },
    { "id": 18, "tasks": ["10.3"] }
  ]
}