Implementation Plan: IaC Reverse Engineering
Overview
Build a Python CLI tool that reverse-engineers existing on-premises infrastructure into Terraform HCL code and state files. The tool follows a pipeline architecture (Scanner → Dependency Resolver → Code Generator → State Builder → Validator) with a provider plugin system for each on-premises platform (Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik).
Tasks
-
1. Set up project structure and core data models
-
1.1 Create project directory structure, pyproject.toml, and install dependencies
- Create `src/iac_reverse/` package with `__init__.py`
- Create subdirectories: `scanner/`, `resolver/`, `generator/`, `state_builder/`, `validator/`, `incremental/`, `auth/`, `cli/`
- Set up `pyproject.toml` with dependencies: kubernetes, docker, pywinrm, hypothesis, pytest, click, jinja2, networkx, pyyaml, python-synology
- Create `tests/` directory with `unit/`, `property/`, `integration/` subdirectories
- Requirements: 1.1, 5.1, 5.2
-
1.2 Define core enums, data classes, and interfaces
- Implement `ProviderType` enum (docker_swarm, kubernetes, synology, harvester, bare_metal, windows)
- Implement `PlatformCategory` enum (container_orchestration, storage_appliance, hci, bare_metal, windows) and `PROVIDER_PLATFORM_MAP`
- Implement `CpuArchitecture` enum (amd64, arm, aarch64)
- Implement `ScanProfile`, `DiscoveredResource`, `ScanResult`, `ScanProgress` dataclasses
- Implement `ResourceRelationship`, `DependencyGraph`, `UnresolvedReference` dataclasses
- Implement `GeneratedFile`, `ExtractedVariable`, `CodeGenerationResult` dataclasses
- Implement `StateEntry`, `StateFile` dataclasses
- Implement `ValidationResult`, `PlannedChange`, `ValidationError` dataclasses
- Implement `ChangeType` enum and `ResourceChange`, `ChangeSummary` dataclasses
- Define `ProviderPlugin` abstract base class with all abstract methods
- Requirements: 1.1, 1.2, 2.1, 3.1, 4.1, 5.1, 5.2, 8.1
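A minimal sketch of how a few of the models in 1.2 might look. The enum members come from the task list; the entries of `PROVIDER_PLATFORM_MAP`, the `DiscoveredResource` fields other than `unique_id` and `raw_references`, and the two abstract method names are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum


class ProviderType(Enum):
    DOCKER_SWARM = "docker_swarm"
    KUBERNETES = "kubernetes"
    SYNOLOGY = "synology"
    HARVESTER = "harvester"
    BARE_METAL = "bare_metal"
    WINDOWS = "windows"


class PlatformCategory(Enum):
    CONTAINER_ORCHESTRATION = "container_orchestration"
    STORAGE_APPLIANCE = "storage_appliance"
    HCI = "hci"
    BARE_METAL = "bare_metal"
    WINDOWS = "windows"


# Assumed mapping: the plan names the map but does not list its entries.
PROVIDER_PLATFORM_MAP = {
    ProviderType.DOCKER_SWARM: PlatformCategory.CONTAINER_ORCHESTRATION,
    ProviderType.KUBERNETES: PlatformCategory.CONTAINER_ORCHESTRATION,
    ProviderType.SYNOLOGY: PlatformCategory.STORAGE_APPLIANCE,
    ProviderType.HARVESTER: PlatformCategory.HCI,
    ProviderType.BARE_METAL: PlatformCategory.BARE_METAL,
    ProviderType.WINDOWS: PlatformCategory.WINDOWS,
}


@dataclass
class DiscoveredResource:
    unique_id: str
    resource_type: str  # hypothetical field
    name: str  # hypothetical field
    attributes: dict = field(default_factory=dict)  # hypothetical field
    raw_references: list = field(default_factory=list)


class ProviderPlugin(ABC):
    """Interface every provider plugin implements (method names are assumed)."""

    @abstractmethod
    def supported_resource_types(self) -> list:
        """Return the resource type names this provider can discover."""

    @abstractmethod
    def discover(self, resource_types: list) -> list:
        """Return DiscoveredResource objects for the requested types."""
```

Using `default_factory` for the mutable fields keeps each `DiscoveredResource` from sharing one list or dict with every other instance.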
-
1.3 Implement ScanProfile validation logic
- Validate mandatory fields: provider type and non-empty credentials
- Validate optional fields: resource_type_filters max 200 entries, endpoints list
- Validate resource types against provider's supported types
- Return all validation errors in a single response
- Requirements: 6.1, 6.6, 6.7
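The "return all validation errors in a single response" rule can be sketched as a function that collects messages instead of raising on the first problem. The `ScanProfile` field names and the error wording below are assumptions; only the 200-entry limit and the checks themselves come from the task.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_RESOURCE_TYPE_FILTERS = 200  # limit stated in task 1.3


@dataclass
class ScanProfile:
    # Field names are hypothetical; the plan only names the concepts.
    provider: Optional[str]
    credentials: dict
    resource_type_filters: list = field(default_factory=list)
    endpoints: list = field(default_factory=list)


def validate_scan_profile(profile: ScanProfile, supported_types: set) -> list:
    """Return every validation error found, or an empty list if valid."""
    errors = []
    if not profile.provider:
        errors.append("provider type is required")
    if not profile.credentials:
        errors.append("credentials must not be empty")
    if not isinstance(profile.endpoints, list):
        errors.append("endpoints must be a list")
    if len(profile.resource_type_filters) > MAX_RESOURCE_TYPE_FILTERS:
        errors.append(
            f"resource_type_filters has {len(profile.resource_type_filters)} "
            f"entries; maximum is {MAX_RESOURCE_TYPE_FILTERS}"
        )
    unsupported = sorted(set(profile.resource_type_filters) - supported_types)
    if unsupported:
        errors.append("unsupported resource types: " + ", ".join(unsupported))
    return errors
```

A caller can then reject the profile when the list is non-empty and print every message at once.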
-
1.4 Write property test for scan profile validation (Property 20)
- Property 20: Scan profile validation completeness
- Validates: Requirements 6.1, 6.6, 6.7
-
2. Implement Scanner core and provider plugin system
-
2.1 Implement Scanner orchestrator with progress reporting and error handling
- Create `Scanner` class that accepts a `ScanProfile` and orchestrates discovery
- Implement connection timeout (30 seconds) and authentication error handling with descriptive messages
- Implement progress callback invocation per resource type completion
- Implement retry logic: up to 3 retries with exponential backoff for transient errors
- Implement partial inventory return on connection loss
- Implement warning logging for unsupported resource types while continuing scan
- Requirements: 1.1, 1.3, 1.4, 1.5, 1.6, 1.7
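The retry rule in 2.1 (up to 3 retries with exponential backoff) could be sketched as below. `TransientError`, `call_with_retries`, the 1-second base delay, and the injectable `sleep` parameter are all assumptions for illustration; the plan does not specify delays or exception types.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def call_with_retries(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient failures with doubling delays."""
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_retries:
                raise  # retries exhausted; surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
            attempt += 1
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting for them.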
-
2.2 Write property tests for Scanner behavior (Properties 2, 3, 4, 5)
- Property 2: Authentication error descriptiveness
- Property 3: Graceful degradation on unsupported resource types
- Property 4: Progress reporting frequency
- Property 5: Partial inventory preservation on failure
- Validates: Requirements 1.3, 1.4, 1.5, 1.7
-
2.3 Implement Docker Swarm provider plugin
- Implement `DockerSwarmPlugin` using the Docker SDK for Python (`docker` package)
- Discover services, networks, volumes, configs, secrets (metadata only)
- Detect architecture from node info
- Requirements: 1.1, 1.2, 5.2
-
2.4 Implement Kubernetes provider plugin
- Implement `KubernetesPlugin` using kubernetes-client
- Discover deployments, services, ingresses, config maps, persistent volumes, namespaces
- Detect architecture from node labels
- Requirements: 1.1, 1.2, 5.2
-
2.5 Implement Synology provider plugin
- Implement `SynologyPlugin` using Synology DSM API
- Discover shared folders, volumes, storage pools, replication tasks, users
- Detect architecture from system info (ARM vs AMD64)
- Requirements: 1.1, 1.2, 5.2
-
2.6 Implement Harvester provider plugin
- Implement `HarvesterPlugin` using Harvester/K8s-based API
- Discover VMs, volumes, images, networks (HCI combined resources)
- Detect architecture from node info
- Requirements: 1.1, 1.2, 5.2
-
2.7 Implement Bare Metal provider plugin
- Implement `BareMetalPlugin` using IPMI/Redfish API
- Discover hardware inventory, BMC configs, network interfaces, RAID configurations
- Detect architecture from system hardware info
- Requirements: 1.1, 1.2, 5.2
-
2.8 Implement Windows provider plugin
- Implement `WindowsDiscoveryPlugin` using pywinrm library
- Authenticate via WinRM using NTLM or Kerberos (configurable transport, port, SSL)
- Discover Windows services, scheduled tasks, IIS sites, IIS app pools, network adapters, firewall rules, installed software, Windows features, Hyper-V VMs, Hyper-V switches, DNS records, local users, local groups
- Detect CPU architecture via WMI Win32_Processor query
- Discover Hyper-V resources only if the Hyper-V role is installed; skip gracefully otherwise
- Handle WinRM-specific errors: WinRM not enabled, WMI query failure, insufficient privileges
- Requirements: 1.1, 1.2, 5.2
-
2.9 Implement Authentik integration (SSO + discovery plugin)
- Implement `AuthentikAuthProvider` for OAuth2/OIDC SSO flow (authenticate, refresh, validate)
- Implement `AuthentikDiscoveryPlugin` conforming to `ProviderPlugin`
- Discover flows, stages, providers, applications, outposts, property mappings, certificates, groups, sources
- Requirements: 1.1, 1.2, 5.2
-
2.10 Write property test for resource inventory completeness (Property 1)
- Property 1: Resource inventory completeness
- Validates: Requirements 1.2
-
3. Checkpoint - Ensure all tests pass
- Ensure all tests pass, ask the user if questions arise.
-
4. Implement Dependency Resolver
-
4.1 Implement dependency resolution and graph building
- Create `DependencyResolver` class
- Analyze resource `raw_references` to identify parent-child, reference, and dependency relationships
- Build dependency graph using networkx
- Produce topological ordering of resources
- Represent relationships as explicit Terraform references (not hardcoded IDs)
- Requirements: 2.1, 2.2, 2.4
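The topological ordering step can be sketched with the standard library's `graphlib`, used here as a dependency-free stand-in for the networkx graph the plan calls for. The function name and the shape of the input (resource ID mapped to the IDs it references) are assumptions.

```python
from graphlib import TopologicalSorter


def creation_order(raw_references: dict) -> list:
    """Order resource IDs so every dependency precedes its dependents.

    raw_references maps each resource ID to the set of IDs it refers to.
    References to IDs outside the inventory are skipped here; they are
    the unresolved references that task 4.3 handles separately.
    """
    known = set(raw_references)
    sorter = TopologicalSorter()
    for resource_id, refs in raw_references.items():
        sorter.add(resource_id, *(ref for ref in refs if ref in known))
    return list(sorter.static_order())
```

With networkx the equivalent would add an edge from each dependency to its dependent and call `networkx.topological_sort` on the resulting `DiGraph`.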
-
4.2 Implement cycle detection and resolution suggestions
- Detect circular dependencies in the graph
- Report cycles listing all involved resources
- Suggest resolution strategies (which relationship to break, data source lookup alternatives)
- Requirements: 2.3
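Cycle detection can be sketched in the same stand-in style: `graphlib.TopologicalSorter.prepare()` raises `CycleError` when the graph is not acyclic and carries one offending cycle in its arguments. The function name is hypothetical, and this reports a single cycle per call, so a full implementation that lists every cycle would need something like `networkx.simple_cycles`.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Optional


def find_cycle(dependencies: dict) -> Optional[list]:
    """Return one dependency cycle as a list of resource IDs, or None.

    dependencies maps each resource ID to the set of IDs it depends on.
    The returned list starts and ends with the same ID.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        sorter.prepare()
    except CycleError as exc:
        return list(exc.args[1])  # second argument holds the cycle
    return None
```

The resources in the returned list are the candidates for the "which relationship to break" suggestion.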
-
4.3 Implement unresolved reference handling
- Identify references to IDs not in the current inventory
- Log warnings for unresolved references
- Represent unresolved references as data source lookups or variables in output
- Requirements: 2.5
-
4.4 Write property tests for Dependency Resolver (Properties 6, 7, 8, 9)
- Property 6: Dependency relationship identification
- Property 7: Cycle detection correctness
- Property 8: Topological order validity
- Property 9: Unresolved references become data sources or variables
- Validates: Requirements 2.1, 2.3, 2.4, 2.5
-
5. Implement Code Generator
-
5.1 Implement HCL code generation with Jinja2 templates
- Create `CodeGenerator` class
- Create Jinja2 templates for Terraform resource blocks per provider/resource type
- Generate syntactically valid HCL files from dependency graph
- Organize output: one `.tf` file per resource type
- Include traceability comments with original resource unique_id
- Use Terraform resource references for inter-resource dependencies (not hardcoded IDs)
- Generate architecture-specific tags/labels on resources
- Requirements: 3.1, 3.2, 3.5, 3.6
-
5.2 Implement identifier sanitization
- Create `sanitize_identifier()` function
- Convert resource names to valid Terraform identifiers: `^[a-zA-Z_][a-zA-Z0-9_]*$`
- Handle special characters, unicode, leading digits, spaces by replacing with underscores
- Ensure non-empty output for any input
- Requirements: 3.4
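One possible shape for `sanitize_identifier()`, assuming a leading digit is handled by prefixing an underscore rather than replacing the digit (the plan does not say which). It targets the identifier pattern given in the task.

```python
import re

# Pattern from task 5.2 that every output must match.
VALID_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_identifier(name: str) -> str:
    """Map any string to a name matching VALID_IDENTIFIER."""
    # Spaces, punctuation, and non-ASCII characters all become underscores.
    cleaned = _INVALID_CHARS.sub("_", name)
    # Empty input or a leading digit gets an underscore prefix (assumed rule).
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```

Distinct inputs can collapse to the same output (for example `a-b` and `a b`), so the generator would still need a uniqueness step on top of this.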
-
5.3 Implement variable extraction logic
- Identify attribute values appearing in 2+ resources
- Extract shared values into `variables.tf` with defaults set to most common value
- Generate variable declarations with type expressions and descriptions
- Requirements: 3.3
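A sketch of the extraction rule, assuming resources are represented as a mapping from resource name to a flat attribute dict with hashable values, and that one variable is produced per attribute name. Both are simplifying assumptions; the function name is hypothetical.

```python
from collections import Counter, defaultdict


def extract_shared_values(resources: dict) -> dict:
    """Return {attribute_name: default} for values used by 2+ resources.

    The default is the most common value seen for that attribute.
    """
    values_by_attribute = defaultdict(Counter)
    for attributes in resources.values():
        for key, value in attributes.items():
            values_by_attribute[key][value] += 1

    variables = {}
    for key, counts in values_by_attribute.items():
        value, count = counts.most_common(1)[0]
        if count >= 2:  # only values shared by at least two resources
            variables[key] = value
    return variables
```

Attributes whose values never repeat stay inline in the resource blocks.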
-
5.4 Implement provider configuration block generation
- Generate separate provider blocks for each distinct provider used
- Include platform-specific configuration (endpoints, certificate settings)
- Requirements: 5.4
-
5.5 Implement multi-provider resource merging with conflict resolution
- Merge resources from multiple scan profiles into unified inventory
- Resolve naming conflicts by prefixing with provider identifier
- Preserve provider-specific attributes
- Requirements: 5.3
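The naming-conflict rule might look like the sketch below, which prefixes a name with its provider identifier only when more than one provider uses it. Reducing resources to bare names and the `provider_name` format are assumptions for illustration.

```python
from collections import Counter


def merge_inventories(inventories: dict) -> list:
    """Merge per-provider name lists, prefixing names that collide.

    inventories maps a provider identifier to the resource names it found.
    """
    # Count how many providers use each name (not how often it appears).
    providers_per_name = Counter(
        name for names in inventories.values() for name in set(names)
    )
    merged = []
    for provider, names in inventories.items():
        for name in names:
            if providers_per_name[name] > 1:
                merged.append(f"{provider}_{name}")
            else:
                merged.append(name)
    return merged
```

Names that are unique across providers are left untouched, which keeps the generated identifiers short in the common case.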
-
5.6 Write property tests for Code Generator (Properties 10, 11, 12, 13, 14, 15)
- Property 10: References in generated output use Terraform syntax
- Property 11: Generated HCL syntactic validity
- Property 12: File organization by resource type
- Property 13: Variable extraction for shared values
- Property 14: Identifier sanitization validity
- Property 15: Traceability comments in generated code
- Validates: Requirements 2.2, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6
-
6. Implement State Builder
-
6.1 Implement Terraform state file generation (format v4)
- Create `StateBuilder` class
- Generate state JSON with version=4, unique UUID lineage, serial number
- Create state entries binding each resource block to its live infrastructure ID
- Populate full attribute sets from discovery data
- Set schema_version matching provider version from scan profile
- Mark sensitive attributes per provider schema
- Include dependency references in state entries
- Requirements: 4.1, 4.2, 4.4, 4.5
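A rough sketch of the state document assembled in 6.1. The key names mirror what Terraform writes in version 4 state files, but that format is internal to Terraform rather than a stable public interface, so they should be treated as an assumption and checked against a real state file. The entry dict shape, the function names, and the `terraform_version` default are hypothetical.

```python
import json
import uuid


def build_state(entries: list, serial: int = 1,
                terraform_version: str = "1.5.0") -> dict:
    """Assemble a version 4 state document from simplified entries."""
    resources = []
    for entry in entries:
        resources.append({
            "mode": "managed",
            "type": entry["type"],
            "name": entry["name"],
            "provider": entry["provider"],
            "instances": [{
                "schema_version": entry.get("schema_version", 0),
                "attributes": entry["attributes"],
                "sensitive_attributes": entry.get("sensitive_attributes", []),
                "dependencies": entry.get("dependencies", []),
            }],
        })
    return {
        "version": 4,
        "terraform_version": terraform_version,  # assumed default
        "serial": serial,
        "lineage": str(uuid.uuid4()),  # unique per generated state
        "outputs": {},
        "resources": resources,
    }


def serialize_state(state: dict) -> str:
    """Render the state as indented JSON text."""
    return json.dumps(state, indent=2)
```

Resources without a provider-assigned ID would be filtered out before this step, as task 6.2 describes.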
-
6.2 Implement unmapped resource handling in state builder
- Log warnings for resources that cannot be mapped to state entries
- Handle missing provider-assigned resource identifiers
- Exclude unmapped resources from state file
- Requirements: 4.3, 4.6
-
6.3 Write property tests for State Builder (Properties 16, 17)
- Property 16: State file structural validity
- Property 17: State entry completeness and schema correctness
- Validates: Requirements 4.1, 4.2, 4.4, 4.5
-
7. Checkpoint - Ensure all tests pass
- Ensure all tests pass, ask the user if questions arise.
-
8. Implement Validator
-
8.1 Implement Terraform validation runner
- Create `Validator` class
- Run `terraform init` and `terraform validate` against generated output
- Run `terraform plan` and check for zero planned changes
- Report validation errors with file name and error description
- Report drift: list each resource with planned change type (add, modify, destroy)
- Handle missing Terraform binary with descriptive error
- Requirements: 7.1, 7.2, 7.3, 7.5
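The drift report could be derived from the JSON form of a saved plan (the output of `terraform show -json` on a plan file), whose `resource_changes` entries carry an `address` and a `change.actions` list. The sketch below assumes that document has already been parsed into a dict; the function name and the choice to report replacements as `modify` are assumptions.

```python
def drift_report(plan: dict) -> dict:
    """Map each drifting resource address to add, modify, or destroy."""
    report = {}
    for resource_change in plan.get("resource_changes", []):
        actions = set(resource_change["change"]["actions"])
        if actions <= {"no-op", "read"}:
            continue  # nothing would change for this resource
        if actions == {"create"}:
            kind = "add"
        elif actions == {"delete"}:
            kind = "destroy"
        else:
            kind = "modify"  # in-place update or replacement (assumed mapping)
        report[resource_change["address"]] = kind
    return report
```

An empty report corresponds to the "zero planned changes" success condition.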
-
8.2 Implement auto-correction loop for validation errors
- Attempt to correct validation errors (up to 3 attempts)
- Re-validate after each correction
- Report failure with remaining error details if corrections exhausted
- Requirements: 7.4
-
8.3 Write property test for drift report correctness (Property 22)
- Property 22: Drift report correctness
- Validates: Requirements 7.3
-
9. Implement Incremental Scan Engine
-
9.1 Implement scan snapshot storage and retrieval
- Store scan results as timestamped JSON in `.iac-reverse/snapshots/`
- Use profile_hash for matching scans to profiles
- Retain at least 2 most recent snapshots per profile
- Load previous snapshot for comparison
- Requirements: 8.4, 8.6
-
9.2 Implement change detection and classification
- Compare current scan against previous snapshot
- Classify resources as added, removed, or modified
- Produce change summary with counts and resource details
- Handle first scan (no previous) as full initial scan
- Requirements: 8.1, 8.4, 8.5
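Change classification reduces to set operations on resource IDs plus an attribute comparison. In this sketch a snapshot is assumed to be a dict from resource ID to its attribute dict, and the first scan is reported as all resources added with a flag set; the field names on `ChangeSummary` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChangeSummary:
    added: list = field(default_factory=list)
    removed: list = field(default_factory=list)
    modified: list = field(default_factory=list)
    is_initial_scan: bool = False


def classify_changes(previous: Optional[dict], current: dict) -> ChangeSummary:
    """Compare two snapshots keyed by resource ID."""
    if previous is None:
        # No earlier snapshot: treat this as a full initial scan.
        return ChangeSummary(added=sorted(current), is_initial_scan=True)
    previous_ids, current_ids = set(previous), set(current)
    return ChangeSummary(
        added=sorted(current_ids - previous_ids),
        removed=sorted(previous_ids - current_ids),
        modified=sorted(
            rid for rid in previous_ids & current_ids
            if previous[rid] != current[rid]
        ),
    )
```

The counts for the change summary are then the lengths of the three lists.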
-
9.3 Implement incremental code and state updates
- Update only IaC files containing changed resources (not full regeneration)
- Remove resource blocks and state entries for removed resources
- Add/update blocks for added/modified resources
- Requirements: 8.2, 8.3
-
9.4 Write property tests for Incremental Scan (Properties 23, 24, 25, 26)
- Property 23: Change classification correctness
- Property 24: Incremental update scope
- Property 25: Removed resource exclusion
- Property 26: Snapshot retention
- Validates: Requirements 8.1, 8.2, 8.3, 8.5, 8.6
-
10. Implement CLI and wire pipeline together
-
10.1 Implement CLI entry point with Click
- Create `cli.py` with Click command group
- Implement `scan` command accepting scan profile YAML path
- Implement `generate` command to run full pipeline (scan → resolve → generate → state → validate)
- Implement `diff` command for incremental scanning
- Implement `validate` command for standalone validation
- Implement `login` command for Authentik SSO authentication
- Wire all pipeline components together in correct order
- Add progress bars and formatted output for scan progress
- Requirements: 1.1, 1.5, 6.1, 6.2, 6.3, 6.4, 6.5
-
10.2 Implement scan profile YAML loading and environment variable expansion
- Parse YAML scan profiles
- Expand `${ENV_VAR}` references in credential fields
- Support multi-profile YAML for multi-provider scans
- Requirements: 6.1, 5.3
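Expansion of `${ENV_VAR}` references can be sketched with a regular expression and a lookup callback. Raising on an unset variable, rather than substituting an empty string, is an assumed policy, and the function name is hypothetical.

```python
import os
import re
from typing import Mapping, Optional

_ENV_REFERENCE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, environ: Optional[Mapping] = None) -> str:
    """Replace each ${NAME} in value with the variable's value."""
    env = os.environ if environ is None else environ

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumed policy: fail loudly instead of using an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_REFERENCE.sub(lookup, value)
```

Applying this only to credential fields after YAML parsing keeps secrets out of the profile file itself.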
-
10.3 Write property tests for multi-provider and filtering (Properties 18, 19, 20, 21)
- Property 18: Multi-provider merge with naming conflict resolution
- Property 19: Provider block generation
- Property 20: Scan profile validation completeness (additional coverage)
- Property 21: Filtering correctness
- Validates: Requirements 5.3, 5.4, 6.1, 6.2, 6.4, 6.6, 6.7
-
11. Implement resource type filter and multi-provider failure handling
-
11.1 Implement resource type filtering in scanner
- When filters specified, discover only listed resource types
- When no filters specified, discover all supported types for provider
- Requirements: 6.2, 6.3
-
11.2 Implement multi-provider partial failure handling
- Complete scanning for all remaining providers when one fails
- Include successfully discovered resources in inventory
- Report which providers failed with error details
- Requirements: 5.5
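Partial failure handling amounts to isolating each provider's scan so that one exception does not abort the rest. In this sketch each provider is represented by a zero-argument callable returning its resources; that shape and the function name are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def scan_all_providers(
    scanners: Dict[str, Callable[[], list]],
) -> Tuple[List, Dict[str, str]]:
    """Run every provider scan and collect results and failures.

    Returns the combined inventory and a map of provider -> error message.
    """
    inventory: List = []
    failures: Dict[str, str] = {}
    for provider, scan in scanners.items():
        try:
            inventory.extend(scan())
        except Exception as exc:  # keep going; report the failure later
            failures[provider] = str(exc)
    return inventory, failures
```

The caller can print the `failures` map alongside the inventory so the user sees which providers need a rescan.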
-
12. Final checkpoint - Ensure all tests pass
- Ensure all tests pass, ask the user if questions arise.
Notes
- Tasks marked with `*` are optional and can be skipped for a faster MVP
- Each task references specific requirements for traceability
- Checkpoints ensure incremental validation
- Property tests validate universal correctness properties from the design document
- Unit tests validate specific examples and edge cases
- The tool is Python-based using Hypothesis for property-based testing
- All provider plugins conform to the `ProviderPlugin` abstract interface
- Pipeline architecture ensures each component is independently testable
- Providers: Docker Swarm, Kubernetes, Synology, Harvester, Bare Metal, Windows, Authentik
- Platform categories: Container Orchestration, Storage Appliance, HCI, Bare Metal, Windows (no Hypervisor category)
- Windows discovery uses pywinrm/WMI for services, IIS, scheduled tasks, Hyper-V, and more
Task Dependency Graph
{
"waves": [
{ "id": 0, "tasks": ["1.1"] },
{ "id": 1, "tasks": ["1.2"] },
{ "id": 2, "tasks": ["1.3", "2.1"] },
{ "id": 3, "tasks": ["1.4", "2.3", "2.4", "2.5", "2.6", "2.7", "2.8", "2.9"] },
{ "id": 4, "tasks": ["2.2", "2.10"] },
{ "id": 5, "tasks": ["4.1"] },
{ "id": 6, "tasks": ["4.2", "4.3"] },
{ "id": 7, "tasks": ["4.4", "5.1", "5.2"] },
{ "id": 8, "tasks": ["5.3", "5.4", "5.5"] },
{ "id": 9, "tasks": ["5.6", "6.1"] },
{ "id": 10, "tasks": ["6.2"] },
{ "id": 11, "tasks": ["6.3", "8.1"] },
{ "id": 12, "tasks": ["8.2"] },
{ "id": 13, "tasks": ["8.3", "9.1"] },
{ "id": 14, "tasks": ["9.2"] },
{ "id": 15, "tasks": ["9.3"] },
{ "id": 16, "tasks": ["9.4", "10.1", "11.1", "11.2"] },
{ "id": 17, "tasks": ["10.2"] },
{ "id": 18, "tasks": ["10.3"] }
]
}