Application Architecture

Version: 1.0.0
Last Updated: 2025-01-31
Status: Production Ready


Executive Summary

VPC Provisioner is an enterprise-grade AWS infrastructure automation tool that provisions secure, isolated Virtual Private Cloud (VPC) environments through a configuration-driven approach. Built for cloud infrastructure teams, the tool reduces VPC deployment time from hours to minutes while enforcing AWS Well-Architected Framework best practices for network security, segmentation, and compliance.

Key Capabilities:

  • Automated VPC provisioning with three-tier network architecture (public, private, database subnets)

  • Multi-region and multi-account support with standardized naming conventions

  • AWS Marketplace integration with license validation

  • Comprehensive audit trails and HTML reporting

  • Docker containerization for consistent deployment

  • Cython-compiled core modules for code protection


Table of Contents

  1. Introduction & Context

  2. Architectural Representation

  3. Technical Strategy & Decisions

  4. Component Architecture

  5. Data Architecture

  6. Security Architecture

  7. Deployment Architecture

  8. Quality Attributes

  9. Integration Architecture

  10. Operational Architecture


1. Introduction & Context

1.1 Purpose & Scope

Purpose: Automate the provisioning of AWS Virtual Private Cloud (VPC) infrastructure for cloud workloads, providing isolated, secure network environments with standardized configurations across multiple AWS accounts, regions, and environments.

Scope:

  • VPC creation with configurable CIDR blocks

  • Three-tier subnet architecture (public, private, database)

  • Internet Gateway and NAT Gateway provisioning

  • Route table and Network ACL configuration

  • CloudFormation-based infrastructure as code

  • IAM policy generation for least-privilege access

  • HTML report generation for deployment documentation

  • AWS Marketplace license validation

Out of Scope:

  • Application deployment within VPCs

  • Security group rule management (handled by application teams)

  • VPC peering or Transit Gateway configuration

  • Direct Connect or VPN setup

1.2 Business Goals

Operational Efficiency:

  • Reduce VPC provisioning time from 2-4 hours (manual) to 3-5 minutes (automated)

  • Eliminate human error in network configuration

  • Enable self-service infrastructure provisioning for development teams

Security & Compliance:

  • Enforce network segmentation best practices by design

  • Ensure consistent security posture across all environments

  • Provide audit trails for compliance requirements (SOC 2, ISO 27001, HIPAA)

  • Implement least-privilege IAM policies automatically

Cost Optimization:

  • Configurable high-availability NAT Gateway (balance cost vs. availability)

  • Efficient CIDR block allocation to minimize IP address waste

  • Resource tagging for accurate cost allocation and chargeback

Standardization:

  • Consistent network architecture across all AWS accounts and regions

  • Reusable configuration templates for common deployment patterns

  • Version-controlled infrastructure definitions

Multi-Region Strategy:

  • Rapid deployment of consistent network infrastructure for disaster recovery

  • Geographic distribution for latency optimization

  • Compliance with data residency requirements

1.3 Stakeholders

Primary Users:

  • Cloud Engineers: Design and deploy VPC infrastructure for applications

  • DevOps Engineers: Integrate VPC provisioning into CI/CD pipelines

  • Platform Engineers: Maintain standardized network blueprints

  • Infrastructure Architects: Define network topology and security policies

Secondary Users:

  • Security Teams: Audit network configurations and enforce policies

  • Compliance Officers: Verify adherence to regulatory requirements

  • Application Teams: Consume VPC infrastructure for workload deployment

  • Finance Teams: Track infrastructure costs via resource tagging

Executive Stakeholders:

  • CTO/VP Engineering: Oversee infrastructure automation strategy

  • CISO: Ensure security and compliance posture

  • CFO: Monitor infrastructure costs and ROI

1.4 Success Criteria

Functional:

  • ✅ Provision VPC with all required components in < 5 minutes

  • ✅ Support multiple AWS regions and accounts

  • ✅ Generate valid CloudFormation templates

  • ✅ Validate configurations before deployment

  • ✅ Provide comprehensive deployment reports

Non-Functional:

  • ✅ 99.9% success rate for VPC provisioning operations

  • ✅ Zero security vulnerabilities in application code

  • ✅ Complete audit trail for all operations

  • ✅ Support for 100+ concurrent VPC deployments

  • ✅ Documentation coverage > 90%


2. Architectural Representation

2.1 System Context Diagram

┌───────────────────────────────────────────────────────────────────┐
│                        External Context                           │
│                                                                   │
│  ┌──────────────┐                                                 │
│  │ Cloud/DevOps │                                                 │
│  │  Engineer    │                                                 │
│  └──────┬───────┘                                                 │
│         │ YAML Config                                             │
│         │ CLI Commands                                            │
│         ▼                                                         │
│  ┌─────────────────────────────────────────────────────────┐      │
│  │           VPC Provisioner Application                   │      │
│  │  ┌────────────────────────────────────────────────────┐ │      │
│  │  │  • Configuration Validation                        │ │      │
│  │  │  • CloudFormation Template Generation              │ │      │
│  │  │  • AWS API Integration (boto3)                     │ │      │
│  │  │  • License Validation (AWS Marketplace)            │ │      │
│  │  │  • HTML Report Generation                          │ │      │
│  │  │  • Audit Logging                                   │ │      │
│  │  └────────────────────────────────────────────────────┘ │      │
│  └─────────────────┬───────────────────────────────────────┘      │
│                    │ AWS API Calls (HTTPS/TLS)                    │
│                    ▼                                              │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │              Amazon Web Services (AWS)                      │  │
│  │  ┌───────────────┐  ┌──────────────┐  ┌──────────────────┐  │  │
│  │  │ CloudFormation│  │   Amazon VPC │  │  AWS Marketplace │  │  │
│  │  │   Service     │  │   Service    │  │  Metering API    │  │  │
│  │  └───────────────┘  └──────────────┘  └──────────────────┘  │  │
│  │  ┌──────────────┐  ┌──────────────┐                         │  │
│  │  │     IAM      │  │  CloudWatch  │                         │  │
│  │  │   Service    │  │    Logs      │                         │  │
│  │  └──────────────┘  └──────────────┘                         │  │
│  └─────────────────────────────────────────────────────────────┘  │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

Key Relationships:

  • User provides YAML configuration defining network requirements

  • Application validates configuration against JSON schema

  • Application generates CloudFormation template dynamically

  • Application calls AWS APIs to create/manage VPC resources

  • AWS CloudFormation orchestrates resource provisioning

  • Application validates AWS Marketplace license (if applicable)

  • Application generates HTML reports and audit logs

2.2 Container View (Deployment Units)

┌────────────────────────────────────────────────────────────────┐
│                    Docker Container                            │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Python 3.13 Runtime Environment                         │  │
│  │  ┌────────────────────────────────────────────────────┐  │  │
│  │  │  VPC Provisioner Application                       │  │  │
│  │  │  • CLI Interface (cli.py)                          │  │  │
│  │  │  • Core Logic (.so compiled modules)               │  │  │
│  │  │  • Configuration Loader                            │  │  │
│  │  │  • License Validator                               │  │  │
│  │  │  • HTML Generator                                  │  │  │
│  │  └────────────────────────────────────────────────────┘  │  │
│  │  ┌────────────────────────────────────────────────────┐  │  │
│  │  │  Dependencies (installed via uv)                   │  │  │
│  │  │  • boto3 (AWS SDK)                                 │  │  │
│  │  │  • PyYAML (config parsing)                         │  │  │
│  │  │  • jsonschema (validation)                         │  │  │
│  │  └────────────────────────────────────────────────────┘  │  │
│  └──────────────────────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Mounted Volumes                                         │  │
│  │  • /app/configs (read-only) - YAML configurations        │  │
│  │  • /app/policies (read-write) - Generated IAM policies   │  │
│  │  • /app/reports (read-write) - Logs and HTML reports     │  │
│  │  • ~/.aws (read-only) - AWS credentials                  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                │
│  Non-root user: vpcuser (UID 1000)                             │
│  Base image: python:3.13-slim (Debian)                         │
│  Health check: Python import validation                        │
└────────────────────────────────────────────────────────────────┘

Container Characteristics:

  • Isolation: Runs as non-root user (vpcuser) with restricted permissions

  • Portability: Consistent execution environment across development, staging, production

  • Security: No hardcoded credentials, read-only credential mounts

  • Observability: Health checks for container monitoring

  • Immutability: Application code is compiled to .so files, which makes tampering harder

2.3 Component View (Internal Modules)

vpc_provisioner/
├── cli.py                    # Command-line interface entry point
├── __main__.py               # Python module execution entry
│
├── config/
│   ├── loader.py (.so)       # Configuration loading and validation
│   └── app_config.yaml       # Application-level configuration
│
├── core/
│   └── vpc_manager.py (.so)  # Core VPC provisioning logic
│
├── license/
│   └── validator.py (.so)    # AWS Marketplace license validation
│
└── utils/
    └── html_generator.py (.so) # HTML report generation

common/ (shared library)
├── utils/
│   ├── config_loader.py      # YAML configuration parsing
│   ├── aws_access.py         # AWS credential verification
│   └── timing.py             # Performance timing utilities
│
└── validators/
    └── schema_validator.py   # JSON schema validation

Module Responsibilities:

cli.py (Entry Point):

  • Parse command-line arguments (--config, --action)

  • Initialize logging configuration

  • Dispatch actions to VPC Manager

  • Handle top-level exceptions

  • Display user-friendly error messages

config/loader.py (Configuration Management):

  • Load YAML configuration files

  • Validate against JSON schema

  • Flatten nested configuration into attributes

  • Compute derived values (VPC name, stack name)

  • Handle configuration defaults

core/vpc_manager.py (Core Business Logic):

  • Orchestrate VPC provisioning workflow

  • Generate CloudFormation templates

  • Manage AWS CloudFormation stacks

  • Handle waiter configuration for async operations

  • Generate IAM policies

  • Create HTML deployment reports

  • Manage artifact paths (logs, templates, policies, reports)

license/validator.py (License Management):

  • Validate AWS Marketplace product code

  • Check license entitlements

  • Handle license expiration

  • Report license status

utils/html_generator.py (Reporting):

  • Generate HTML provisioning reports

  • Generate HTML deployment reports

  • Format CloudFormation template previews

  • Create visual network topology diagrams

  • Include metadata (timestamps, user, region)

common/utils/ (Shared Utilities):

  • config_loader: Reusable YAML parsing logic

  • aws_access: AWS credential verification and region validation

  • timing: Performance measurement decorators
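To illustrate what a performance measurement decorator in the timing module might look like, here is a minimal sketch. The decorator name `timed`, the logger name, and the `build_template` stand-in are assumptions made for illustration; they are not taken from the actual module.

```python
import functools
import logging
import time

logger = logging.getLogger("vpc_provisioner.timing")  # hypothetical logger name


def timed(func):
    """Log the wall-clock duration of each call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on success and on failure, so failed calls are timed too.
            elapsed = time.perf_counter() - start
            logger.info("%s took %.3f s", func.__name__, elapsed)

    return wrapper


@timed
def build_template(subnet_count: int) -> list:
    # Stand-in for real work such as template generation.
    return [f"Subnet{i}" for i in range(1, subnet_count + 1)]
```

Using `functools.wraps` keeps the wrapped function's name and docstring intact, which matters when the name appears in logs.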

common/validators/ (Shared Validation):

  • schema_validator: JSON schema validation engine


3. Technical Strategy & Decisions

3.1 Technology Stack

Core Technologies:

  • Python 3.13: Primary programming language

    • Modern async/await support

    • Type hints for code clarity

    • Comprehensive standard library

  • Cython: Compilation of core modules to .so files

    • Code protection (obfuscation)

    • Performance optimization

    • Intellectual property protection

  • boto3 1.42+: AWS SDK for Python

    • VPC and EC2 API operations

    • CloudFormation stack management

    • IAM policy operations

    • AWS Marketplace metering

  • uv: Modern Python package manager

    • Fast dependency resolution

    • Reproducible builds via uv.lock

    • Virtual environment management

Infrastructure Technologies:

  • AWS CloudFormation: Infrastructure as Code engine

    • Declarative resource definitions

    • Atomic operations with rollback

    • Dependency management

    • Change sets for preview

  • Amazon VPC: Virtual Private Cloud service

    • Network isolation

    • Subnet segmentation

    • Route table management

    • Gateway configuration

  • Docker: Containerization platform

    • Consistent runtime environment

    • Multi-stage builds for optimization

    • Non-root user execution

    • Health check integration

Configuration & Validation:

  • YAML: Human-readable configuration format

  • JSON Schema: Configuration validation

  • Jinja2: Template rendering (CloudFormation)

Development Tools:

  • Git: Version control

  • Make: Build automation

  • Trivy: Container security scanning

3.2 Architecture Decision Records (ADRs)

ADR-001: CloudFormation vs. Direct Boto3 Resource Creation

Decision: Use AWS CloudFormation for VPC resource provisioning

Context: Need to provision multiple interdependent AWS resources (VPC, subnets, route tables, gateways) with proper dependency management and error handling.

Alternatives Considered:

  1. Direct boto3 API calls for each resource

  2. Terraform

  3. AWS CDK

  4. CloudFormation (chosen)

Rationale:

  • Atomic Operations: CloudFormation treats all resources as a single unit, ensuring all-or-nothing deployment

  • Automatic Rollback: Failed deployments automatically roll back to previous state

  • Dependency Management: CloudFormation handles resource dependencies automatically (e.g., VPC must exist before subnets)

  • State Management: CloudFormation maintains resource state, enabling updates and drift detection

  • Native AWS Integration: No third-party tools required, reducing operational complexity

  • Change Sets: Preview changes before applying them

Consequences:

  • ✅ Simplified error handling and recovery

  • ✅ Consistent resource state management

  • ✅ Built-in drift detection

  • ⚠️ CloudFormation-specific limitations (e.g., resource limits per stack)

  • ⚠️ Async operations require waiter pattern implementation

ADR-002: Configuration-Driven Architecture

Decision: All VPC topology defined in YAML configuration files

Context: Need to support multiple clients, environments, regions, and network topologies without code changes.

Rationale:

  • Separation of Concerns: Network topology (data) separated from provisioning logic (code)

  • Version Control: Configurations can be versioned, reviewed, and audited in Git

  • Reusability: Common patterns can be templated and reused

  • Self-Service: Non-developers can modify network configurations

  • Validation: JSON schema ensures configuration correctness before deployment

Consequences:

  • ✅ Flexible and extensible without code changes

  • ✅ Configuration can be reviewed and approved separately

  • ✅ Easy to create environment-specific variations

  • ⚠️ Requires robust validation to prevent misconfigurations

ADR-003: Three-Tier Subnet Architecture

Decision: Enforce three-tier subnet design (public, private, database)

Context: Need to provide secure network segmentation following AWS Well-Architected Framework.

Rationale:

  • Security Best Practice: Follows AWS recommended architecture for network segmentation

  • Defense in Depth: Multiple layers of network isolation

  • Compliance: Meets requirements for PCI-DSS, HIPAA, SOC 2

  • Flexibility: Supports various application architectures (web, app, data tiers)

Subnet Tiers:

  1. Public Subnets: Load balancers, bastion hosts, NAT gateways

  2. Private Subnets: Application servers, containers, compute workloads

  3. Database Subnets: RDS, ElastiCache, data stores (no internet access)

Consequences:

  • ✅ Strong security posture by default

  • ✅ Clear separation of concerns

  • ✅ Compliance-ready architecture

  • ⚠️ More complex than single-tier design

  • ⚠️ Requires proper route table configuration

ADR-004: Cython Compilation for Code Protection

Decision: Compile core Python modules to .so files using Cython

Context: Need to protect intellectual property and prevent code tampering in commercial product.

Rationale:

  • Code Protection: .so files are binary, preventing easy reverse engineering

  • IP Protection: Core business logic not exposed as plain text

  • Tamper Resistance: Compiled code harder to modify

  • Performance: Potential performance improvements from C compilation

  • AWS Marketplace: Common practice for commercial AWS Marketplace products

Modules Compiled:

  • config/loader.py → loader.cpython-313-x86_64-linux-gnu.so

  • core/vpc_manager.py → vpc_manager.cpython-313-x86_64-linux-gnu.so

  • license/validator.py → validator.cpython-313-x86_64-linux-gnu.so

  • utils/html_generator.py → html_generator.cpython-313-x86_64-linux-gnu.so

Consequences:

  • ✅ Protected intellectual property

  • ✅ Harder to reverse engineer or tamper with

  • ✅ Professional commercial product appearance

  • ⚠️ Platform-specific binaries (Linux x86_64)

  • ⚠️ Debugging more difficult (no source line numbers)

ADR-005: Docker Containerization

Decision: Distribute application as Docker container

Context: Need consistent execution environment across development, staging, production, and customer environments.

Rationale:

  • Consistency: Same environment everywhere (dependencies, Python version, OS)

  • Isolation: No conflicts with host system packages

  • Security: Non-root user execution, read-only mounts

  • Portability: Runs on any Docker-compatible platform

  • AWS Marketplace: Standard distribution method for containerized products

Container Design:

  • Multi-stage build (builder + runtime)

  • Minimal base image (python:3.13-slim)

  • Non-root user (vpcuser, UID 1000)

  • Health checks for monitoring

  • OCI metadata labels

Consequences:

  • ✅ Consistent execution environment

  • ✅ Simplified dependency management

  • ✅ Enhanced security through isolation

  • ✅ Easy to deploy and scale

  • ⚠️ Requires Docker runtime on host

ADR-006: High-Availability NAT Gateway (Optional)

Decision: Make HA NAT Gateway configuration optional

Context: NAT Gateways are expensive ($0.045/hour + data transfer). Not all workloads require HA.

Rationale:

  • Cost Optimization: Single NAT Gateway costs ~$32/month, HA costs ~$64-96/month (2-3 AZs)

  • Flexibility: Let customers choose based on workload criticality

  • Default: Standard (single NAT) for cost-conscious deployments

  • Option: HA (NAT per AZ) for production workloads requiring fault tolerance
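The monthly figures above follow from the hourly rate quoted in the context. The sketch below reproduces that arithmetic under two assumptions that are not stated in this document: an average month of 730 hours, and data transfer charges left out. The function name `monthly_nat_cost` is illustrative only.

```python
HOURLY_RATE_USD = 0.045  # per NAT Gateway, as quoted in the context above
HOURS_PER_MONTH = 730    # assumption: average month (8760 h / 12)


def monthly_nat_cost(az_count: int, high_availability: bool) -> float:
    """Fixed monthly NAT Gateway charge in USD, excluding data transfer."""
    # HA places one gateway in every AZ; the standard mode uses a single gateway.
    gateways = az_count if high_availability else 1
    return round(gateways * HOURLY_RATE_USD * HOURS_PER_MONTH, 2)


for azs in (2, 3):
    standard = monthly_nat_cost(azs, high_availability=False)
    ha = monthly_nat_cost(azs, high_availability=True)
    print(f"{azs} AZs: standard ${standard:.2f}/month, HA ${ha:.2f}/month")
```

The exact results sit slightly above the rounded figures in the rationale because those figures round the single-gateway cost down to $32 before multiplying.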

Configuration:

nat_gateway:
  enabled: true
  high_availability: false  # true = NAT per AZ, false = single NAT

Consequences:

  • ✅ Cost-effective default configuration

  • ✅ Flexibility for different use cases

  • ✅ Clear trade-off between cost and availability

  • ⚠️ Single NAT Gateway is single point of failure (acceptable for dev/test)

ADR-007: Waiter Pattern for Async Operations

Decision: Implement explicit waiter configuration for CloudFormation operations

Context: CloudFormation stack operations are asynchronous and can take several minutes.

Rationale:

  • Predictable Timeouts: Explicit configuration prevents indefinite waiting

  • Error Handling: Proper timeout handling enables graceful failure

  • User Experience: Progress feedback during long-running operations

  • Reliability: Retry logic for transient failures

Configuration:

waiter_config = {
    'Delay': 10,        # Poll every 10 seconds
    'MaxAttempts': 60   # Maximum 10 minutes (60 * 10s)
}

Consequences:

  • ✅ Predictable operation timeouts

  • ✅ Better error messages for timeout scenarios

  • ✅ Configurable for different deployment sizes

  • ⚠️ Requires tuning for large VPC deployments
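The polling behaviour that the Delay and MaxAttempts settings control can be sketched in plain Python. This is an illustration of the waiter pattern, not boto3's implementation; `wait_for_status` and its parameters are hypothetical names, and the status source is injected so the example runs without AWS access.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    success: str,
    failures: set,
    delay: float = 10,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until `success` is seen; return the number of polls used."""
    for attempt in range(1, max_attempts + 1):
        status = get_status()
        if status == success:
            return attempt
        if status in failures:
            # Fail fast instead of waiting out the full timeout.
            raise RuntimeError(f"stack entered failure state: {status}")
        if attempt < max_attempts:
            sleep(delay)
    raise TimeoutError(f"no terminal state after {max_attempts} polls")


# Simulated stack that completes on the third poll (no real sleeping).
statuses = iter(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS", "CREATE_COMPLETE"])
polls = wait_for_status(
    lambda: next(statuses),
    success="CREATE_COMPLETE",
    failures={"ROLLBACK_COMPLETE", "CREATE_FAILED"},
    sleep=lambda seconds: None,
)
```

With boto3 itself, the same two settings are passed as the `WaiterConfig` argument to a waiter's `wait()` call, for example on the CloudFormation `stack_create_complete` waiter.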

ADR-008: Naming Convention with Tenant ID

Decision: Use a tenant ID instead of the AWS account ID in resource names

Context: Account IDs (12 digits) are long and not human-readable. Need shorter, meaningful identifiers.

Naming Pattern:

{company_prefix}-{environment}-{tenant_id}-{region}-vpc
Example: edge-prod-a001-us-west-2-vpc

Rationale:

  • Readability: a001 more readable than 123456789012

  • Brevity: Shorter names in AWS console and CLI output

  • Consistency: Standardized naming across all resources

  • Uniqueness: Account alias ensures global uniqueness

Consequences:

  • ✅ Human-readable resource names

  • ✅ Easier to identify resources in AWS console

  • ✅ Shorter CloudFormation stack names

  • ⚠️ Requires a tenant ID to be set for each AWS account

3.3 Design Patterns

Command Pattern:

  • Actions (create-vpc, delete-vpc, test-deploy, show-changes, check-drift, create-prov-template, validate-prov-template, create-policy, validate-config) mapped to methods

  • Centralized action dispatch in VPC Manager

  • Consistent error handling across all actions
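A minimal sketch of how action names could be mapped to methods with a dispatch table follows. The class `ActionDispatcher`, its handler methods, and their return values are hypothetical; only three of the listed actions are wired up to keep the example short.

```python
class ActionDispatcher:
    """Maps CLI action names to handler methods (hypothetical class)."""

    def __init__(self) -> None:
        # One table entry per action keeps dispatch and error handling in one place.
        self._handlers = {
            "validate-config": self._validate_config,
            "create-vpc": self._create_vpc,
            "delete-vpc": self._delete_vpc,
        }

    def execute(self, action: str) -> str:
        try:
            handler = self._handlers[action]
        except KeyError:
            known = ", ".join(sorted(self._handlers))
            raise ValueError(
                f"unknown action {action!r}; expected one of: {known}"
            ) from None
        return handler()

    def _validate_config(self) -> str:
        return "config validated"

    def _create_vpc(self) -> str:
        return "vpc created"

    def _delete_vpc(self) -> str:
        return "vpc deleted"


result = ActionDispatcher().execute("create-vpc")
```

Adding a new action then means adding one handler and one table entry, without touching the dispatch logic.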

Factory Pattern:

  • Dynamic CloudFormation template generation based on configuration

  • Customized VPC configurations for different environments

  • Reusable template components (subnets, route tables, gateways)
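The idea of building a template from reusable components can be sketched with plain dictionaries. This is an assumption-laden illustration: the technology stack lists Jinja2 for template rendering, whereas this sketch assembles the template as a Python dict, and the function names, the flat config shape, and the `Tier` tag are all hypothetical.

```python
import json


def subnet_resource(vpc_ref: str, cidr: str, az: str, tier: str) -> dict:
    """Build one AWS::EC2::Subnet resource for the given tier."""
    return {
        "Type": "AWS::EC2::Subnet",
        "Properties": {
            "VpcId": {"Ref": vpc_ref},
            "CidrBlock": cidr,
            "AvailabilityZone": az,
            "Tags": [{"Key": "Tier", "Value": tier}],
        },
    }


def build_template(config: dict) -> dict:
    """Assemble a CloudFormation template dict from a simplified config."""
    resources = {
        "VPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": config["cidr_block"],
                "EnableDnsHostnames": True,
                "EnableDnsSupport": True,
            },
        }
    }
    for tier, subnets in config["subnets"].items():
        for index, (cidr, az) in enumerate(subnets, start=1):
            # Produces logical IDs such as PublicSubnet1 or DatabaseSubnet1.
            logical_id = f"{tier.capitalize()}Subnet{index}"
            resources[logical_id] = subnet_resource("VPC", cidr, az, tier)
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


example = {
    "cidr_block": "10.0.0.0/16",
    "subnets": {
        "public": [("10.0.0.0/24", "us-west-2a")],
        "private": [("10.0.10.0/24", "us-west-2a")],
        "database": [("10.0.20.0/24", "us-west-2a")],
    },
}
template = build_template(example)
print(json.dumps(template, indent=2))
```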

Strategy Pattern:

  • Different NAT Gateway strategies (standard vs. high-availability)

  • Configurable subnet allocation strategies

  • Flexible CIDR block calculation
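One possible subnet allocation strategy is to carve equally sized blocks out of the VPC CIDR in tier order. The sketch below uses Python's standard `ipaddress` module; the function name `allocate_subnets`, the default /24 size, and the sequential ordering are assumptions, not the tool's documented behaviour.

```python
import ipaddress

TIERS = ("public", "private", "database")


def allocate_subnets(vpc_cidr: str, az_count: int, new_prefix: int = 24) -> dict:
    """Assign one subnet per tier per AZ, taking blocks in ascending order."""
    network = ipaddress.ip_network(vpc_cidr)
    needed = len(TIERS) * az_count
    available = 2 ** (new_prefix - network.prefixlen)
    if needed > available:
        raise ValueError(f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets")
    blocks = network.subnets(new_prefix=new_prefix)
    return {tier: [str(next(blocks)) for _ in range(az_count)] for tier in TIERS}


layout = allocate_subnets("10.0.0.0/16", az_count=2)
for tier, cidrs in layout.items():
    print(tier, cidrs)
```

A different strategy, such as reserving a larger block for private subnets, could be swapped in behind the same function signature.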

Template Method Pattern:

  • Common provisioning workflow with customizable steps

  • Consistent validation → generation → deployment flow

  • Extensible for future resource types


4. Component Architecture

4.1 CLI Interface (cli.py)

Responsibilities:

  • Parse command-line arguments

  • Initialize logging configuration

  • Load application configuration

  • Dispatch actions to VPC Manager

  • Handle exceptions and display user-friendly errors

Key Functions:

def main():
    # Parse arguments: --config, --action, --log-level
    # Initialize logging
    # Load app configuration
    # Create VPC Manager instance
    # Dispatch action
    # Handle exceptions
    ...  # body elided; the comments above outline the steps

Supported Actions:

  • validate-config: Validate configuration without deployment

  • create-policy: Generate IAM policy only

  • create-prov-template: Generate CloudFormation template only

  • validate-prov-template: Validate generated template locally (no AWS calls)

  • show-changes: Preview pending changes via CloudFormation ChangeSet

  • check-drift: Detect infrastructure drift against deployed stack

  • test-deploy: Deploy with isolated test suffix for safe testing

  • create-vpc: Provision complete VPC infrastructure (requires --force)

  • delete-vpc: Delete VPC and all resources (requires --force)
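The argument handling for these actions could be sketched with `argparse` as follows. The flag names come from this document; everything else (required flags, the INFO default, the exact error text, the `parse_args` helper) is an assumption for illustration.

```python
import argparse

ACTIONS = [
    "validate-config", "create-policy", "create-prov-template",
    "validate-prov-template", "show-changes", "check-drift",
    "test-deploy", "create-vpc", "delete-vpc",
]
REQUIRES_FORCE = {"create-vpc", "delete-vpc"}  # actions listed as requiring --force


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="vpc_provisioner")
    parser.add_argument("--config", required=True, help="path to the YAML configuration")
    parser.add_argument("--action", required=True, choices=ACTIONS)
    parser.add_argument("--log-level", default="INFO")
    parser.add_argument("--force", action="store_true")
    args = parser.parse_args(argv)
    if args.action in REQUIRES_FORCE and not args.force:
        # parser.error prints usage and exits with status 2.
        parser.error(f"--action {args.action} requires --force")
    return args


args = parse_args(["--config", "vpc.yaml", "--action", "create-vpc", "--force"])
```

Restricting `--action` with `choices` lets `argparse` reject unknown actions before any configuration is loaded.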

Error Handling:

  • Configuration validation errors

  • AWS API errors (permissions, quotas, service issues)

  • CloudFormation stack failures

  • License validation failures

4.2 Configuration Loader (config/loader.py)

Responsibilities:

  • Load YAML configuration files

  • Validate against JSON schema

  • Flatten nested configuration into flat attributes

  • Compute derived values (VPC name, stack name, artifact paths)

  • Handle configuration defaults

Key Methods:

class ConfigLoader:
    def load_config(self, yaml_path: str) -> dict: ...
    def validate_config(self, config: dict) -> bool: ...
    def flatten_config(self, config: dict) -> "ConfigAttributes": ...
    def compute_vpc_name(self) -> str: ...
    def compute_stack_name(self) -> str: ...
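Flattening a nested configuration into attributes can be illustrated with a short recursive helper. The loader's actual key naming is not documented here, so the underscore-joined keys and the function name `flatten` are assumptions.

```python
def flatten(config: dict, parent: str = "", sep: str = "_") -> dict:
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in config.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested sections such as "client" or "environment".
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


nested = {
    "client": {"company_prefix": "edge", "tenant_id": "a001"},
    "environment": {"env": "prod"},
}
flat = flatten(nested)
```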

Configuration Schema Validation:

  • Client information (company_name, company_prefix, account_id, tenant_id)

  • Environment (env, region, availability_zones)

  • VPC settings (cidr_block, enable_dns_hostnames, enable_dns_support)

  • Subnet configuration (public, private, database)

  • Gateway configuration (internet_gateway, nat_gateway)

  • Tags (cost_center, project, owner, environment)

Derived Values:

  • VPC Name: {company_prefix}-{environment}-{tenant_id}-{region}-vpc

  • Stack Name: {vpc_name}-stack

  • Log File: {vpc_name}-{action}-{timestamp}.log

  • Template File: {vpc_name}-template.yaml

  • Policy File: {vpc_name}-iam-policy.json

  • Report File: {vpc_name}-{action}-{timestamp}.html
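The derived values above can be computed with simple string formatting. In this sketch the function name `derive_names` and the timestamp format are assumptions; the patterns themselves are the ones listed above.

```python
from datetime import datetime, timezone


def derive_names(company_prefix, environment, tenant_id, region, action, now=None) -> dict:
    """Compute the VPC name and the artifact names that depend on it."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    vpc_name = f"{company_prefix}-{environment}-{tenant_id}-{region}-vpc"
    return {
        "vpc_name": vpc_name,
        "stack_name": f"{vpc_name}-stack",
        "log_file": f"{vpc_name}-{action}-{timestamp}.log",
        "template_file": f"{vpc_name}-template.yaml",
        "policy_file": f"{vpc_name}-iam-policy.json",
        "report_file": f"{vpc_name}-{action}-{timestamp}.html",
    }


names = derive_names(
    "edge", "prod", "a001", "us-west-2", "create-vpc",
    now=datetime(2025, 1, 31, 12, 0, 0, tzinfo=timezone.utc),
)
```

With these inputs the VPC name matches the example from ADR-008, edge-prod-a001-us-west-2-vpc.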

4.3 VPC Manager (core/vpc_manager.py)

Responsibilities:

  • Orchestrate VPC provisioning workflow

  • Generate CloudFormation templates

  • Manage CloudFormation stacks (create, delete, wait)

  • Generate IAM policies

  • Create HTML reports

  • Manage artifact paths and file operations

Key Methods:

class VPCManager:
    def __init__(self, config_path: str, action: str) -> None: ...

    # Action dispatch
    def execute_action(self) -> None: ...

    # VPC operations
    def _create_vpc(self) -> None: ...
    def _delete_vpc(self) -> None: ...
    def _create_prov_template(self) -> None: ...
    def _create_policy(self) -> None: ...

    # CloudFormation operations
    def _generate_cloudformation_template(self) -> dict: ...
    def _create_cloudformation_stack(self) -> None: ...
    def _delete_cloudformation_stack(self) -> None: ...
    def _wait_for_stack_creation(self) -> None: ...
    def _wait_for_stack_deletion(self) -> None: ...

    # Artifact management
    def _setup_artifact_paths(self) -> None: ...
    def _save_template(self, template: dict) -> None: ...
    def _save_policy(self, policy: dict) -> None: ...
    def _generate_html_report(self) -> None: ...

CloudFormation Template Structure:

AWSTemplateFormatVersion: '2010-09-09'
Description: VPC infrastructure for {vpc_name}

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsHostnames: true
      EnableDnsSupport: true
      Tags: [...]
  
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  
  VPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
  
  PublicSubnet1:
    Type: AWS::EC2::Subnet
  
  PrivateSubnet1:
    Type: AWS::EC2::Subnet
  
  DatabaseSubnet1:
    Type: AWS::EC2::Subnet
  
  NATGateway1:
    Type: AWS::EC2::NatGateway
  
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
  
  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
  
  DatabaseRouteTable:
    Type: AWS::EC2::RouteTable

Outputs:
  VPCId:
    Value: !Ref VPC
  # PublicSubnet2 and PrivateSubnet2 are omitted from Resources above for brevity
  PublicSubnetIds:
    Value: !Join [',', [!Ref PublicSubnet1, !Ref PublicSubnet2]]
  PrivateSubnetIds:
    Value: !Join [',', [!Ref PrivateSubnet1, !Ref PrivateSubnet2]]

Waiter Configuration:

waiter_config = {
    'Delay': 10,        # Poll every 10 seconds
    'MaxAttempts': 60   # Maximum 10 minutes
}

4.4 License Validator (license/validator.py)

Responsibilities:

  • Validate AWS Marketplace product code

  • Check license entitlements via AWS Marketplace Metering API

  • Handle license expiration and renewal

  • Report license status to user

Key Methods:

class LicenseValidator:
    def validate_license(self) -> bool: ...
    def check_entitlement(self) -> dict: ...
    def report_usage(self) -> None: ...
    def handle_expiration(self) -> None: ...

License Validation Flow:

  1. Read product code from environment variable

  2. Call AWS Marketplace Metering API

  3. Verify entitlement status

  4. Check expiration date

  5. Report usage metrics (if required)

  6. Return validation result
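The flow above can be sketched with the Marketplace call injected as a function, so the logic runs without AWS access. Everything specific here is hypothetical: the environment variable name `PRODUCT_CODE`, the shape of the entitlement dict (`active`, `expires_at`), and the returned messages. Step 5 (usage reporting) is left out.

```python
import os
from datetime import datetime, timezone
from typing import Callable, Optional


def validate_license(
    fetch_entitlement: Callable[[str], Optional[dict]],
    now: Optional[datetime] = None,
    env_var: str = "PRODUCT_CODE",  # hypothetical variable name
) -> tuple:
    """Return (is_valid, reason) following the validation flow."""
    product_code = os.environ.get(env_var)  # step 1: read product code
    if not product_code:
        return False, "product code not set"
    try:
        entitlement = fetch_entitlement(product_code)  # step 2: call the API
    except Exception as exc:  # API communication errors
        return False, f"marketplace call failed: {exc}"
    if not entitlement or not entitlement.get("active"):  # step 3: entitlement
        return False, "no entitlement found"
    now = now or datetime.now(timezone.utc)
    if entitlement["expires_at"] <= now:  # step 4: expiration date
        return False, "license expired"
    return True, "license valid"  # step 6: return the result


def fake_fetch(product_code: str) -> dict:
    # Stand-in for the real Marketplace call.
    return {"active": True, "expires_at": datetime(2030, 1, 1, tzinfo=timezone.utc)}


os.environ["PRODUCT_CODE"] = "example-code"
status = validate_license(fake_fetch, now=datetime(2025, 1, 31, tzinfo=timezone.utc))
```

Returning a reason alongside the boolean makes it straightforward to map each failure to one of the error cases listed for this component.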

Error Handling:

  • Invalid product code

  • Expired license

  • No entitlement found

  • API communication errors

4.5 HTML Generator (utils/html_generator.py)

Responsibilities:

  • Generate HTML provisioning reports (template preview)

  • Generate HTML deployment reports (post-deployment summary)

  • Format CloudFormation templates for display

  • Create visual network topology diagrams

  • Include metadata (timestamps, user, region, VPC details)

Report Types:

Provisioning Report (create-template action):

  • Configuration summary

  • VPC details (name, CIDR, region)

  • Subnet layout (public, private, database)

  • Gateway configuration

  • CloudFormation template preview (formatted YAML)

  • Metadata (timestamp, user, action)

Deployment Report (create-vpc action):

  • All provisioning report content

  • CloudFormation stack details (stack ID, status)

  • Resource IDs (VPC ID, subnet IDs, gateway IDs)

  • Deployment timeline

  • Success/failure status

HTML Template Structure:

<!DOCTYPE html>
<html>
<head>
    <title>VPC Provisioning Report</title>
    <style>/* Professional CSS styling */</style>
</head>
<body>
    <header>
        <h1>VPC Provisioning Report</h1>
        <p>Generated: {timestamp}</p>
    </header>
    
    <section id="summary">
        <h2>Configuration Summary</h2>
        <!-- VPC details, subnets, gateways -->
    </section>
    
    <section id="template">
        <h2>CloudFormation Template</h2>
        <pre><code><!-- Formatted YAML --></code></pre>
    </section>
    
    <section id="resources">
        <h2>Deployed Resources</h2>
        <!-- Resource IDs, ARNs -->
    </section>
    
    <footer>
        <p>VPC Provisioner v1.0.0</p>
    </footer>
</body>
</html>
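One way such a skeleton could be filled is with the standard library's `string.Template` plus `html.escape`. The sketch below uses a trimmed version of the structure above; `render_report` and the `$timestamp` / `$template_yaml` placeholders are assumptions for illustration, not the generator's actual interface.

```python
import html
from string import Template

# Trimmed-down version of the report skeleton above.
REPORT_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><title>VPC Provisioning Report</title></head>
<body>
    <header>
        <h1>VPC Provisioning Report</h1>
        <p>Generated: $timestamp</p>
    </header>
    <section id="template">
        <h2>CloudFormation Template</h2>
        <pre><code>$template_yaml</code></pre>
    </section>
</body>
</html>
""")


def render_report(timestamp: str, template_yaml: str) -> str:
    """Fill the skeleton, escaping values so that characters such as
    '<' or '&' inside the YAML cannot break the surrounding markup."""
    return REPORT_TEMPLATE.substitute(
        timestamp=html.escape(timestamp),
        template_yaml=html.escape(template_yaml),
    )
```

Escaping matters here because the template preview embeds user-influenced text (names, tags, descriptions) directly into the page.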

5. Data Architecture

5.1 Data Model

Storage Strategy: Configuration-driven, no persistent database required

Data Sources:

  1. YAML Configuration Files: Client-provided network topology definitions

  2. JSON Schema Files: Configuration validation rules

  3. Application Config: Tool-level settings (logging, timeouts, defaults)

  4. AWS State: CloudFormation stack state (managed by AWS)

Configuration Structure:

# Client identification
client:
  company_name: "Axon Tech Labs"
  company_prefix: "edge"
  account_id: "123456789012"
  tenant_id: "a001"

# Environment and region
environment:
  env: "prod"
  region: "us-west-2"
  availability_zones:
    - "us-west-2a"
    - "us-west-2b"

# VPC configuration
vpc:
  cidr_block: "10.0.0.0/16"
  enable_dns_hostnames: true
  enable_dns_support: true
  
  # Subnet configuration
  subnets:
    public:
      - cidr: "10.0.1.0/24"
        az: "us-west-2a"
        name: "public-subnet-1"
      - cidr: "10.0.2.0/24"
        az: "us-west-2b"
        name: "public-subnet-2"
    
    private:
      - cidr: "10.0.11.0/24"
        az: "us-west-2a"
        name: "private-subnet-1"
      - cidr: "10.0.12.0/24"
        az: "us-west-2b"
        name: "private-subnet-2"
    
    database:
      - cidr: "10.0.21.0/26"
        az: "us-west-2a"
        name: "database-subnet-1"
      - cidr: "10.0.22.0/26"
        az: "us-west-2b"
        name: "database-subnet-2"
  
  # Gateway configuration
  internet_gateway:
    enabled: true
  
  nat_gateway:
    enabled: true
    high_availability: false

# Resource tagging
tags:
  cost_center: "Engineering"
  project: "Infrastructure"
  owner: "devops-team"
  environment: "prod"
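A configuration like this lends itself to a quick consistency check: every subnet should fall inside the VPC CIDR and no two subnets should overlap. The sketch below does that with the standard `ipaddress` module. `check_subnet_layout` is a hypothetical helper, not the tool's validator, and it takes a simplified mapping of tier to CIDR strings instead of the full config.

```python
import ipaddress
from typing import Dict, List


def check_subnet_layout(vpc_cidr: str, subnets: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the layout is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    problems: List[str] = []
    seen = []  # (tier, network) pairs already visited
    for tier, cidrs in subnets.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            if not net.subnet_of(vpc):
                problems.append(f"{tier} subnet {cidr} is outside VPC {vpc_cidr}")
            for other_tier, other in seen:
                if net.overlaps(other):
                    problems.append(
                        f"{tier} subnet {cidr} overlaps {other_tier} subnet {other}"
                    )
            seen.append((tier, net))
    return problems


# CIDR blocks from the example configuration above.
layout = {
    "public": ["10.0.1.0/24", "10.0.2.0/24"],
    "private": ["10.0.11.0/24", "10.0.12.0/24"],
    "database": ["10.0.21.0/26", "10.0.22.0/26"],
}
print(check_subnet_layout("10.0.0.0/16", layout))  # prints []
```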

5.2 Data Flow

Provisioning Flow:

1. User provides YAML config
   ↓
2. ConfigLoader reads and parses YAML
   ↓
3. Schema validation against JSON schema
   ↓
4. Configuration flattened to attributes
   ↓
5. VPCManager generates CloudFormation template
   ↓
6. Template saved to file system
   ↓
7. boto3 creates CloudFormation stack
   ↓
8. CloudFormation provisions AWS resources
   ↓
9. VPCManager waits for stack completion
   ↓
10. HTML report generated with resource IDs
    ↓
11. Artifacts saved (logs, templates, policies, reports)

Data Transformations:

  • YAML → Python dict (PyYAML)

  • Python dict → Validated config (jsonschema)

  • Validated config → CloudFormation template (Jinja2/dict)

  • CloudFormation template → AWS resources (CloudFormation)

  • AWS resources → HTML report (html_generator)
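The "validated config to CloudFormation template" step can be pictured as a plain dictionary transformation. The following is a minimal sketch under stated assumptions: `build_template` and the logical ID scheme (`PublicSubnet1`, `DatabaseSubnet1`, ...) are illustrative, only the VPC and subnets are emitted, and the input is assumed to have passed schema validation already.

```python
import json
from typing import Any, Dict


def build_template(config: Dict[str, Any]) -> Dict[str, Any]:
    """Turn an already-validated config dict into a minimal template dict."""
    vpc_cfg = config["vpc"]
    resources: Dict[str, Any] = {
        "VPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": vpc_cfg["cidr_block"],
                "EnableDnsHostnames": vpc_cfg["enable_dns_hostnames"],
                "EnableDnsSupport": vpc_cfg["enable_dns_support"],
            },
        }
    }
    for tier, subnets in vpc_cfg["subnets"].items():
        for index, subnet in enumerate(subnets, start=1):
            logical_id = f"{tier.capitalize()}Subnet{index}"  # e.g. PublicSubnet1
            resources[logical_id] = {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "VPC"},
                    "CidrBlock": subnet["cidr"],
                    "AvailabilityZone": subnet["az"],
                },
            }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


config = {
    "vpc": {
        "cidr_block": "10.0.0.0/16",
        "enable_dns_hostnames": True,
        "enable_dns_support": True,
        "subnets": {
            "public": [{"cidr": "10.0.1.0/24", "az": "us-west-2a"}],
            "database": [{"cidr": "10.0.21.0/26", "az": "us-west-2a"}],
        },
    }
}
print(json.dumps(build_template(config), indent=2))
```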

5.3 Artifact Management

Artifact Types:

  1. Logs: Timestamped operation logs

  2. Templates: CloudFormation YAML templates

  3. Policies: IAM policy JSON documents

  4. Reports: HTML deployment reports

Naming Conventions:

  • Logs: {vpc_name}-{action}-{timestamp}.log

  • Templates: {vpc_name}-template.yaml

  • Policies: {vpc_name}-iam-policy.json

  • Reports: {vpc_name}-{action}-{timestamp}.html

Storage Locations:

  • Logs: reports/ directory

  • Templates: templates/ directory

  • Policies: policies/ directory

  • Reports: reports/ directory

Retention:

  • Logs: Retained indefinitely (user manages cleanup)

  • Templates: Retained for reuse and audit

  • Policies: Retained for IAM configuration

  • Reports: Retained for documentation and compliance
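The naming conventions and storage locations above can be combined into one lookup. This is a sketch only: `artifact_path` is a hypothetical helper, and the timestamp format is an assumption because the document does not specify one.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Directory and file-name pattern per artifact type, as listed above.
ARTIFACT_LAYOUT = {
    "log": ("reports", "{vpc_name}-{action}-{timestamp}.log"),
    "template": ("templates", "{vpc_name}-template.yaml"),
    "policy": ("policies", "{vpc_name}-iam-policy.json"),
    "report": ("reports", "{vpc_name}-{action}-{timestamp}.html"),
}


def artifact_path(
    kind: str, vpc_name: str, action: str = "", when: Optional[datetime] = None
) -> Path:
    """Build the relative path for one artifact."""
    directory, pattern = ARTIFACT_LAYOUT[kind]
    # Assumed timestamp format; adjust to whatever the tool actually emits.
    timestamp = (when or datetime.now()).strftime("%Y%m%d-%H%M%S")
    name = pattern.format(vpc_name=vpc_name, action=action, timestamp=timestamp)
    return Path(directory) / name
```

Keeping the layout in a single table makes it harder for the log, template, policy, and report writers to drift apart.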


6. Security Architecture

6.1 Security Principles

Defense in Depth:

  • Multiple layers of security controls

  • Network segmentation (public, private, database subnets)

  • IAM least privilege

  • Container isolation

  • Code protection (Cython compilation)

Least Privilege:

  • Minimal IAM permissions for VPC operations

  • Non-root container execution

  • Read-only credential mounts

  • Scoped AWS API access

Security by Design:

  • Private subnets by default for compute workloads

  • No direct internet access for database tier

  • Controlled outbound access via NAT Gateway

  • Mandatory resource tagging for governance

6.2 Application Security

Code Protection:

  • Core modules compiled to .so files (Cython)

  • Compiled binaries make reverse engineering harder (a deterrent, not a guarantee)

  • Intellectual property protection

  • Tamper resistance

Credential Management:

  • No hardcoded credentials in code or containers

  • AWS credentials mounted read-only at runtime

  • Support for IAM roles (EC2, ECS, Lambda)

  • Support for AWS SSO and temporary credentials

Input Validation:

  • JSON schema validation for all configurations

  • CIDR block validation

  • Availability zone validation

  • Tag key/value validation

  • Path traversal prevention
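Of the checks listed, path traversal prevention is the easiest to get subtly wrong. A common approach, shown here as a sketch and not as the tool's actual implementation, is to resolve the requested path and confirm it still lies under the allowed base directory. `resolve_inside` is a hypothetical name, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining an absolute user_path discards base, so the check below
    # also rejects inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path}")
    return candidate
```

Resolving before comparing is the important part: a plain string-prefix test would accept inputs such as `configs/../../secrets.yaml`.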

Dependency Security:

  • Pinned dependency versions (uv.lock)

  • Regular security scanning (Trivy)

  • Minimal base image (python:3.13-slim)

  • No unnecessary packages

6.3 Container Security

Non-Root Execution:

RUN useradd -m -u 1000 vpcuser
USER vpcuser

Read-Only Mounts:

-v ~/.aws:/home/vpcuser/.aws:ro  # Read-only credentials
-v ./configs:/app/configs:ro      # Read-only configurations

Health Checks:

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD python -c "import sys; sys.exit(0)" || exit 1

OCI Metadata:

LABEL org.opencontainers.image.title="VPC Provisioner"
LABEL org.opencontainers.image.vendor="Axon Tech Labs"
LABEL org.opencontainers.image.version="1.0.0"

Security Scanning:

  • Trivy vulnerability scanning before deployment

  • Automated scanning in CI/CD pipeline

  • Known vulnerabilities documented (CVE-2025-69720, CVE-2026-29111 in base image)

6.4 Network Security

Three-Tier Architecture:

Public Subnets (DMZ):

  • Internet-facing resources only (load balancers, bastion hosts)

  • Route to Internet Gateway (0.0.0.0/0 → IGW)

  • Network ACLs restrict inbound traffic

  • Security groups enforce least privilege

Private Subnets (Application Tier):

  • No direct internet access

  • Outbound internet via NAT Gateway

  • No inbound internet access

  • Inbound connections from within the VPC only

Database Subnets (Data Tier):

  • Completely isolated from internet

  • No inbound or outbound internet access

  • Access only from private subnets

  • Smallest CIDR blocks (/26)

Route Table Isolation:

Public Route Table:
  0.0.0.0/0 → Internet Gateway
  10.0.0.0/16 → local

Private Route Table:
  0.0.0.0/0 → NAT Gateway
  10.0.0.0/16 → local

Database Route Table:
  10.0.0.0/16 → local (no internet route)
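The effect of these three tables can be checked with a small model. VPC routing picks the most specific matching route (longest prefix), so the sketch below does the same over the tables above. `ROUTE_TABLES` and `next_hop` are illustrative names, and the target labels are plain strings standing in for real gateway IDs.

```python
import ipaddress
from typing import Optional

ROUTE_TABLES = {
    "public": {"0.0.0.0/0": "internet-gateway", "10.0.0.0/16": "local"},
    "private": {"0.0.0.0/0": "nat-gateway", "10.0.0.0/16": "local"},
    "database": {"10.0.0.0/16": "local"},
}


def next_hop(tier: str, destination_ip: str) -> Optional[str]:
    """Longest-prefix match over a tier's routes; None means no route exists."""
    address = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in ROUTE_TABLES[tier].items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The most specific route (largest prefix length) wins.
    return max(matches)[1]


print(next_hop("private", "8.8.8.8"))   # prints nat-gateway
print(next_hop("database", "8.8.8.8"))  # prints None
```

The database tier returning `None` for an internet address is the isolation property the route tables are meant to provide.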

6.5 IAM Security

Least Privilege Policies:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateVpc",
        "ec2:DescribeVpcs",
        "ec2:CreateSubnet",
        "ec2:DescribeSubnets",
        "ec2:CreateInternetGateway",
        "ec2:AttachInternetGateway",
        "ec2:CreateNatGateway",
        "ec2:CreateRouteTable",
        "ec2:CreateRoute",
        "ec2:AssociateRouteTable",
        "ec2:CreateTags"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "us-west-2"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudformation:CreateStack",
        "cloudformation:DescribeStacks",
        "cloudformation:DeleteStack",
        "cloudformation:DescribeStackEvents"
      ],
      "Resource": "arn:aws:cloudformation:us-west-2:123456789012:stack/edge-prod-a001-us-west-2-vpc-stack/*"
    }
  ]
}

Resource Scoping:

  • Region-specific permissions

  • Stack name pattern matching

  • Tag-based access control

  • Time-based access (optional)
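Region and stack-name scoping can be generated from the same values that appear in the configuration. The sketch below rebuilds the CloudFormation statement from the policy above. `stack_name_for` and `build_stack_statement` are hypothetical helpers, and the stack naming pattern is inferred from the example ARN, so treat it as an assumption.

```python
import json


def stack_name_for(prefix: str, env: str, tenant_id: str, region: str) -> str:
    # Naming pattern inferred from the example ARN above (an assumption).
    return f"{prefix}-{env}-{tenant_id}-{region}-vpc-stack"


def build_stack_statement(region: str, account_id: str, stack_name: str) -> dict:
    """CloudFormation statement scoped to a single stack in a single region."""
    return {
        "Effect": "Allow",
        "Action": [
            "cloudformation:CreateStack",
            "cloudformation:DescribeStacks",
            "cloudformation:DeleteStack",
            "cloudformation:DescribeStackEvents",
        ],
        "Resource": (
            f"arn:aws:cloudformation:{region}:{account_id}:stack/{stack_name}/*"
        ),
    }


name = stack_name_for("edge", "prod", "a001", "us-west-2")
statement = build_stack_statement("us-west-2", "123456789012", name)
print(json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2))
```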

6.6 Compliance & Audit

Audit Logging:

  • All operations logged with timestamps

  • User identity captured (AWS IAM principal)

  • Configuration changes tracked

  • CloudFormation events logged

Compliance Support:

  • SOC 2: Audit trails, access controls, encryption

  • ISO 27001: Security controls, risk management

  • HIPAA: Network isolation, encryption, audit logs

  • PCI-DSS: Network segmentation, access controls

Resource Tagging:

tags:
  cost_center: "Engineering"
  project: "Infrastructure"
  owner: "devops-team"
  environment: "prod"
  compliance: "sox"
  data_classification: "internal"
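CloudFormation resources take tags as a list of `Key`/`Value` pairs, not a flat mapping, so a tag block like this needs a small conversion before it reaches the template. The sketch below shows one way to do it, along with a check for required keys. Both helper names and the `REQUIRED_TAGS` set are assumptions for illustration; the document does not list which tags are mandatory.

```python
from typing import Dict, List, Set

# Assumed mandatory set, chosen for illustration only.
REQUIRED_TAGS = {"cost_center", "project", "owner", "environment"}


def to_cfn_tags(tags: Dict[str, object]) -> List[Dict[str, str]]:
    """Convert a flat tag mapping into CloudFormation's Key/Value list."""
    return [{"Key": key, "Value": str(value)} for key, value in sorted(tags.items())]


def missing_tags(tags: Dict[str, object]) -> Set[str]:
    """Return the required tag keys that are absent from the mapping."""
    return REQUIRED_TAGS - set(tags)
```

Sorting the keys keeps the generated template stable between runs, which makes template diffs easier to review.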

6.7 Known Vulnerabilities

CVE-2025-69720 (ncurses buffer overflow):

  • Severity: HIGH

  • Component: ncurses in python:3.13-slim base image (Debian 13.4)

  • Status: Upstream vulnerability, no fix available

  • Impact: Buffer overflow that may lead to arbitrary code execution

  • Mitigation: Using official Docker image, monitoring for updates

CVE-2026-29111 (systemd arbitrary code execution):

  • Severity: HIGH

  • Component: systemd in python:3.13-slim base image (Debian 13.4)

  • Status: Upstream vulnerability, no fix available

  • Impact: Arbitrary code execution or DoS via spurious IPC

  • Mitigation: Using official Docker image, monitoring for updates

Documentation: See SECURITY.md for full details


7. Deployment Architecture

7.1 Deployment Models

Local Execution (Development):

# Direct Python execution
python -m vpc_provisioner.cli \
  --config configs/edge-prod-a001-us-west-2-vpc.yaml \
  --action create-vpc

Docker Execution (Production):

# Docker container execution
docker run --rm \
  -v ~/.aws:/home/vpcuser/.aws:ro \
  -v $(pwd)/configs:/app/configs:ro \
  -v $(pwd)/reports:/app/reports \
  vpc-provisioner:latest \
  --config edge-prod-a001-us-west-2-vpc.yaml \
  --action create-vpc

CI/CD Pipeline (Automation):

# GitLab CI example
deploy-vpc:
  stage: deploy
  image: vpc-provisioner:latest
  script:
    - vpc-provisioner --config $CONFIG_FILE --action create-vpc
  only:
    - main

7.2 Docker Architecture

Multi-Stage Build:

# Stage 1: Builder (dependency installation)
FROM python:3.13-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /build
COPY pyproject.toml uv.lock ./
RUN uv sync --no-install-project

# Stage 2: Runtime (minimal image)
FROM python:3.13-slim
RUN useradd -m -u 1000 vpcuser
WORKDIR /app
COPY --from=builder /build/.venv/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY src/vpc_provisioner ./vpc_provisioner
COPY templates ./templates
COPY schemas ./schemas
USER vpcuser
ENTRYPOINT ["/app/entrypoint.sh"]

Benefits:

  • Smaller final image (no build tools)

  • Faster deployment (cached layers)

  • Improved security (minimal attack surface)

7.3 Infrastructure Requirements

Host Requirements:

  • Docker 20.10+ or compatible runtime

  • AWS credentials configured (IAM role, access keys, or SSO)

  • Network access to AWS API endpoints (HTTPS/443)

  • Sufficient disk space for logs and reports

AWS Requirements:

  • AWS account with VPC service enabled

  • IAM permissions for VPC and CloudFormation operations

  • Service quotas: VPCs (5 per region default), Subnets (200 per VPC)

  • Elastic IPs for NAT Gateways (one per NAT Gateway; one per AZ when HA is enabled)

Network Requirements:

  • Outbound HTTPS (443) to AWS API endpoints

  • No inbound connections required

  • Proxy support (via AWS SDK environment variables)

7.4 Scalability

Horizontal Scaling:

  • Multiple VPCs across regions (parallel execution)

  • Multiple environments per client (dev, staging, prod)

  • Multiple clients (configuration-driven)

Vertical Scaling:

  • Larger CIDR blocks (more IP addresses)

  • Additional subnets within VPC

  • More availability zones

  • VPC endpoints for AWS services

Limits:

  • AWS VPC service limits (5 VPCs per region default, can be increased)

  • CloudFormation stack limits (500 resources per stack)

  • Subnet limits (200 per VPC)

  • Route table limits (200 per VPC)

7.5 High Availability

Application HA:

  • Stateless application (no local state)

  • Idempotent operations (safe to retry)

  • CloudFormation handles resource state

Infrastructure HA:

  • Multi-AZ subnet distribution

  • Optional HA NAT Gateway (NAT per AZ)

  • Internet Gateway (AWS-managed, HA by default)

  • CloudFormation (AWS-managed, HA by default)

Failure Scenarios:

  • Single AZ failure: Resources in other AZs remain available

  • NAT Gateway failure: HA mode provides redundancy, standard mode requires manual intervention

  • CloudFormation failure: Automatic rollback (a failed stack creation removes the resources it had created)

  • Application failure: Retry operation (idempotent)
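The difference between standard and HA NAT modes comes down to which NAT Gateway each AZ's private subnets route through. The sketch below models that mapping. `nat_gateway_plan` is a hypothetical helper, and placing the single standard-mode gateway in the first listed AZ is an assumption, not documented behavior.

```python
from typing import Dict, List


def nat_gateway_plan(availability_zones: List[str], high_availability: bool) -> Dict[str, str]:
    """Map each AZ to the AZ hosting the NAT Gateway its private subnets use."""
    if not availability_zones:
        raise ValueError("at least one availability zone is required")
    if high_availability:
        # One NAT Gateway per AZ: losing an AZ only affects that AZ.
        return {az: az for az in availability_zones}
    # Single shared NAT Gateway (assumed to live in the first AZ):
    # losing that AZ cuts outbound access for every private subnet.
    first = availability_zones[0]
    return {az: first for az in availability_zones}


plan = nat_gateway_plan(["us-west-2a", "us-west-2b"], high_availability=False)
gateways_needed = len(set(plan.values()))  # also the number of Elastic IPs
print(plan, gateways_needed)
```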

7.6 Disaster Recovery

Recovery Time Objective (RTO): < 10 minutes

  • Redeploy VPC from configuration file

  • CloudFormation recreates all resources

  • No data loss (infrastructure is stateless)

Recovery Point Objective (RPO): 0 (no data loss)

  • Configuration files version-controlled in Git

  • CloudFormation templates stored in S3 (optional)

  • Audit logs retained for compliance

Backup Strategy:

  • Configuration files: Git repository

  • CloudFormation templates: File system + S3

  • Audit logs: File system + CloudWatch Logs

  • No application state to backup

Recovery Procedure:

  1. Retrieve configuration from Git

  2. Execute create-vpc action

  3. CloudFormation recreates all resources

  4. Verify VPC and subnet IDs

  5. Update application configurations with new IDs


8. Quality Attributes

8.1 Performance

Provisioning Time:

  • VPC creation: 3-5 minutes (typical)

  • Template generation: < 5 seconds

  • Configuration validation: < 1 second

  • IAM policy generation: < 1 second

Factors Affecting Performance:

  • Number of subnets (more subnets = longer provisioning)

  • NAT Gateway configuration (HA mode adds 1-2 minutes)

  • AWS API rate limits (throttling)

  • CloudFormation stack size

Optimization Strategies:

  • Parallel resource creation (CloudFormation handles)

  • Efficient waiter configuration (10s polling interval)

  • Minimal API calls (CloudFormation batch operations)

  • Cython compilation for faster execution

Performance Monitoring:

@timed_operation
def create_vpc():
    # Execution time automatically logged
    pass
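The snippet above uses `@timed_operation` without showing it. A decorator of that kind might look like the following sketch, which is an assumption about its shape and not the project's actual code. The logger name is illustrative.

```python
import functools
import logging
import time

logger = logging.getLogger("vpc_provisioner.timing")  # illustrative logger name


def timed_operation(func):
    """Log how long the wrapped function took, whether it returned or raised."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on both success and failure, so failed operations
            # still get a duration in the log.
            elapsed = time.perf_counter() - start
            logger.info("%s finished in %.3fs", func.__name__, elapsed)
    return wrapper
```

`time.perf_counter` is used instead of `time.time` because it is monotonic and unaffected by system clock adjustments.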

8.2 Reliability

Success Rate: 99.9% (target)

Error Handling:

  • Configuration validation before deployment

  • CloudFormation automatic rollback on failure

  • Retry logic for transient AWS API errors

  • Comprehensive error messages

Idempotency:

  • CloudFormation ensures idempotent operations

  • Re-running with same configuration converges to desired state

  • No duplicate resource creation

Failure Modes:

  1. Configuration Error: Caught during validation, no AWS resources created

  2. AWS API Error: Boto3 retries with exponential backoff

  3. CloudFormation Failure: Automatic rollback (resources from a failed stack creation are removed)

  4. Timeout: Waiter configuration prevents indefinite waiting

Recovery:

  • Failed stacks automatically rolled back

  • Retry operation after fixing configuration

  • Manual cleanup if rollback fails (rare)
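The recovery guidance above can be tied to the stack status that CloudFormation reports. The status strings in this sketch are real CloudFormation stack statuses; the `next_step` function and its action labels are hypothetical. Note that a stack left in `ROLLBACK_COMPLETE` after a failed creation cannot be updated and has to be deleted before the same stack name is used again.

```python
def next_step(stack_status: str) -> str:
    """Suggest what to do for a stack status seen during creation."""
    if stack_status == "CREATE_COMPLETE":
        return "done"
    if stack_status.endswith("_IN_PROGRESS"):
        return "wait"
    if stack_status == "ROLLBACK_COMPLETE":
        # Creation failed and was rolled back; delete the stack, fix the
        # configuration, then retry.
        return "delete-then-retry"
    if stack_status in {"CREATE_FAILED", "ROLLBACK_FAILED", "DELETE_FAILED"}:
        # Rollback or deletion could not finish; resources may remain.
        return "manual-cleanup"
    return "investigate"
```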

8.3 Maintainability

Code Organization:

  • Modular design (config, core, license, utils)

  • Shared utilities in common library

  • Clear separation of concerns

  • Consistent naming conventions

Documentation:

  • Comprehensive user guide

  • Architecture documentation (this document)

  • API documentation (docstrings)

  • Configuration examples

  • Troubleshooting guide

Testing:

  • Unit tests for core logic

  • Integration tests for AWS operations

  • Configuration validation tests

  • End-to-end deployment tests

Versioning:

  • Semantic versioning (MAJOR.MINOR.PATCH)

  • Git tags for releases

  • Changelog documentation

  • Backward compatibility policy

8.4 Usability

Ease of Use:

  • Simple CLI interface (--config, --action)

  • Clear error messages

  • Progress feedback during operations

  • HTML reports for documentation

Configuration:

  • Human-readable YAML format

  • Comprehensive examples

  • Schema validation with helpful errors

  • Sensible defaults

Documentation:

  • Quick start guide

  • Step-by-step tutorials

  • Configuration reference

  • Troubleshooting guide

  • FAQ

User Experience:

# Simple command
vpc-provisioner --config my-vpc.yaml --action create-vpc

# Clear output
✓ Configuration validated
✓ CloudFormation template generated
✓ Stack creation initiated
⏳ Waiting for stack creation (this may take 3-5 minutes)
✓ VPC created successfully
✓ Report generated: reports/edge-prod-a001-us-west-2-vpc-deployment-20250131.html

8.5 Portability

Platform Support:

  • Linux (primary): Ubuntu, RHEL, Amazon Linux

  • macOS: Development and testing

  • Windows: Via WSL2 or Docker

Container Portability:

  • Docker 20.10+ on any platform

  • Kubernetes (via Docker image)

  • AWS ECS/Fargate

  • AWS Lambda (with container support)

Cloud Portability:

  • AWS-specific (VPC is AWS service)

  • Multi-region support within AWS

  • Multi-account support

8.6 Extensibility

Future Enhancements:

  • VPC peering support

  • Transit Gateway integration

  • VPC endpoints for AWS services

  • Custom route table rules

  • Network ACL customization

  • VPC Flow Logs configuration

  • Additional subnet tiers

Plugin Architecture (Future):

  • Custom validators

  • Custom template generators

  • Custom report formats

  • Integration with external tools

API Extensibility:

  • Shared utilities for reuse

  • Common patterns for new provisioners

  • Extensible configuration schema


9. Integration Architecture

9.1 AWS Service Integration

AWS CloudFormation:

  • Purpose: Infrastructure as Code engine

  • Integration: boto3 CloudFormation client

  • Operations: CreateStack, DeleteStack, DescribeStacks, DescribeStackEvents

  • Authentication: AWS credentials (IAM role, access keys, SSO)

Amazon VPC:

  • Purpose: Virtual Private Cloud service

  • Integration: Via CloudFormation (declarative)

  • Resources: VPC, Subnets, Route Tables, Internet Gateway, NAT Gateway

  • Validation: DescribeVpcs for verification

AWS IAM:

  • Purpose: Identity and access management

  • Integration: boto3 IAM client (for policy generation)

  • Operations: GetUser, GetRole (for validation)

  • Permissions: Least privilege policies

AWS Marketplace:

  • Purpose: License validation

  • Integration: boto3 License Manager client

  • Operations: CheckoutLicense, CheckInLicense

  • Authentication: Product SKU with key fingerprint verification

AWS CloudWatch:

  • Purpose: Logging and monitoring

  • Integration: boto3 CloudWatch Logs client (optional)

  • Operations: PutLogEvents, CreateLogGroup

  • Use Case: Centralized logging

9.2 External Tool Integration

Git:

  • Purpose: Version control for configurations

  • Integration: Manual (user manages Git repository)

  • Use Case: Configuration versioning and collaboration

CI/CD Pipelines:

  • Supported: GitLab CI, GitHub Actions, Jenkins, AWS CodePipeline

  • Integration: Docker image in pipeline

  • Use Case: Automated VPC provisioning

Terraform (Interoperability):

  • Integration: CloudFormation outputs can be imported to Terraform

  • Use Case: Hybrid infrastructure management

Ansible (Interoperability):

  • Integration: Ansible can invoke Docker container

  • Use Case: Orchestration of multiple provisioning tasks

9.3 API Integration Patterns

Synchronous Operations:

  • Configuration validation

  • Template generation

  • IAM policy generation

  • HTML report generation

Asynchronous Operations:

  • CloudFormation stack creation (waiter pattern)

  • CloudFormation stack deletion (waiter pattern)

Retry Logic:

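import boto3
from botocore.config import Config

# Boto3 automatic retry with exponential backoff; 'adaptive' mode
# additionally applies client-side rate limiting
config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'
    }
)
client = boto3.client('cloudformation', config=config)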

Error Handling:

from botocore.exceptions import ClientError

try:
    response = cfn_client.create_stack(...)
except ClientError as e:
    error_code = e.response['Error']['Code']
    if error_code == 'AlreadyExistsException':
        # Handle duplicate stack (placeholder)
        pass
    elif error_code == 'LimitExceededException':
        # Handle quota limit (placeholder)
        pass
    else:
        # Handle other errors: re-raise anything unrecognized
        raise

9.4 Data Exchange Formats

Input Formats:

  • YAML: Configuration files

  • JSON: Schema validation rules

  • Environment Variables: AWS credentials, product code

Output Formats:

  • YAML: CloudFormation templates

  • JSON: IAM policies

  • HTML: Deployment reports

  • Plain Text: Log files

API Formats:

  • JSON: AWS API requests/responses (boto3)

  • JSON: CloudFormation template format (optional alternative to YAML; CloudFormation does not accept XML templates)


10. Operational Architecture

10.1 Monitoring & Observability

Application Logging:

# Structured logging
logger.info("VPC creation initiated", extra={
    'vpc_name': vpc_name,
    'region': region,
    'action': 'create-vpc'
})
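Fields passed through `extra` only show up in the output if the formatter emits them. One way to do that is a JSON formatter that copies every non-standard record attribute into the payload. This is a sketch of the idea; `JsonFormatter` is a hypothetical class, not necessarily what the tool uses.

```python
import json
import logging

# Attributes present on every LogRecord; anything else arrived via `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
    "message",
    "asctime",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        payload.update(
            {k: v for k, v in vars(record).items() if k not in _STANDARD_ATTRS}
        )
        # default=str keeps non-JSON types (paths, datetimes) from raising.
        return json.dumps(payload, default=str)
```

Attaching it is the usual `handler.setFormatter(JsonFormatter())`; each log line then carries `vpc_name`, `region`, and `action` as separate, searchable fields.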

Log Levels:

  • DEBUG: Detailed diagnostic information

  • INFO: General informational messages

  • WARNING: Warning messages (non-critical issues)

  • ERROR: Error messages (operation failures)

Log Destinations:

  • File system: reports/{vpc_name}-{action}-{timestamp}.log

  • Console: Real-time feedback to user

  • CloudWatch Logs: Centralized logging (optional)

Metrics (Future):

  • Provisioning success rate

  • Average provisioning time

  • Error rate by type

  • Resource count by region

Tracing (Future):

  • AWS X-Ray integration

  • Request ID tracking

  • End-to-end operation tracing

10.2 Alerting

CloudWatch Alarms (Optional):

  • High error rate

  • Slow provisioning time

  • VPC quota limits

  • NAT Gateway failures

Email Notifications:

  • Provisioning success/failure

  • License expiration warnings

  • Quota limit warnings

Slack/Teams Integration (Future):

  • Real-time notifications

  • Deployment status updates

10.3 Backup & Recovery

Configuration Backup:

  • Git repository (primary)

  • S3 bucket (optional)

  • Local file system

Template Backup:

  • File system: templates/ directory

  • S3 bucket (optional)

  • Git repository (optional)

Log Retention:

  • Local: Indefinite (user manages)

  • CloudWatch: 30 days (configurable)

  • S3: Lifecycle policies

Recovery Procedures:

  • See BACKUP_RECOVERY.md

10.4 Maintenance

Updates:

  • Application updates: Docker image tags

  • Dependency updates: uv lock file

  • Base image updates: Rebuild Docker image

Patching:

  • Security patches: Immediate rebuild and deployment

  • Bug fixes: Regular release cycle

  • Feature updates: Quarterly releases

Deprecation Policy:

  • 6-month notice for breaking changes

  • Migration guides provided

  • Backward compatibility maintained for 2 major versions

10.5 Support & Troubleshooting

Support Channels:

  • Documentation: Comprehensive guides

  • Email: support@axontechlabs.com

  • Issue Tracker: GitHub/GitLab issues

Troubleshooting Resources:

  • TROUBLESHOOTING.md

  • USER_GUIDE.md

  • IAM_PERMISSIONS.md

Common Issues:

  1. AWS credential errors → Check IAM permissions

  2. Configuration validation errors → Review schema

  3. CloudFormation failures → Check AWS quotas

  4. Timeout errors → Increase waiter max attempts

Diagnostic Commands:

# Validate configuration
vpc-provisioner --config my-vpc.yaml --action validate-config

# Check AWS credentials
aws sts get-caller-identity

# Check VPC quotas
aws service-quotas get-service-quota \
  --service-code vpc \
  --quota-code L-F678F1CE

11. Future Roadmap

11.1 Planned Features

Short Term (Q1 2025):

  • VPC Flow Logs configuration

  • VPC endpoints for AWS services (S3, DynamoDB)

  • Custom Network ACL rules

  • Enhanced HTML reports with network diagrams

Medium Term (Q2-Q3 2025):

  • VPC peering support

  • Transit Gateway integration

  • Multi-VPC deployments

  • Terraform output generation

Long Term (Q4 2025+):

  • AWS Control Tower integration

  • Service Catalog integration

  • Custom IPAM integration

  • Advanced network topology (hub-and-spoke)

11.2 Technical Debt

Current:

  • Limited unit test coverage (target: 80%)

  • Manual integration testing (automate in CI/CD)

  • Hardcoded waiter configuration (make configurable)

Planned Improvements:

  • Comprehensive test suite

  • Performance benchmarking

  • Load testing for concurrent operations

  • Enhanced error messages


12. Appendices

12.1 Glossary

  • VPC: Virtual Private Cloud - Isolated network environment in AWS

  • CIDR: Classless Inter-Domain Routing - IP address allocation method

  • NAT Gateway: Network Address Translation gateway for outbound internet access

  • IGW: Internet Gateway - Provides internet connectivity for VPC

  • AZ: Availability Zone - One or more isolated data centers within an AWS region

  • CloudFormation: AWS Infrastructure as Code service

  • IAM: Identity and Access Management - AWS authentication and authorization

  • Waiter: Polling mechanism for asynchronous AWS operations

12.2 References

AWS Documentation:

Internal Documentation:

  • USER_GUIDE.md - User documentation

  • SECURITY.md - Security policy and known vulnerabilities

  • TROUBLESHOOTING.md - Troubleshooting guide

  • IAM_PERMISSIONS.md - IAM permission requirements

12.3 Version History

Version   Date         Changes
1.0.0     2025-01-31   Initial production release
1.0.0     2025-01-31   Comprehensive architecture documentation


Document Metadata:

  • Author: VPC Provisioner Team

  • Last Updated: 2025-01-31

  • Next Review: 2025-04-30

  • Status: Production Ready

  • Classification: Internal Use