User Guide
Complete reference for all commands and features.
Prerequisites
Planning Your Deployment
Before installing tools or configuring credentials, understand the key decisions you'll need to make:
Bucket Naming: You can use auto-generated bucket names based on your organization's naming convention, or provide custom names. This affects how you identify and organize your S3 resources.
Deployment Pattern: You can deploy multiple ML solutions to a single shared bucket, or create separate buckets for each solution. This affects cost, management complexity, and access control.
Detailed guidance on these decisions is provided in the Bucket Naming and Deployment Strategy sections.
Docker Installation
This tool requires Docker to run. Verify Docker is installed:
docker --version
If Docker is installed, the output is a line similar to this: Docker version 29.2.1, build a5c7197
If you get a "command not found" error, Docker is likely not installed or not in your PATH.
If Docker is not installed, follow these steps:
Linux (Ubuntu/Debian, not WSL2):
# Update package index
sudo apt update
# Install prerequisites
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
# Add Docker repository
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Update package index again
sudo apt update
# Install Docker
sudo apt install -y docker-ce docker-ce-cli containerd.io
# Start and enable Docker
sudo systemctl start docker
sudo systemctl enable docker
# Add user to docker group (optional, to run without sudo)
sudo usermod -aG docker $USER
newgrp docker
macOS: Download and install Docker Desktop for Mac
Windows: Download and install Docker Desktop for Windows
Verify Docker installation:
docker run hello-world
Python Installation (Optional - for AWS CLI)
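Python 3.8+ is required only if you plan to install AWS CLI v1 with pip for credential configuration. The AWS CLI v2 installer bundles its own Python runtime and does not need Python on the host.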
Verify Python is installed:
python3 --version
Expected output:
Python 3.8.10 (or higher)
If Python is not installed:
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install -y python3 python3-pip
macOS:
# Python 3 comes pre-installed on macOS 10.15+
# Or install via Homebrew:
brew install python3
Windows:
Download and install from python.org
During installation, check "Add Python to PATH"
Note: The S3 Provisioner tool itself runs in Docker and does not require Python on your host system. Python is only needed if you want to use AWS CLI for credential management (Method 1 in the next section).
AWS Credentials Configuration
This tool requires AWS credentials to provision S3 buckets and CloudFormation stacks.
Verify AWS credentials are configured:
aws sts get-caller-identity
Expected output:
{
"UserId": "AIDAXXXXXXXXXXXXXXXXX",
"Account": "123456789012",
"Arn": "arn:aws:iam::123456789012:user/your-username"
}
If you get an error like "Unable to locate credentials", configure AWS using one of the following methods:
Method 1: AWS CLI (Recommended)
# Install AWS CLI if not already installed
# Linux (x86_64). On macOS, use the AWSCLIV2.pkg installer from AWS instead.
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure credentials
aws configure
# You'll be prompted for:
# AWS Access Key ID: [Your access key]
# AWS Secret Access Key: [Your secret key]
# Default region name: [e.g., us-west-1]
# Default output format: [json]
Method 2: Environment Variables
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-west-1"
Method 3: IAM Role (For EC2/ECS)
If running on AWS EC2 or ECS, attach an IAM role with appropriate permissions; no manual credential configuration is needed. For the detailed IAM permissions required, see the IAM Permissions section.
System Requirements
Disk Space:
Docker image: ~500 MB
Working directory: ~50 MB (for configs, templates, policies, reports)
Minimum recommended: 1 GB free space
Network Access:
Internet connection to pull Docker image from AWS Marketplace
AWS API endpoints accessibility
If behind corporate firewall/proxy, ensure Docker and AWS CLI can reach external endpoints
Operating System:
Linux: Ubuntu 20.04+, Debian 10+, RHEL 8+, or equivalent
macOS: 10.15+ (Catalina or later)
Windows: Windows 10/11 with WSL2 (for Docker Desktop integration)
Docker Version:
Docker Engine 20.10+ or Docker Desktop 4.0+
AWS CLI Version:
AWS CLI v2 recommended (v1 also supported)
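The Docker Engine 20.10+ requirement above can be checked from a script. The following is a minimal sketch, not part of the S3 Provisioner: the helper names `version_ge` and `parse_docker_version` are hypothetical, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extract "29.2.1" from a line such as "Docker version 29.2.1, build a5c7197".
parse_docker_version() {
  sed -E 's/^Docker version ([0-9]+(\.[0-9]+)*).*/\1/'
}

# Only run the check when the docker command is on PATH.
if command -v docker >/dev/null 2>&1; then
  installed="$(docker --version | parse_docker_version)"
  if version_ge "$installed" "20.10"; then
    echo "Docker $installed meets the 20.10+ requirement"
  else
    echo "Docker $installed is older than 20.10" >&2
  fi
fi
```

The same `version_ge` helper could be reused for other minimum-version checks in this section.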
IAM Permissions Check
Verify your AWS user/role has necessary permissions:
# Check current identity
aws sts get-caller-identity
# Test S3 permissions
aws s3 ls
# Test CloudFormation permissions
aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE
For detailed IAM permissions required, see IAM Permissions section.
Minimum required permissions:
S3: CreateBucket, PutBucketPolicy, PutBucketVersioning, PutLifecycleConfiguration
CloudFormation: CreateStack, UpdateStack, DeleteStack, DescribeStacks
IAM: PassRole (if using VPC endpoints)
AWS Account Limits
S3 Bucket Limit:
Default: 100 buckets per account
View current buckets:
aws s3 ls
Request increase via AWS Support if needed
CloudFormation Stack Limit:
Default: 200 stacks per region
View current stacks:
aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE
Request increase via AWS Support if needed
Bucket Name Availability:
S3 bucket names are globally unique across all AWS accounts
Tool will fail if bucket name already exists
Choose unique names or use auto-generated naming
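To see how much headroom remains under the bucket limit, the output of `aws s3 ls` can be counted. This is a rough sketch rather than a tool feature: `buckets_remaining` is a hypothetical helper, and the limit of 100 is simply the default quoted above, so pass your account's actual quota if it differs.

```shell
#!/bin/sh
# Read a bucket listing on stdin (one bucket per line, as printed by
# `aws s3 ls`) and print how many buckets remain under the limit in $1.
buckets_remaining() {
  limit="$1"
  # grep -c prints 0 and exits non-zero on empty input, hence `|| true`.
  used="$(grep -c . || true)"
  echo $(( limit - used ))
}

# Usage (requires configured AWS credentials):
#   aws s3 ls | buckets_remaining 100
```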
Docker Image Availability
The S3 Provisioner tool is distributed as a Docker container image through AWS Marketplace.
Pull the image from AWS Marketplace:
# Pull the latest version from AWS Marketplace container registry
docker pull public.ecr.aws/<marketplace-id>/s3-provisioner:latest
# Tag it locally for convenience
docker tag public.ecr.aws/<marketplace-id>/s3-provisioner:latest s3-provisioner:latest
Note: Replace <marketplace-id> with the actual registry path provided in your AWS Marketplace subscription details.
Verify the image is available:
docker images s3-provisioner
Expected output:
REPOSITORY TAG IMAGE ID CREATED SIZE
s3-provisioner latest abc123def456 2 days ago 150MB
Security Scanning (Optional)
You can scan the S3 Provisioner Docker image for security vulnerabilities using Trivy or any container scanning tool your organization uses.
Install Trivy:
# macOS
brew install trivy
# Linux (Debian/Ubuntu; requires the Trivy apt repository to be configured first)
sudo apt-get install trivy
Run scan:
trivy image s3-provisioner:latest
Filter by severity:
trivy image --severity HIGH,CRITICAL s3-provisioner:latest
Note: OS-level vulnerabilities in the base image (Debian) are common and typically have no fixed version available yet. Python package vulnerabilities are more actionable; report any findings to support.
Important: The Docker image is a pre-built, tested product. Do not attempt to rebuild or modify the image as this will void support and may introduce security vulnerabilities.
Step 1: Understand Bucket Naming
S3 bucket names must be globally unique across all AWS accounts and meet AWS naming requirements: 3 to 63 characters, lowercase letters, numbers, hyphens (-), or periods (.) only. Must start and end with a letter or number, cannot contain consecutive periods, and cannot be formatted as an IP address.
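The naming rules above can be checked locally before a name reaches AWS. The sketch below covers only the rules listed in this section (AWS documents further restrictions, such as reserved prefixes), and `is_valid_bucket_name` is a hypothetical helper, not a command provided by the S3 Provisioner.

```shell
#!/bin/sh
# Succeed when $1 satisfies the bucket naming rules listed in this guide.
is_valid_bucket_name() {
  name="$1"
  # Length must be 3 to 63 characters.
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || return 1
  # Lowercase letters, digits, hyphens, periods; letter or digit at both ends.
  printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]*[a-z0-9]$' || return 1
  # No consecutive periods.
  case "$name" in *..*) return 1 ;; esac
  # Must not look like an IPv4 address.
  if printf '%s\n' "$name" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'; then
    return 1
  fi
  return 0
}

# Usage:
#   is_valid_bucket_name "edge-prod-b001-us-west-1-s3" && echo "name looks valid"
```

Passing this check does not guarantee the name is available, since uniqueness can only be confirmed against AWS.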
You have two options for naming your buckets:
Option A: Auto-Generated Names
The tool constructs bucket names from your configuration:
{company_prefix}-{env}-{tenant_id}-{region}-s3
Example: edge-prod-b001-us-west-1-s3
| Field | Description | Example | Constraints |
|---|---|---|---|
| company_prefix | Company identifier | "edge" | Max 10 characters |
| env | Environment | "prod" | dev/staging/prod/test |
| tenant_id | AWS account identifier | "a001" | 1-4 alphanumeric characters |
| region | AWS region | "us-west-1" | Valid AWS region |
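The auto-generated name can be previewed before running the tool. The sketch below is an illustration only: `build_bucket_name` is a hypothetical helper, the trailing `-s3` is an assumption taken from the example name `edge-prod-b001-us-west-1-s3`, and the checks mirror the constraints listed for the fields above.

```shell
#!/bin/sh
# Print the auto-generated bucket name for prefix, env, tenant_id, region.
# Assumption: the tool appends "-s3", as the example name in this guide suggests.
build_bucket_name() {
  prefix="$1"; env="$2"; tenant="$3"; region="$4"
  if [ "${#prefix}" -gt 10 ]; then
    echo "company_prefix exceeds 10 characters" >&2
    return 1
  fi
  case "$env" in
    dev|staging|prod|test) ;;
    *) echo "unknown env: $env" >&2; return 1 ;;
  esac
  name="${prefix}-${env}-${tenant}-${region}-s3"
  if [ "${#name}" -gt 63 ]; then
    echo "bucket name exceeds 63 characters" >&2
    return 1
  fi
  printf '%s\n' "$name"
}

build_bucket_name edge prod b001 us-west-1
```

Run as written, the last line prints `edge-prod-b001-us-west-1-s3`, matching the example above.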
Option B: Custom Names
Provide your own bucket name using bucket_name_override in the configuration:
s3:
  bucket_name_override: "my-company-ml-prod-bucket"
When to use each option:
Auto-generated: Consistent naming across multiple buckets, easier to manage at scale
Custom names: Specific naming requirements, existing naming conventions, single bucket deployments
Decision Point: Will you use auto-generated names or provide custom names?
Step 2: Choose Deployment Strategy
You need to decide how to organize your ML solutions in S3 buckets. There are two main patterns:
Pattern B: Dedicated Buckets (One Solution Per Bucket)
Create a separate S3 bucket for each ML solution.
edge-prod-b001-us-west-1-s3-customer-churn/
├── data/
├── models/
├── notebooks/
└── ...
edge-prod-b001-us-west-1-s3-fraud-detection/
├── data/
├── models/
├── notebooks/
└── ...
edge-prod-b001-us-west-1-s3-recommendation/
├── data/
├── models/
├── notebooks/
└── ...
Advantages:
Strong isolation between solutions
Independent access controls per solution
Easier to manage solution-specific policies
Can delete entire solution cleanly
Better for compliance and security requirements
Disadvantages:
Higher cost (multiple buckets)
More buckets to manage and monitor
Can hit AWS bucket limits (default 100 per account)
More complex cross-solution resource sharing
Best for: Regulated industries, multi-tenant environments, solutions with different security requirements, independent solution lifecycles
Comparison Summary
| Factor | Pattern A (Shared) | Pattern B (Dedicated) |
|---|---|---|
| Cost | Lower | Higher |
| Management | Simpler | More complex |
| Isolation | Low | High |
| Access Control | Bucket-level | Solution-level |
| Scalability | High (many solutions) | Limited (bucket quota) |
| Security | Shared policies | Independent policies |
Decision Point: Will you use a shared bucket for multiple solutions (Pattern A) or dedicated buckets per solution (Pattern B)?
Quick Start
Pre-Deployment Checklist
Before deploying, ensure you have:
Docker 20.10+ installed and running (docker --version)
AWS credentials configured (aws sts get-caller-identity works)
AWS Marketplace subscription active for S3 Provisioner
IAM permissions verified (see IAM_PERMISSIONS.md)
Configuration file created and validated
Bucket names verified as globally unique (S3 bucket names are unique across all AWS accounts)
Reviewed generated CloudFormation template
Tested in dev/staging environment first
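The first two checklist items can be scripted. The following is a minimal sketch under the assumption that `docker` and `aws` are the only host commands needed; `require_cmds` and `preflight` are hypothetical helper names, not part of the S3 Provisioner.

```shell
#!/bin/sh
# Report every command from the argument list that is not on PATH.
# Succeeds only when all of them are found.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Run the host-side checks from the checklist; stops at the first failure.
preflight() {
  require_cmds docker aws || return 1
  docker --version || return 1
  aws sts get-caller-identity >/dev/null || return 1
  echo "preflight checks passed"
}

# Usage:
#   preflight
```

The remaining checklist items (subscription, IAM permissions, reviewed template) still need manual confirmation.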
Setup
This section provides two complete workflows based on your deployment strategy choice from Step 2. Both patterns can coexist for the same environment, tenant, and AWS region. Choose the workflow that matches your needs.
Note: This tool provisions S3 folder structures for your ML solutions. You will develop and deploy your actual ML solutions separately.
Setup (Required for Both Patterns)
Create the working directory structure and copy documentation from the Docker image.
cd ~
mkdir -p mlops-infra-suite/s3/{configs,policies,reports,templates}
cd mlops-infra-suite
mkdir -p s3/docs
# Create a temporary container
container_id=$(docker create s3-provisioner:latest)
# Copy docs from container to host
docker cp $container_id:/app/docs/. s3/docs/
# Remove the temporary container
docker rm $container_id
ls s3/docs/
The following directory tree is created and the documentation files are copied.
mlops-infra-suite
└── s3
    ├── configs
    ├── docs
    │   ├── _sources/
    │   ├── _static/
    │   ├── APPLICATION_ARCHITECTURE.html
    │   ├── BACKUP_RECOVERY.html
    │   ├── CONFIGURATION.html
    │   ├── GOVERNANCE_COMPLIANCE.html
    │   ├── IAM_PERMISSIONS.html
    │   ├── ML_LIFECYCLE_POLICIES.html
    │   ├── MONITORING_HEALTH_CHECKS.html
    │   ├── README.html
    │   ├── RELEASE_NOTES.html
    │   ├── ROADMAP.html
    │   ├── S3_FOLDERS.html
    │   ├── SECURITY.html
    │   ├── SECURITY_GUIDELINES.html
    │   ├── SUPPORT.html
    │   ├── TROUBLESHOOTING.html
    │   ├── UPDATE_PROCEDURES.html
    │   ├── USER_GUIDE.html
    │   ├── api_reference.html
    │   ├── genindex.html
    │   ├── index.html
    │   ├── objects.inv
    │   ├── py-modindex.html
    │   ├── search.html
    │   └── searchindex.js
    ├── policies
    ├── reports
    └── templates
Open s3/docs/index.html in your browser to view the documentation.
Pattern B: Dedicated Bucket Workflow
Provision a separate S3 bucket with folder structure for each ML solution.
1. Create Configuration for First Solution
Create s3/configs/edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml with custom bucket name including solution name.
client:
  company_name: Edge Corp
  company_prefix: edge
  account_id: "123456789012"
  tenant_id: "a001"
environment:
  env: prod
  region: us-west-1
s3:
  bucket_name_override: "edge-prod-b001-us-west-1-s3-customer-churn"
  versioning: true
  lifecycle_policy: ml-optimized
  vpc_id: ""
  route_table_ids: ""
  tags:
    Purpose: Customer Churn ML Solution
    ManagedBy: CloudFormation
2. Validate Configuration
Verify the configuration file is valid.
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action validate-config
3. Create IAM Policy
Generate the IAM policy JSON file for this bucket.
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-policy
4. Generate CloudFormation Template
Create the CloudFormation template for this solution's bucket.
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-prov-template \
--solution customer-churn
5. Provision Bucket and Solution Folder Structure
Deploy the dedicated bucket with the ML solution folder structure.
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-bucket \
--solution customer-churn \
--force
6. Repeat for Additional Solutions
For each additional ML solution, create a new configuration file with a different bucket name and repeat steps 1-5.
Example for fraud-detection solution:
# s3/configs/edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml
s3:
  bucket_name_override: "edge-prod-b001-us-west-1-s3-fraud-detection"
# ... rest of config
Then run steps 2-5 with --config edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml.
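Repeating the steps by hand for every solution is error-prone, so a small loop can help. The sketch below is an illustration, not part of the tool: `config_for_solution` and `validate_solution` are hypothetical helpers, the `DRY_RUN` variable is an assumption introduced here, and only the validate step is shown. With `DRY_RUN` unset or set to 1, the commands are printed instead of executed.

```shell
#!/bin/sh
# Config file name for a dedicated-bucket solution, following the
# edge-prod-b001-us-west-1-s3-<solution>-s3.yaml convention used above.
config_for_solution() {
  printf 'edge-prod-b001-us-west-1-s3-%s-s3.yaml\n' "$1"
}

# Print (DRY_RUN=1, the default) or run the validate step for one solution.
validate_solution() {
  config="$(config_for_solution "$1")"
  # Reuse the positional parameters to hold the full command line.
  set -- docker run --rm \
    -v "$HOME/.aws:/home/s3user/.aws:ro" \
    -v "$(pwd)/s3/configs:/app/configs:ro" \
    -v "$(pwd)/s3/reports:/app/reports" \
    s3-provisioner:latest \
    --config "$config" \
    --action validate-config
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for solution in customer-churn fraud-detection; do
  validate_solution "$solution"
done
```

Setting `DRY_RUN=0` runs the commands for real; the other actions (create-policy, create-prov-template, create-bucket) would follow the same shape with their own volume mounts.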
Cleanup
To remove a deployment, use the tear-down action.
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config <your-config-file>.yaml \
--action tear-down \
--force
Deployment Patterns
The S3 Provisioner supports two deployment patterns:
Pattern A: Shared Bucket (Multiple Solutions)
Use Case: Cost-effective deployment where multiple ML solutions share a single S3 bucket.
Configuration: Single config file without bucket_name_override
Example: edge-prod-b001-us-west-1-s3.yaml
client:
  company_name: Edge Corp
  company_prefix: edge
  account_id: "123456789012"
  tenant_id: "b001"
environment:
  env: prod
  region: us-west-1
s3:
  bucket_name_override: "" # Empty = auto-generated name
  versioning: true
  lifecycle_policy: ml-optimized
Bucket Name: edge-prod-b001-us-west-1-s3
Workflow:
# 1. Validate, create policy, generate template
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action validate-config
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-policy
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-prov-template \
--solution master-solution
# 2. Create bucket and prepare master structure
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action prep-master \
--solution master-solution \
--force
# 3. Deploy multiple solutions to same bucket
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action deploy-solution \
--solution customer-churn
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action deploy-solution \
--solution fraud-detection
Result: One bucket with multiple solution folders:
edge-prod-b001-us-west-1-s3/
  solutions/
    master-solution/
    customer-churn/
    fraud-detection/
Verify Deployment:
# Check bucket exists
aws s3 ls s3://edge-prod-b001-us-west-1-s3/
# List solution folders
aws s3 ls s3://edge-prod-b001-us-west-1-s3/solutions/
# View master-solution structure
aws s3 ls s3://edge-prod-b001-us-west-1-s3/solutions/master-solution/ --recursive | head -20
# Check CloudFormation stack (if deployed via CloudFormation)
aws cloudformation describe-stacks --stack-name edge-prod-b001-us-west-1-s3-stack --region us-west-1 --query 'Stacks[0].StackStatus'
# Count total folders created
aws s3 ls s3://edge-prod-b001-us-west-1-s3/solutions/master-solution/ --recursive | grep '.gitkeep' | wc -l
Artifacts created:
/mlops-infra-suite/s3
├── configs
│   └── edge-prod-b001-us-west-1-s3.yaml
├── policies
│   └── edge-prod-b001-us-west-1-s3-iam-policy.json
├── reports
│   ├── edge-prod-b001-us-west-1-s3-create-bucket-master-solution-20260228-020806-166.html
│   ├── edge-prod-b001-us-west-1-s3-create-policy-None-20260228-020700-792.log
│   ├── edge-prod-b001-us-west-1-s3-create-prov-template-master-solution-20260228-020702-061.log
│   ├── edge-prod-b001-us-west-1-s3-create-prov-template-master-solution-20260228-020702-154.html
│   ├── edge-prod-b001-us-west-1-s3-deploy-solution-customer-churn-20260228-020813-436.log
│   ├── edge-prod-b001-us-west-1-s3-deploy-solution-fraud-detection-20260228-020824-170.log
│   ├── edge-prod-b001-us-west-1-s3-prep-master-master-solution-20260228-020703-485.log
│   ├── edge-prod-b001-us-west-1-s3-prep-master-master-solution-20260228-020704-179.html
│   └── edge-prod-b001-us-west-1-s3-validate-config-None-20260228-020659-555.log
└── templates
    └── edge-prod-b001-us-west-1-s3_master-solution_s3_template.yaml
Pattern B: Dedicated Buckets (One Solution Per Bucket)
Use Case: Isolated deployment where each ML solution has its own dedicated S3 bucket.
Configuration: Separate config files with bucket_name_override for each solution
Example 1: edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml
client:
  company_name: Edge Corp
  company_prefix: edge
  account_id: "123456789012"
  tenant_id: "a001"
environment:
  env: prod
  region: us-west-1
s3:
  bucket_name_override: "edge-prod-b001-us-west-1-s3-customer-churn"
  versioning: true
  lifecycle_policy: ml-optimized
Example 2: edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml
client:
  company_name: Edge Corp
  company_prefix: edge
  account_id: "123456789012"
  tenant_id: "a001"
environment:
  env: prod
  region: us-west-1
s3:
  bucket_name_override: "edge-prod-b001-us-west-1-s3-fraud-detection"
  versioning: false
  lifecycle_policy: development
Workflow for Customer Churn:
# 1. Validate and prepare
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action validate-config
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-policy
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-prov-template \
--solution customer-churn
# 2. Create bucket with solution structure
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-bucket \
--solution customer-churn \
--force
Workflow for Fraud Detection (repeat with fraud-detection config):
# 1. Validate and prepare
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml \
--action validate-config
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml \
--action create-policy
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml \
--action create-prov-template \
--solution fraud-detection
# 2. Create bucket with solution structure
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml \
--action create-bucket \
--solution fraud-detection \
--force
Result: Two separate buckets, each with one solution:
edge-prod-b001-us-west-1-s3-customer-churn/
  solutions/
    customer-churn/
edge-prod-b001-us-west-1-s3-fraud-detection/
  solutions/
    fraud-detection/
Verify Deployment:
# Check customer-churn bucket
aws s3 ls s3://edge-prod-b001-us-west-1-s3-customer-churn/solutions/
aws s3 ls s3://edge-prod-b001-us-west-1-s3-customer-churn/solutions/customer-churn/ --recursive | head -20
# Check fraud-detection bucket
aws s3 ls s3://edge-prod-b001-us-west-1-s3-fraud-detection/solutions/
aws s3 ls s3://edge-prod-b001-us-west-1-s3-fraud-detection/solutions/fraud-detection/ --recursive | head -20
# Check CloudFormation stacks
aws cloudformation describe-stacks --stack-name edge-prod-b001-us-west-1-s3-customer-churn-s3-stack --region us-west-1 --query 'Stacks[0].StackStatus' 2>/dev/null || echo "Customer-churn stack not found"
aws cloudformation describe-stacks --stack-name edge-prod-b001-us-west-1-s3-fraud-detection-s3-stack --region us-west-1 --query 'Stacks[0].StackStatus' 2>/dev/null || echo "Fraud-detection stack not found"
# Count folders in each bucket
aws s3 ls s3://edge-prod-b001-us-west-1-s3-customer-churn/ --recursive | grep '.gitkeep' | wc -l
aws s3 ls s3://edge-prod-b001-us-west-1-s3-fraud-detection/ --recursive | grep '.gitkeep' | wc -l
Artifacts created:
/mlops-infra-suite/s3
├── configs
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml
│   └── edge-prod-b001-us-west-1-s3-fraud-detection-s3.yaml
├── policies
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-iam-policy.json
│   └── edge-prod-b001-us-west-1-s3-fraud-detection-iam-policy.json
├── reports
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-bucket-customer-churn-20260228-023020-467.log
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-bucket-customer-churn-20260228-023021-194.html
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-bucket-customer-churn-20260228-023123-188.html
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-policy-None-20260228-022949-415.log
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-prov-template-customer-churn-20260228-022958-989.log
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-create-prov-template-customer-churn-20260228-022959-083.html
│   ├── edge-prod-b001-us-west-1-s3-customer-churn-validate-config-None-20260228-022941-327.log
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-bucket-fraud-detection-20260228-024112-852.log
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-bucket-fraud-detection-20260228-024113-544.html
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-bucket-fraud-detection-20260228-024215-593.html
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-policy-None-20260228-024042-486.log
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-prov-template-fraud-detection-20260228-024102-231.log
│   ├── edge-prod-b001-us-west-1-s3-fraud-detection-create-prov-template-fraud-detection-20260228-024102-324.html
│   └── edge-prod-b001-us-west-1-s3-fraud-detection-validate-config-None-20260228-024035-244.log
└── templates
    ├── edge-prod-b001-us-west-1-s3-customer-churn_customer-churn_s3_template.yaml
    └── edge-prod-b001-us-west-1-s3-fraud-detection_fraud-detection_s3_template.yaml
Pattern Comparison:
| Aspect | Pattern A (Shared) | Pattern B (Dedicated) |
|---|---|---|
| Cost | Lower (one bucket) | Higher (multiple buckets) |
| Isolation | Shared resources | Complete isolation |
| Management | Simpler (one stack) | More complex (multiple stacks) |
| Permissions | Shared IAM policies | Granular per-bucket policies |
| Use Case | Dev/test, cost-sensitive | Production, compliance, multi-tenant |
Configuration
Configuration File Structure
The S3 Provisioner uses a three-section YAML configuration:
client:
  company_name: string           # Full company name
  company_prefix: string         # Short identifier (lowercase)
  account_id: "string"           # 12-digit AWS account ID (quoted)
  tenant_id: "string"            # 4-character alphanumeric alias (e.g., "a001", "b002")
environment:
  env: string                    # Environment: prod, dev, test, staging
  region: string                 # AWS region
s3:
  bucket_name_override: string   # Custom bucket name (optional)
  versioning: boolean            # Enable versioning (true/false)
  lifecycle_policy: string       # Lifecycle profile: ml-optimized, compliance, development, none
  vpc_id: string                 # VPC ID for endpoint (optional)
  route_table_ids: string        # Comma-separated route table IDs (optional)
  tags:                          # Custom tags (optional)
    key: value
For detailed configuration reference, see CONFIGURATION.md.
Tenant ID Format
The tenant_id field uses a 4-character alphanumeric format to reduce resource name length:
Format: ^[a-z0-9]{4}$
Examples:
a001 - First account
b002 - Second account
c003 - Third account
x999 - Custom identifier
Benefits:
Saves 8 characters compared to 12-digit account IDs
Prevents Lambda function name length issues (64-char AWS limit)
Maintains uniqueness within organization
Human-readable and memorable
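The `^[a-z0-9]{4}$` format can be tested with `grep -E` before a value goes into the configuration file. This is a small sketch; `is_valid_tenant_id` is a hypothetical helper and not a command provided by the tool.

```shell
#!/bin/sh
# Succeed when $1 is exactly four lowercase letters or digits.
# LC_ALL=C keeps the [a-z] range limited to ASCII lowercase.
is_valid_tenant_id() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{4}$'
}

# Usage:
#   is_valid_tenant_id "a001" && echo "tenant_id ok"
```

Values such as `A001` (uppercase) or `a0001` (five characters) are rejected by this check.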
Bucket Name Override
The auto-generated bucket name format is: <company-prefix>-<env>-<account-alias>-<region>-s3
Example: edge-prod-b001-us-west-1-s3
Use bucket_name_override when you need:
Multiple dedicated buckets per solution (Pattern B)
Custom naming for compliance requirements
Legacy naming compatibility
Example with Override:
s3:
  bucket_name_override: "edge-prod-b001-us-west-1-s3-customer-churn"
Artifact Naming Reference
All artifacts created by the S3 Provisioner follow consistent naming conventions. The bucket name serves as the universal stem for all generated artifacts, ensuring global uniqueness and consistency.
| Artifact Type | Naming Pattern | Derived From | Example |
|---|---|---|---|
| AWS Resources | | | |
| S3 Bucket | | Config: Auto-generated | |
| S3 Bucket (Override) | | Config: | |
| CloudFormation Stack | | Bucket name + | |
| Lambda Function | | Stack name + | |
| Local Files | | | |
| Config File (Pattern A) | | User-defined (note: | |
| Config File (Pattern B) | | User-defined (note: | |
| CloudFormation Template | | Bucket name | |
| IAM Policy | | Bucket name | |
| Log File | | Bucket name | |
| HTML Report (Template) | | Bucket name | |
| HTML Report (Deployment) | | Bucket name | |
| S3 Objects | | | |
| Solution Folder | | Solution parameter | |
| Folder Marker | | Solution + folder path | |
| Uploaded Template | | Solution + template name | |
Key Naming Components:
{bucket_name}: Either auto-generated from config values or from bucket_name_override. This is the universal stem for all local artifacts, ensuring global uniqueness (guaranteed by AWS S3 bucket naming constraints)
{stack_name}: Always {bucket_name}-s3-stack (adds -s3-stack suffix to bucket name)
{solution}: Solution name from --solution parameter (e.g., master-solution, customer-churn)
{timestamp}: UTC timestamp in format YYYYMMDD_HHMMSS_mmm (e.g., 20260226_211116_934)
{action}: Command action (e.g., create-bucket, deploy-solution, validate-config)
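Because every artifact name starts from the bucket name, the expected names can be derived in a script, for example to locate a template or query a stack. The helpers below are hypothetical and only a sketch: the stack suffix comes from the `{stack_name}` rule above, while the template and policy patterns are inferred from the file names in the "Artifacts created" listings and are assumptions about the tool's behavior.

```shell
#!/bin/sh
# CloudFormation stack name: {bucket_name}-s3-stack (rule stated in this guide).
stack_name() {
  printf '%s-s3-stack\n' "$1"
}

# Template file name, inferred from the artifact listings (assumption).
template_name() {
  printf '%s_%s_s3_template.yaml\n' "$1" "$2"
}

# IAM policy file name, inferred from the artifact listings (assumption).
policy_name() {
  printf '%s-iam-policy.json\n' "$1"
}

# Usage:
#   stack_name edge-prod-b001-us-west-1-s3-customer-churn
#   template_name edge-prod-b001-us-west-1-s3 master-solution
```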
Naming Best Practices:
Config Files: Use descriptive names that include environment and purpose
Pattern A (shared): {company_prefix}-{env}-{tenant_id}-{region}-s3.yaml
Pattern B (dedicated): {company_prefix}-{env}-{tenant_id}-{region}-{solution}-s3.yaml
Bucket Names: Keep company_prefix short (max 10 chars) to avoid AWS 63-character bucket name limit
Solution Names: Use lowercase with hyphens (e.g., customer-churn, not CustomerChurn)
Consistency: All artifacts for a deployment share the same {bucket_name} prefix, making them easy to identify and manage
Note on Early Initialization Errors: If the configuration file cannot be found, contains YAML syntax errors, or fails validation, these errors will only appear in console/stderr output (not in log files) because the bucket name cannot be determined until the configuration is successfully loaded and validated.
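Since these early errors never reach a log file, it can be useful to keep your own copy of stderr when running the tool from a script. The wrapper below is a generic sketch that works for any command; `run_with_stderr_log` is a hypothetical helper name and the log path is whatever you choose.

```shell
#!/bin/sh
# Run a command, saving everything it writes to stderr into the file $1.
# The saved output is replayed on stderr and the exit status is preserved.
run_with_stderr_log() {
  log_file="$1"
  shift
  "$@" 2>"$log_file" && rc=0 || rc=$?
  cat "$log_file" >&2
  return "$rc"
}

# Usage (hypothetical log path):
#   run_with_stderr_log s3/reports/early-errors.log docker run --rm ... --action validate-config
```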
Commands Reference
Validation Commands
validate-config
Validates configuration file against schema.
Purpose: Verify YAML syntax and schema compliance before deployment
Requirements: No AWS credentials needed
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action validate-config
Output:
Logs:
reports/{bucket_name}-validate-config-None_{timestamp}.log
Policy Commands
create-policy
Generates IAM policy JSON file for S3 bucket and CloudFormation operations.
Purpose: Create IAM policy document with required permissions
Requirements: No AWS credentials needed (generates file only)
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-policy
Use Case: Attach generated policy to IAM user/role before running AWS operations
CloudFormation Template Commands
create-prov-template
Generates CloudFormation template for S3 bucket provisioning.
Purpose: Create infrastructure-as-code template for bucket deployment
Requirements: No AWS credentials needed (generates file only)
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-prov-template \
--solution master-solution
Output: templates/<config-name>_<solution>_s3_template.yaml
Template Includes:
S3 bucket with configured properties
Versioning configuration
Lifecycle rules (if lifecycle_policy specified)
Tags (system + custom)
VPC endpoint (if vpc_id specified)
Lambda function for folder creation
validate-prov-template
Validates the generated CloudFormation template locally without making any AWS calls.
Purpose: Catch template errors early: YAML syntax, required keys, and internal reference integrity
Requirements: No AWS credentials needed
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action validate-prov-template \
--solution master-solution
Output: Report in console (template size, resource count, output count)
What It Checks:
YAML syntax is valid
Required top-level keys present (AWSTemplateFormatVersion, Resources)
All !Ref and !GetAtt targets resolve to parameters or resources within the template
Note: If the template file does not exist, it will be generated automatically.
Change Preview and Drift Detection Commands
show-changes
Previews what would change in the deployed stack without applying any modifications.
Purpose: Review pending changes before updating infrastructure
Requirements: AWS credentials, deployed stack
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action show-changes \
--solution master-solution
Output: Report in console (list of pending resource changes)
How It Works:
Creates a temporary CloudFormation ChangeSet against the deployed stack
Displays pending resource changes (Add, Modify, Remove)
Deletes the ChangeSet after display (no changes applied)
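The three steps above can be reproduced by hand with the AWS CLI. This is a sketch under assumptions: how s3-provisioner implements show-changes internally is not documented here, and the function names and ChangeSet naming scheme are hypothetical. Templates that create IAM resources may also need a `--capabilities` argument on the first call.

```bash
# Sketch only: a manual equivalent using the AWS CLI, not the tool's own code.
stack_name_for() {
  # Stack naming follows the {bucket-name}-stack convention used in this guide.
  echo "$1-stack"
}

preview_changes() (
  bucket="$1"
  template="$2"      # path to the generated template file
  region="$3"
  stack="$(stack_name_for "$bucket")"
  change_set="preview-$(date +%Y%m%d%H%M%S)"   # hypothetical naming scheme

  # 1. Create a temporary ChangeSet against the deployed stack.
  aws cloudformation create-change-set \
    --stack-name "$stack" --change-set-name "$change_set" \
    --template-body "file://$template" --region "$region" >/dev/null || exit 1

  # 2. Wait for it, then list pending changes.
  #    The wait fails when the template introduces no changes.
  aws cloudformation wait change-set-create-complete \
    --stack-name "$stack" --change-set-name "$change_set" --region "$region"
  aws cloudformation describe-change-set \
    --stack-name "$stack" --change-set-name "$change_set" --region "$region" \
    --query 'Changes[].ResourceChange.[Action,LogicalResourceId,ResourceType]' \
    --output table

  # 3. Delete the ChangeSet so nothing is applied.
  aws cloudformation delete-change-set \
    --stack-name "$stack" --change-set-name "$change_set" --region "$region"
)
```

A call would look like `preview_changes edge-prod-b001-us-west-1-s3 s3/templates/edge-prod-b001-us-west-1-s3_master-solution_s3_template.yaml us-west-1`.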
Dry Run: Use --dry-run to simulate without creating a ChangeSet:
s3-prov -con edge-prod-b001-us-west-1-s3.yaml -act show-changes --dry-run
check-drift
Detects infrastructure drift between the deployed stack and its template.
Purpose: Identify resources that have been modified outside of CloudFormation
Requirements: AWS credentials, deployed stack
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action check-drift
Output: Report in console (per-resource drift status with property-level diffs)
What It Detects:
Bucket policies modified manually
Lifecycle rules changed outside CloudFormation
Tags modified or removed
VPC endpoint configurations altered
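Drift of the kinds listed above can also be inspected directly with the AWS CLI. The sketch below is an assumption about an equivalent manual procedure, not the tool's implementation; check_stack_drift and drift_in_progress are hypothetical names.

```bash
# Sketch only: manual drift detection with the AWS CLI.
drift_in_progress() {
  # True while CloudFormation is still computing drift.
  [ "$1" = "DETECTION_IN_PROGRESS" ]
}

check_stack_drift() (
  stack="$1"
  region="$2"

  # Start drift detection and capture its ID.
  id="$(aws cloudformation detect-stack-drift \
    --stack-name "$stack" --region "$region" \
    --query StackDriftDetectionId --output text)" || exit 1

  # Poll until detection finishes (or the status call fails).
  while :; do
    status="$(aws cloudformation describe-stack-drift-detection-status \
      --stack-drift-detection-id "$id" --region "$region" \
      --query DetectionStatus --output text)"
    drift_in_progress "$status" || break
    sleep 5
  done

  # Show only resources that were modified or deleted outside CloudFormation.
  aws cloudformation describe-stack-resource-drifts \
    --stack-name "$stack" --region "$region" \
    --stack-resource-drift-status-filters MODIFIED DELETED \
    --query 'StackResourceDrifts[].[LogicalResourceId,StackResourceDriftStatus]' \
    --output table
)
```

For the example configuration, the call would be `check_stack_drift edge-prod-b001-us-west-1-s3-stack us-west-1`.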
Dry Run: Use --dry-run to simulate without initiating drift detection:
s3-prov -con edge-prod-b001-us-west-1-s3.yaml -act check-drift --dry-run
test-deploy
Deploys S3 infrastructure with a random test suffix for safe testing.
Purpose: Validate that the template deploys successfully without affecting production resources
Requirements: AWS credentials, --solution
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action test-deploy \
--solution master-solution
How It Works:
Appends a random 6-character suffix to the bucket name (e.g., edge-prod-b001-us-west-1-s3-test-a1b2c3)
All resource names inherit the test suffix, so there is no collision with production
Generates template in-memory (not persisted to disk)
Deploys via CloudFormation with the test-suffixed stack name
Restores original bucket name after deployment
Cleanup: Delete the test stack when done:
aws cloudformation delete-stack --stack-name edge-prod-b001-us-west-1-s3-test-a1b2c3-stack --region us-west-1
Note: This action does not require --force. Test deployments are designed to be disposable. Custom bucket_name_override values must not exceed 51 characters to stay within the S3 63-character limit after the test suffix is appended.
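The 51-character ceiling follows from the suffix length: "-test-" plus six random characters adds 12 characters, and 51 + 12 = 63. The hypothetical helper below (not part of the tool) computes the resulting length for any name, assuming the suffix always has the form shown in the example above.

```bash
# Hypothetical helper: length of the bucket name after the test suffix is appended.
# Assumes the suffix is "-test-" plus 6 characters, as in the example above.
test_bucket_name_length() (
  override="$1"
  suffix="-test-a1b2c3"
  echo $(( ${#override} + ${#suffix} ))
)

test_bucket_name_length "edge-prod-b001-us-west-1-s3"
```

Any result above 63 means the test deployment would exceed the S3 name limit.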
Bucket Provisioning Commands
create-bucket
Output:
Files: templates/{bucket_name}_{solution}_s3_template.yaml
Reports: reports/{bucket_name}-create-bucket-{solution}_{timestamp}.html
Logs: reports/{bucket_name}-create-bucket-{solution}_{timestamp}.log
AWS Resources: S3 bucket {bucket_name}, CloudFormation stack {bucket_name}-stack
SSM Parameters: /s3/<bucket-name>/<OutputKey>
Creates S3 bucket using CloudFormation stack.
Purpose: Deploy S3 bucket infrastructure via CloudFormation
Requirements: AWS credentials with S3 and CloudFormation permissions
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-bucket \
--solution master-solution \
--force
What It Does:
Generates and saves CloudFormation template to templates directory
Creates CloudFormation stack
Provisions S3 bucket with all configurations
Applies tags, versioning, lifecycle rules
Creates VPC endpoint (if configured)
Stores stack outputs in SSM Parameter Store (/s3/<bucket-name>/<OutputKey>)
e.g., /s3/globalbank-prod-c001-us-west-2-s3/BucketName
Available keys: BucketName, BucketArn, VPCEndpointId, TotalFolders, CompanyPrefix, Region
Stack Name: {bucket-name}-stack
Example: edge-prod-b001-us-west-1-s3-stack
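Once the stack exists, its outputs can be read back from SSM Parameter Store. The sketch below builds the parameter path from the convention above and fetches the value with the AWS CLI; ssm_param_name and read_stack_output are hypothetical helper names, and the sketch assumes the parameters are plain strings.

```bash
# Build the parameter path from the /s3/<bucket-name>/<OutputKey> convention.
ssm_param_name() {
  echo "/s3/$1/$2"
}

# Fetch one stack output from SSM Parameter Store (requires AWS credentials).
read_stack_output() {
  aws ssm get-parameter \
    --name "$(ssm_param_name "$1" "$2")" \
    --region "$3" \
    --query Parameter.Value --output text
}

# Print the path only; no AWS call is made here.
ssm_param_name "edge-prod-b001-us-west-1-s3" "BucketArn"
```

For example, `read_stack_output edge-prod-b001-us-west-1-s3 BucketArn us-west-1` would print the bucket ARN stored by create-bucket.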
prep-master
Prepares S3 bucket with master-solution folder structure.
Purpose: Create master folder template for ML solutions
Requirements: AWS credentials, bucket must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
-v $(pwd)/s3/templates:/app/templates \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action prep-master \
--solution master-solution \
--force
What It Does:
Generates CloudFormation template
Creates S3 bucket from the template
Stores stack outputs in SSM Parameter Store (/s3/<bucket-name>/<OutputKey>)
e.g., /s3/globalbank-prod-c001-us-west-2-s3/BucketName
Available keys: BucketName, BucketArn, VPCEndpointId, TotalFolders, CompanyPrefix, Region
Adds .gitkeep files to all folders in the master-solution structure
What It Creates:
solutions/master-solution/
data/
raw/
curated/
processed/
inference/
models/
notebooks/
artifacts/
code/
config/
Output:
reports/{bucket_name}-prep-master-{solution}_{timestamp}.html
reports/{bucket_name}-prep-master-{solution}_{timestamp}.log
S3 folder structure under s3://{bucket_name}/solutions/{solution}/
SSM Parameters: /s3/<bucket-name>/<OutputKey>
delete-bucket
Output:
Logs: reports/{bucket_name}_delete-bucket_None_{timestamp}.log
Deletes S3 bucket directly (bypasses CloudFormation).
Purpose: Remove bucket and all objects (emergency cleanup)
Requirements: AWS credentials, --force flag
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action delete-bucket \
--force
Warning: This empties and deletes the bucket directly. Use tear-down for proper CloudFormation cleanup.
delete-cfn-stack
Output:
Logs: reports/{bucket_name}_delete-cfn-stack_None_{timestamp}.log
Deletes CloudFormation stack (keeps bucket if deletion protection enabled).
Purpose: Remove CloudFormation stack
Requirements: AWS credentials, --force flag
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action delete-cfn-stack \
--force
tear-down
Output:
Logs: reports/{bucket_name}_tear-down_None_{timestamp}.log
Deletes CloudFormation stack with all underlying S3 resources.
Purpose: Complete infrastructure cleanup (stack + bucket + objects)
Requirements: AWS credentials, --force flag
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action tear-down \
--force
Recommended: Use this for clean infrastructure removal instead of delete-bucket.
Cost Estimation Commands
cost-traffic
Generate a usage assumptions file for cost estimation.
Purpose: Create an editable YAML file with default monthly usage values for storage, requests, data transfer, and VPC Endpoint traffic
Requirements: AWS credentials, provisioning template must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-traffic \
--solution master-solution
Output:
File configs/<bucket-name>-usage.yaml
File reports/<bucket-name>-cost-traffic-{solution}_{timestamp}.log
Generated Usage File:
# Auto-generated S3 usage assumptions for cost estimation
# Edit values to match your expected monthly usage
usage:
  storage:
    storage_class: Standard
    data_gb: 100
  requests:
    put_requests_per_month: 10000
    get_requests_per_month: 50000
  transfer:
    data_out_gb_per_month: 10
  vpc_endpoint:
    S3VPCEndpoint:
      type: AWS::EC2::VPCEndpoint
      data_gb_per_month: 50
Note: Edit the values to match your expected usage before running cost-estimate. Requires the provisioning template β run create-prov-template first.
cost-estimate
Calculate estimated monthly infrastructure costs with a detailed breakdown.
Purpose: Produce a cost breakdown showing storage, request, data transfer, and VPC Endpoint costs with monthly and annual totals
Requirements: AWS credentials, provisioning template and usage file must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-estimate \
--solution master-solution
Output:
Cost breakdown in console
File reports/<bucket-name>-cost-estimate-{timestamp}.html
File reports/<bucket-name>-cost-estimate-{solution}_{timestamp}.log
HTML Report Includes:
Cost summary with monthly and annual totals
Cost breakdown table (storage, requests, transfer, VPC Endpoint)
Usage assumptions and per-unit rates
Pricing source and region information
Note: Requires both the provisioning template and usage assumptions file. Run create-prov-template and cost-traffic first. Edit the usage file and re-run to model different scenarios.
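To make the breakdown concrete, the sketch below applies the same kind of arithmetic to the default usage values from the generated usage file. The per-unit rates are placeholders chosen for illustration, not the tool's built-in pricing data or current AWS prices, and estimate_monthly_cost is a hypothetical function. The tool's own formula may differ.

```bash
# Illustrative arithmetic only. All rates are PLACEHOLDERS in USD,
# not the pricing data shipped with s3-provisioner.
estimate_monthly_cost() {
  LC_ALL=C awk -v gb="$1" -v puts="$2" -v gets="$3" -v out_gb="$4" -v vpce_gb="$5" 'BEGIN {
    storage_per_gb = 0.023    # placeholder: storage, per GB-month
    put_per_1000   = 0.005    # placeholder: per 1,000 PUT requests
    get_per_1000   = 0.0004   # placeholder: per 1,000 GET requests
    out_per_gb     = 0.09     # placeholder: data transfer out, per GB
    vpce_per_gb    = 0.01     # placeholder: VPC Endpoint data, per GB
    vpce_per_hour  = 0.01     # placeholder: VPC Endpoint hourly charge
    hours          = 730      # assumed hours in a month

    total  = gb * storage_per_gb
    total += puts / 1000 * put_per_1000
    total += gets / 1000 * get_per_1000
    total += out_gb * out_per_gb
    total += vpce_gb * vpce_per_gb
    total += hours * vpce_per_hour
    printf "monthly=%.2f annual=%.2f\n", total, total * 12
  }'
}

# Arguments: storage GB, PUT requests, GET requests, transfer-out GB, VPC Endpoint GB
estimate_monthly_cost 100 10000 50000 10 50
```

Changing one argument at a time shows which usage assumption dominates the total, which is the same exercise as editing the usage file and re-running cost-estimate.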
cost-refresh-prices
Refresh S3 resource pricing from the AWS Pricing API.
Purpose: Update the built-in pricing data with the latest on-demand rates from AWS for all regions
Requirements: AWS credentials with Pricing API access
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-refresh-prices \
--solution master-solution
Output:
Report in console (region count, pricing for current region)
File reports/<bucket-name>-cost-refresh-prices-{solution}_{timestamp}.log
What It Fetches:
S3 Standard storage per-GB rates
PUT/COPY/POST/LIST request rates
GET and other request rates
Data transfer out per-GB rates
VPC Endpoint hourly and per-GB rates
Note: This action is optional. The tool ships with pre-loaded pricing data. Run this action periodically to ensure pricing accuracy.
Solution Deployment Commands
deploy-solution
Output:
Logs: reports/{bucket_name}-deploy-solution-{solution}_{timestamp}.log
S3 Objects: Folder structure under s3://{bucket_name}/solutions/{solution}/
Deploys ML solution folder structure to bucket.
Purpose: Create solution-specific folder structure
Requirements: AWS credentials, bucket must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action deploy-solution \
--solution customer-churn
Available Solutions:
customer-churn
demand-forecasting
fraud-detection
master-solution
deploy-folders
Output:
Logs: reports/{bucket_name}_deploy-folders_{solution}_{timestamp}.log
S3 Objects: Cloned folder structure under s3://{bucket_name}/solutions/{solution}/
Clones S3 folder structures from master-solution to specific solution.
Purpose: Copy folder structure from master template
Requirements: AWS credentials, master-solution must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action deploy-folders \
--solution customer-churn
upload-template
Output:
Logs: reports/{bucket_name}_upload-template_{solution}_{timestamp}.log
S3 Objects: s3://{bucket_name}/solutions/{solution}/templates/{bucket_name}_{solution}_s3_template.yaml
Uploads YAML template to S3 location.
Purpose: Store CloudFormation template in S3 for reference
Requirements: AWS credentials, bucket must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action upload-template \
--solution master-solution
Upload Location: s3://<bucket>/solutions/<solution>/templates/
GitKeep Management Commands
gitkeep-full
Output:
Logs: reports/{bucket_name}_gitkeep-full_{solution}_{timestamp}.log
S3 Objects: .gitkeep files in all folders under s3://{bucket_name}/solutions/{solution}/
Creates .gitkeep files for all folders under solutions/.
Purpose: Preserve empty folder structure in Git
Requirements: AWS credentials, solution must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action gitkeep-full \
--solution customer-churn
gitkeep-none
Output:
Logs: reports/{bucket_name}_gitkeep-none_{solution}_{timestamp}.log
Removes .gitkeep files from all folders in solutions/.
Purpose: Clean up all .gitkeep files
Requirements: AWS credentials, solution must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action gitkeep-none \
--solution customer-churn
gitkeep-partial
Output:
Logs: reports/{bucket_name}_gitkeep-partial_{solution}_{timestamp}.log
Removes .gitkeep files from folders two levels below root.
Purpose: Selective .gitkeep cleanup
Requirements: AWS credentials, solution must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action gitkeep-partial \
--solution customer-churn
purge-bucket
Output:
Logs: reports/{bucket_name}_purge-bucket_None_{timestamp}.log
Removes all .gitkeep files from the entire bucket.
Purpose: Complete .gitkeep cleanup across all solutions
Requirements: AWS credentials, bucket must exist
Command:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action purge-bucket \
--force
Common Workflows
Pattern B: Dedicated Bucket Deployment
Deploy a separate bucket for each ML solution.
# Step 1: Validate configuration
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action validate-config
# Step 2: Generate IAM policy
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-policy
# Step 3: Generate CloudFormation template
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-prov-template \
--solution customer-churn
# Step 4: Create dedicated bucket with solution structure
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action create-bucket \
--solution customer-churn \
--force
# Step 5: Create .gitkeep files for all nodes
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3-customer-churn-s3.yaml \
--action gitkeep-full \
--solution customer-churn
# Repeat steps 1-5 for each additional solution with its own config file
Quick Test Workflow
# Validate only
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action validate-config
# Generate template only (no AWS deployment)
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-prov-template \
--solution master-solution
# Review generated template
cat s3/templates/edge-prod-b001-us-west-1-s3_master-solution_s3_template.yaml
Cleanup Workflow
# Complete teardown (recommended)
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action tear-down \
--force
Cost Estimation Workflow
# Step 1: Generate CloudFormation template (if not already done)
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-prov-template \
--solution master-solution
# Step 2: Generate usage assumptions with defaults
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-traffic \
--solution master-solution
# Step 3: Edit usage assumptions to match expected usage
# Edit configs/edge-prod-b001-us-west-1-s3-usage.yaml
# Step 4: Calculate cost estimate
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-estimate \
--solution master-solution
# Step 5: View HTML report
# Open reports/edge-prod-b001-us-west-1-s3-cost-estimate-*.html
# Optional: Refresh pricing data from AWS Pricing API
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action cost-refresh-prices \
--solution master-solution
Volume Mounts
| Mount | Purpose | Required For |
|---|---|---|
| /app/configs | Input configuration files and usage assumptions | All actions |
| /app/policies | Generated IAM policies | create-policy |
| /app/reports | Execution logs and HTML reports | All actions (recommended) |
| /app/templates | CloudFormation templates | create-prov-template, prep-master, upload-template, cost-traffic, cost-estimate |
| /home/s3user/.aws | AWS credentials | All actions (required for license validation) |
Example with all mounts:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/policies:/app/policies \
-v $(pwd)/s3/reports:/app/reports \
-v $(pwd)/s3/templates:/app/templates \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action prep-master \
--solution master-solution \
--force
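Because every invocation repeats the same mounts, a small shell wrapper can cut down on typing. The sketch below is a convenience assumption, not something shipped with the tool: s3prov, S3PROV_BASE, and S3PROV_RUNNER are hypothetical names. It mounts configs read-write because cost-traffic writes the usage file there.

```bash
# Hypothetical wrapper around the docker invocation with all mounts.
s3prov() (
  base="${S3PROV_BASE:-$PWD/s3}"
  # S3PROV_RUNNER is left unquoted on purpose so it can hold "echo docker".
  ${S3PROV_RUNNER:-docker} run --rm \
    -v "$HOME/.aws:/home/s3user/.aws:ro" \
    -v "$base/configs:/app/configs" \
    -v "$base/policies:/app/policies" \
    -v "$base/reports:/app/reports" \
    -v "$base/templates:/app/templates" \
    s3-provisioner:latest "$@"
)

# Print the assembled command instead of running it.
S3PROV_RUNNER="echo docker"
s3prov --config edge-prod-b001-us-west-1-s3.yaml --action validate-config
```

Unset S3PROV_RUNNER (or set it to `docker`) to execute the command for real, for example `s3prov --config edge-prod-b001-us-west-1-s3.yaml --action prep-master --solution master-solution --force`.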
AWS Credentials
Option 1: AWS Profile (Recommended)
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-bucket \
--solution master-solution \
--force
Option 2: Environment Variables
docker run --rm \
-e AWS_ACCESS_KEY_ID=AKIA... \
-e AWS_SECRET_ACCESS_KEY=wJal... \
-e AWS_DEFAULT_REGION=us-west-1 \
-v $(pwd)/s3/configs:/app/configs \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-bucket \
--solution master-solution \
--force
Option 3: IAM Role (EC2/ECS)
When running on EC2 or ECS with an IAM role attached, no credentials are needed:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs \
s3-provisioner:latest \
--config edge-prod-b001-us-west-1-s3.yaml \
--action create-bucket \
--solution master-solution \
--force
Best Practices
Always Validate First
Run validate-config before any AWS operations:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config your-config.yaml \
--action validate-config
Review Generated Templates
Always review CloudFormation templates before deployment:
# Generate template
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/templates:/app/templates \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config your-config.yaml \
--action create-prov-template \
--solution master-solution
# Review template
cat s3/templates/your-config_master-solution_s3_template.yaml
Use IAM Roles in Production
Prefer IAM roles over access keys when running on EC2/ECS.
Enable Versioning in Production
s3:
  versioning: true # Always enable for production
Use Lifecycle Policies
s3:
  lifecycle_policy: ml-optimized # or compliance, development
Test in Dev First
Test configurations in a dev environment before production:
environment:
  env: dev # Test here first
  region: us-west-2
Use CloudFormation for Infrastructure
Always use create-bucket (CloudFormation) instead of manual bucket creation.
Proper Cleanup
Use tear-down instead of delete-bucket for proper infrastructure cleanup:
docker run --rm \
-v ~/.aws:/home/s3user/.aws:ro \
-v $(pwd)/s3/configs:/app/configs:ro \
-v $(pwd)/s3/reports:/app/reports \
s3-provisioner:latest \
--config your-config.yaml \
--action tear-down \
--force
Monitor Logs
Always mount reports/ directory to capture execution logs:
-v $(pwd)/s3/reports:/app/reports
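With the reports directory mounted, the newest log for a given action can be located from the host. The helper below is a hypothetical convenience (latest_log is not part of the tool). It matches on the action name only, since the log file names in this guide use both hyphen and underscore separators.

```bash
# Hypothetical helper: print the most recent log file for an action.
latest_log() (
  action="$1"
  dir="${2:-s3/reports}"
  # ls -t sorts newest first; errors are silenced when nothing matches.
  ls -t "$dir"/*"$action"*.log 2>/dev/null | head -n 1
)
```

For example, `tail -n 50 "$(latest_log tear-down)"` shows the end of the latest tear-down log. The helper prints nothing when no log matches.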
Version Control Configurations
Store configuration files in Git for change tracking and rollback capability.
Command Summary
| Action | AWS Creds | --solution | --force | Purpose |
|---|---|---|---|---|
| validate-config | No | No | No | Validate YAML |
| create-policy | No | No | No | Generate IAM policy |
| create-prov-template | No | Yes | No | Generate CFN template |
| validate-prov-template | No | Yes | No | Validate CFN template locally |
| show-changes | Yes | Yes | No | Preview pending changes |
| check-drift | Yes | No | No | Detect infrastructure drift |
| test-deploy | Yes | Yes | No | Safe test deployment |
| create-bucket | Yes | Yes | Yes | Create S3 bucket |
| prep-master | Yes | Yes | Yes | Create master folders |
| deploy-solution | Yes | Yes | No | Deploy solution folders |
| deploy-folders | Yes | Yes | No | Clone folder structure |
| upload-template | Yes | Yes | No | Upload template to S3 |
| gitkeep-full | Yes | Yes | No | Add .gitkeep files |
| gitkeep-none | Yes | Yes | No | Remove .gitkeep files |
| gitkeep-partial | Yes | Yes | No | Partial .gitkeep removal |
| purge-bucket | Yes | No | Yes | Remove all .gitkeep |
| delete-bucket | Yes | No | Yes | Delete bucket directly |
| delete-cfn-stack | Yes | No | Yes | Delete CFN stack |
| tear-down | Yes | No | Yes | Complete cleanup |
| cost-traffic | Yes | Yes | No | Generate usage assumptions |
| cost-estimate | Yes | Yes | No | Calculate cost estimate |
| cost-refresh-prices | Yes | Yes | No | Refresh resource pricing |
Troubleshooting
See TROUBLESHOOTING.md for common issues and solutions.
What's Next
After provisioning your S3 infrastructure, explore these resources:
Enterprise Implementation:
GOVERNANCE_COMPLIANCE.md - Reference architecture for implementing governance, compliance, and audit capabilities with ready-to-use JSON schemas and AWS service recommendations
Structure Reference:
S3_FOLDERS.md - Complete technical reference for the provisioned folder structure with detailed hierarchy at all levels
Additional Documentation:
ML_LIFECYCLE_POLICIES.md - Detailed lifecycle policy configurations for cost optimization
CONFIGURATION.md - Complete YAML configuration reference
IAM_PERMISSIONS.md - Required AWS permissions and security best practices
Support
See SUPPORT.md for help and contact information.
Configuration Reference
See CONFIGURATION.md for complete configuration documentation.
IAM Permissions
See IAM_PERMISSIONS.md for required AWS permissions.
Copyright © 2025 Axon Tech Labs. All rights reserved.
See LICENSE.txt for terms and conditions.
Frequently Asked Questions
Q: Can I modify the generated CloudFormation template?
A: Yes, but changes will be overwritten on next create-prov-template. Use configuration parameters (bucket_name_override, lifecycle_policy, tags, etc.) instead.
Q: How do I upgrade to a new version?
A: Pull the latest Docker image. Existing buckets and data are not affected unless you redeploy.
Q: Can I use this with existing S3 buckets?
A: No, this tool creates new buckets via CloudFormation. For existing buckets, continue managing them with the AWS Console or CLI.
Q: What happens if deployment fails?
A: CloudFormation automatically rolls back all resources. Check stack events for details. See TROUBLESHOOTING.md.
Q: Can I deploy to multiple regions?
A: Yes, create separate configuration files for each region and run the tool for each config.
Q: How much does the infrastructure cost?
A: Use the built-in cost estimation feature to calculate costs for your specific configuration. Run cost-traffic to generate usage assumptions, edit the values to match your expected storage, requests, and data transfer, then run cost-estimate for a detailed breakdown with monthly and annual totals. S3 buckets are free to create; costs come from storage, requests, and data transfer. VPC Endpoints incur an hourly charge. See COST_OPTIMIZATION.md for optimization strategies.
Q: Can I have multiple solutions in one bucket?
A: Yes, that's Pattern A (Shared Bucket). Use deploy-solution to add solution folder structures to an existing bucket.
Q: What's the difference between prep-master and create-bucket?
A: prep-master creates the bucket via CloudFormation and deploys the master folder structure. create-bucket creates a dedicated bucket for a single solution (Pattern B).
Q: How do I delete everything?
A: Use tear-down --force to delete the CloudFormation stack, bucket, and all objects.