Migration Guide

Guide for migrating existing S3 infrastructure to be managed by the S3 Provisioner.

Migrating from Manual S3 Setup

Step 1: Inventory Existing Buckets

Export your current S3 configuration:

# List all buckets
aws s3 ls

# Export bucket configuration
BUCKET="your-existing-bucket"
aws s3api get-bucket-versioning --bucket "$BUCKET"
# The calls below return an error when nothing is configured; 2>/dev/null hides it
aws s3api get-bucket-lifecycle-configuration --bucket "$BUCKET" 2>/dev/null
aws s3api get-bucket-tagging --bucket "$BUCKET" 2>/dev/null
aws s3api get-bucket-policy --bucket "$BUCKET" 2>/dev/null
aws s3api get-bucket-encryption --bucket "$BUCKET" 2>/dev/null

# Count objects and size
aws s3 ls "s3://$BUCKET" --recursive --summarize | tail -2
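
To keep the exported settings for later comparison, the same calls can write their output to files. The following is a minimal sketch, not part of the S3 Provisioner: the function name export_bucket_config and the ./s3-inventory directory are hypothetical, and only the aws s3api subcommands come from the commands above.

```shell
# Hypothetical helper: save each configuration document for one bucket
# as a JSON file under <outdir>/<bucket>/.
export_bucket_config() {
  local bucket="$1" dir="$2/$1" kind
  mkdir -p "$dir"
  for kind in versioning lifecycle-configuration tagging policy encryption; do
    # A call fails when nothing is configured (for example, no bucket
    # policy); in that case no file is kept for that setting.
    if ! aws s3api "get-bucket-$kind" --bucket "$bucket" \
        > "$dir/$kind.json" 2>/dev/null; then
      rm -f "$dir/$kind.json"
    fi
  done
}

# Usage (requires AWS credentials), shown as comments so that nothing
# runs by accident:
#   for b in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
#     export_bucket_config "$b" ./s3-inventory
#   done
```

A missing file then means "not configured", which is useful to know when choosing values in Step 2.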

Step 2: Create Equivalent Configuration

Map your existing bucket settings to an S3 Provisioner YAML configuration:

client:
  company_name: Your Company
  company_prefix: yourco
  account_id: "123456789012"
  tenant_id: "a001"

environment:
  env: prod
  region: us-west-1

s3:
  bucket_name_override: ""          # Auto-generate, or use existing name
  versioning: true                   # Match existing versioning setting
  lifecycle_policy: ml-optimized     # Choose closest match to existing rules
  vpc_id: ""
  route_table_ids: ""
  tags:
    Environment: production
    MigratedFrom: manual
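
The config file used in later steps is named yourco-prod-a001-us-west-1-s3.yaml, and the new bucket in Step 6 carries the same base name. The sketch below assumes, based only on that example, that the name follows the pattern company_prefix-env-tenant_id-region-s3; confirm this against your provisioner version. The function names are hypothetical, and the validity check covers only the basic S3 naming rules (length and character set).

```shell
# Assumed naming pattern, inferred from the example file name in this guide.
resource_name() {
  # args: company_prefix env tenant_id region
  printf '%s-%s-%s-%s-s3\n' "$1" "$2" "$3" "$4"
}

# Basic S3 bucket name check: 3-63 characters, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
# (S3 has further rules, such as no IP-address-style names, not checked here.)
is_valid_bucket_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}

name=$(resource_name yourco prod a001 us-west-1)
echo "config file: ${name}.yaml"
if is_valid_bucket_name "$name"; then
  echo "bucket name: ${name}"
else
  echo "invalid bucket name: ${name}" >&2
fi
```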

Step 3: Decide on Migration Strategy

Option A: New Bucket (Recommended)

  • Create a new bucket with the provisioner

  • Copy data from old bucket to new bucket

  • Update applications to use new bucket name

  • Decommission old bucket

Option B: Import Existing Bucket

  • Use bucket_name_override with your existing bucket name

  • Deploy CloudFormation stack to manage the bucket

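  • Note: CloudFormation cannot adopt an existing bucket through a normal stack create or update; you'll need to create a new stack (CloudFormation's resource import is the mechanism for bringing an existing bucket into one)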

Step 4: Validate Configuration

docker run --rm \
  -v ~/.aws:/home/s3user/.aws:ro \
  -v "$(pwd)/s3/configs:/app/configs:ro" \
  -v "$(pwd)/s3/reports:/app/reports" \
  s3-provisioner:latest \
  --config yourco-prod-a001-us-west-1-s3.yaml \
  --action validate-config

Step 5: Deploy New Bucket

# Generate template and review
docker run --rm \
  -v ~/.aws:/home/s3user/.aws:ro \
  -v "$(pwd)/s3/configs:/app/configs:ro" \
  -v "$(pwd)/s3/templates:/app/templates" \
  -v "$(pwd)/s3/reports:/app/reports" \
  s3-provisioner:latest \
  --config yourco-prod-a001-us-west-1-s3.yaml \
  --action create-prov-template \
  --solution master-solution

# Deploy
docker run --rm \
  -v ~/.aws:/home/s3user/.aws:ro \
  -v "$(pwd)/s3/configs:/app/configs:ro" \
  -v "$(pwd)/s3/templates:/app/templates" \
  -v "$(pwd)/s3/reports:/app/reports" \
  s3-provisioner:latest \
  --config yourco-prod-a001-us-west-1-s3.yaml \
  --action prep-master \
  --solution master-solution \
  --force

# Deploy solution folder structure
docker run --rm \
  -v ~/.aws:/home/s3user/.aws:ro \
  -v "$(pwd)/s3/configs:/app/configs:ro" \
  -v "$(pwd)/s3/reports:/app/reports" \
  s3-provisioner:latest \
  --config yourco-prod-a001-us-west-1-s3.yaml \
  --action deploy-solution \
  --solution customer-churn

Step 6: Migrate Data

Copy data from old bucket to new bucket, mapping to the provisioned folder structure:

NEW_BUCKET="yourco-prod-a001-us-west-1-s3"
SOLUTION="customer-churn"

# Copy training data
aws s3 sync s3://old-bucket/training-data/ \
  "s3://$NEW_BUCKET/solutions/$SOLUTION/data/raw/"

# Copy models
aws s3 sync s3://old-bucket/models/ \
  "s3://$NEW_BUCKET/solutions/$SOLUTION/models/training/"

# Copy notebooks
aws s3 sync s3://old-bucket/notebooks/ \
  "s3://$NEW_BUCKET/solutions/$SOLUTION/notebooks/"

# Verify: count the objects now under the solution prefix
aws s3 ls "s3://$NEW_BUCKET/solutions/$SOLUTION/" --recursive | wc -l
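
Before copying anything, it can help to write the prefix mapping down once and preview each transfer. The sketch below mirrors the three sync commands above. The function name dest_prefix is hypothetical, and the script only prints the commands; --dryrun is the aws s3 sync option that lists what would be copied without copying it.

```shell
NEW_BUCKET="yourco-prod-a001-us-west-1-s3"
SOLUTION="customer-churn"

# Map an old top-level prefix to its destination under the solution folder.
# Unknown prefixes are rejected so that unmapped data is noticed rather
# than silently skipped.
dest_prefix() {
  case "$1" in
    training-data) echo "solutions/$SOLUTION/data/raw/" ;;
    models)        echo "solutions/$SOLUTION/models/training/" ;;
    notebooks)     echo "solutions/$SOLUTION/notebooks/" ;;
    *)             echo "no mapping for prefix: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) a dry-run sync command for each mapped prefix.
for p in training-data models notebooks; do
  echo "aws s3 sync s3://old-bucket/$p/ s3://$NEW_BUCKET/$(dest_prefix "$p") --dryrun"
done
```

Running the printed commands shows the planned copies; removing --dryrun performs them.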

Step 7: Update Applications

Update all applications, scripts, and configurations to reference the new bucket:

# Before
BUCKET = "old-bucket"
DATA_PATH = "training-data/"

# After
BUCKET = "yourco-prod-a001-us-west-1-s3"
DATA_PATH = "solutions/customer-churn/data/raw/"
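
One way to find references that still need updating is a plain text search of your code and configuration directories. This is a minimal sketch: the function name find_bucket_refs and the ./src path are hypothetical. It only finds literal occurrences of the name, so references assembled at runtime will not show up.

```shell
# List files under a directory that still mention the old bucket name.
# -r recurses, -l prints file names only, -F treats the name as a fixed
# string rather than a regular expression.
find_bucket_refs() {
  grep -rlF -- "$1" "$2" || true   # no matches is not an error here
}

# Usage:
#   find_bucket_refs old-bucket ./src
```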

Step 8: Decommission Old Bucket

After all applications are updated and validated:

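# Verify old bucket is no longer accessed (check CloudTrail or S3 access logs)
# Then empty and delete
# Note: if versioning is enabled, `aws s3 rm` only adds delete markers; every
# object version and delete marker must also be removed before `aws s3 rb` succeeds
aws s3 rm s3://old-bucket --recursive
aws s3 rb s3://old-bucket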

Migrating from Terraform

Step 1: Export Terraform State

terraform state show aws_s3_bucket.main
terraform state show aws_s3_bucket_versioning.main
terraform state show aws_s3_bucket_lifecycle_configuration.main
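
The commands above assume the resources are named main. If your addresses differ, or the bucket lives inside a module, the S3-related addresses can be picked out of terraform state list. The filter below is a sketch; the function name s3_addresses is hypothetical.

```shell
# Keep only addresses of aws_s3_bucket* resources (the bucket itself plus
# its versioning, lifecycle, and similar companion resources), including
# those nested in modules.
s3_addresses() {
  grep -E '(^|\.)aws_s3_bucket[a-z_]*\.' || true
}

# Usage (run inside the Terraform working directory):
#   terraform state list | s3_addresses | while read -r addr; do
#     terraform state show "$addr"
#   done
```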

Step 2: Map to YAML Configuration

Terraform                              S3 Provisioner YAML
-------------------------------------  -----------------------
aws_s3_bucket.bucket                   s3.bucket_name_override
aws_s3_bucket_versioning.status        s3.versioning
aws_s3_bucket_lifecycle_configuration  s3.lifecycle_policy
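
The versioning row involves a value translation: Terraform records a status string, while the YAML example in the manual migration uses a boolean. The sketch below shows one way to translate it. The function name is hypothetical, and mapping Suspended to false is an assumption, since a suspended bucket may still hold older object versions that you want to account for.

```shell
# Translate a Terraform versioning status into the s3.versioning line.
# Enabled, Suspended, and Disabled are the status values accepted by the
# aws_s3_bucket_versioning resource.
versioning_to_yaml() {
  case "$1" in
    Enabled)            echo "versioning: true" ;;
    Suspended|Disabled) echo "versioning: false" ;;   # assumption, see above
    *) echo "unknown versioning status: $1" >&2; return 1 ;;
  esac
}

versioning_to_yaml Enabled
```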

Step 3: Deploy and Migrate

Follow Steps 4-8 from the manual migration above.

Step 4: Remove from Terraform State

terraform state rm aws_s3_bucket.main
terraform state rm aws_s3_bucket_versioning.main
terraform state rm aws_s3_bucket_lifecycle_configuration.main

Migrating from AWS CDK

Step 1: Review Synthesized Template

cdk synth > existing-template.yaml

Step 2: Map to YAML Configuration

Review the CloudFormation template and map bucket properties to S3 Provisioner YAML.
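
In a large template, a quick first step is to locate the bucket resources. The sketch below prints the logical IDs of AWS::S3::Bucket resources. It assumes the YAML layout that cdk synth typically emits (logical IDs indented two spaces, Type indented four, unquoted values) and is not a general YAML parser; the function name is hypothetical.

```shell
# Print logical IDs of S3 bucket resources in a synthesized YAML template.
list_bucket_ids() {
  awk '
    # Remember the most recent two-space-indented key (a logical ID).
    /^  [A-Za-z0-9]+:[[:space:]]*$/ { id = $1; sub(/:$/, "", id) }
    # When its Type is an S3 bucket, print the remembered ID.
    /^    Type: AWS::S3::Bucket[[:space:]]*$/ { print id }
  ' "$1"
}

# Usage:
#   list_bucket_ids existing-template.yaml
```

Each ID found this way points to the Properties block whose settings need to be mapped to the s3 section of the YAML configuration.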

Step 3: Deploy and Migrate

Follow Steps 4-8 from the manual migration above.

Migrating Existing Data

Small Datasets (< 100 GB)

Use aws s3 sync for straightforward copying:

aws s3 sync s3://old-bucket/ s3://new-bucket/solutions/customer-churn/ \
  --storage-class STANDARD

Large Datasets (100 GB - 5 TB)

Use S3 Batch Operations, or run aws s3 sync with a higher number of concurrent requests:

# Parallel sync with increased concurrency
aws configure set default.s3.max_concurrent_requests 50
aws s3 sync s3://old-bucket/ s3://new-bucket/solutions/customer-churn/

Very Large Datasets (> 5 TB)

Consider AWS DataSync for managed, high-performance transfer:

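aws datasync create-task \
  --source-location-arn arn:aws:datasync:us-west-1:123456789012:location/loc-old \
  --destination-location-arn arn:aws:datasync:us-west-1:123456789012:location/loc-new

# create-task only defines the task; start the transfer with the returned task ARN:
# aws datasync start-task-execution --task-arn <task-arn>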

Data Validation

After migration, compare object counts and total sizes between the two buckets as a basic integrity check:

# Compare object counts and total size
echo "Old bucket:"
aws s3 ls s3://old-bucket --recursive --summarize | tail -2

echo "New bucket:"
aws s3 ls s3://new-bucket/solutions/customer-churn/ --recursive --summarize | tail -2
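
Comparing the two summaries by eye is error-prone for large numbers, so the comparison can be scripted. The sketch below parses the "Total Objects" and "Total Size" lines that --summarize prints. The function names are hypothetical. Matching totals indicate that nothing was dropped, but they do not prove that object contents are identical.

```shell
# Read `aws s3 ls --summarize` output on stdin and print "<objects> <bytes>".
summary_totals() {
  awk -F': *' '
    /Total Objects:/ { objects = $2 }
    /Total Size:/    { size = $2 }
    END { print objects, size }
  '
}

# Succeed only when two captured summaries report the same totals.
same_totals() {
  [ "$(printf '%s\n' "$1" | summary_totals)" = "$(printf '%s\n' "$2" | summary_totals)" ]
}

# Usage (requires AWS credentials):
#   old=$(aws s3 ls s3://old-bucket --recursive --summarize | tail -2)
#   new=$(aws s3 ls s3://new-bucket/solutions/customer-churn/ --recursive --summarize | tail -2)
#   if same_totals "$old" "$new"; then echo "totals match"; else echo "totals differ"; fi
```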

Rollback Procedures

CloudFormation Automatic Rollback

If deployment fails, CloudFormation automatically rolls back the stack. To see which resources failed and why, query the stack events:

aws cloudformation describe-stack-events \
  --stack-name yourco-prod-a001-us-west-1-s3-stack \
  --query 'StackEvents[?ResourceStatus==`CREATE_FAILED`].[LogicalResourceId,ResourceStatusReason]' \
  --output table

Manual Rollback

If you need to revert after a successful deployment:

# Delete the new bucket and stack
docker run --rm \
  -v ~/.aws:/home/s3user/.aws:ro \
  -v "$(pwd)/s3/configs:/app/configs:ro" \
  -v "$(pwd)/s3/reports:/app/reports" \
  s3-provisioner:latest \
  --config yourco-prod-a001-us-west-1-s3.yaml \
  --action tear-down \
  --force

Your old bucket remains untouched; the S3 Provisioner only manages resources it created.

Application Rollback

If applications were updated to use the new bucket:

  1. Revert application configurations to old bucket name

  2. Verify old bucket data is still intact

  3. Delete new bucket when ready