Migration Guide
Guide for onboarding existing ML infrastructure to the ML Provisioner, and for upgrading between tiers.
Migrating from Manual Setup
The ML Provisioner does not import or manage existing AWS resources. Migration from a manually provisioned ML setup means standing up a new ML Provisioner-managed stack alongside the existing infrastructure, validating it, and cutting over.
Step 1: Inventory Existing Resources
Collect the data elements needed to populate a configuration file. For each environment and scenario you intend to migrate:
AWS account ID and region
VPC ID, subnet IDs, and security group ID (enterprise tier)
SSM Parameter Store paths, if VPC integration uses vpc_source: ssm
Use case and workload names, which drive the ml_name and all resource names
Source control preference: CodeCommit or S3
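The account, region, and network values in this list can usually be read with standard AWS CLI calls. The following is a minimal sketch, not part of the ML Provisioner: the function names `inventory_vpc` and `read_ssm_value` are hypothetical, and the VPC ID and SSM path are values you supply.

```shell
# Sketch only: standard AWS CLI calls, wrapped in functions so nothing runs on load.
# Function names are hypothetical; the VPC ID and SSM path are values you supply.

# Print account, region, subnets, and security groups for one VPC.
inventory_vpc() {
  vpc_id=$1
  account_id=$(aws sts get-caller-identity --query Account --output text) || return 1
  region=$(aws configure get region) || return 1
  subnet_ids=$(aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$vpc_id" \
    --query 'Subnets[].SubnetId' --output text) || return 1
  sg_ids=$(aws ec2 describe-security-groups \
    --filters "Name=vpc-id,Values=$vpc_id" \
    --query 'SecurityGroups[].GroupId' --output text) || return 1
  printf 'account_id: %s\nregion: %s\nvpc_id: %s\nsubnet_ids: %s\nsecurity_group_ids: %s\n' \
    "$account_id" "$region" "$vpc_id" "$subnet_ids" "$sg_ids"
}

# Read one value from SSM Parameter Store (for vpc_source: ssm setups).
read_ssm_value() {
  aws ssm get-parameter --name "$1" --query 'Parameter.Value' --output text
}
```

Call `inventory_vpc <vpc-id>` once per environment and record the output next to the use case, workload, and source control choices, which have to be gathered by hand.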
Step 2: Create Configuration Files
Create one configuration file per environment and scenario. Use the naming convention described in Naming Conventions:
{company_prefix}-{env}-{tenant_id}-{region}-{use_case}-{workload}-ml-{scenario}.yaml
See Configuration Reference and Configuration Guide for all available fields.
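To keep filenames consistent across environments, the convention can be assembled from its parts rather than typed by hand. This is a small sketch; `config_filename` is a hypothetical helper and the argument values are illustrative only.

```shell
# Build a config filename from the naming-convention fields.
# Hypothetical helper; argument order follows the pattern in this guide.
config_filename() {
  if [ "$#" -ne 7 ]; then
    echo "usage: config_filename PREFIX ENV TENANT REGION USE_CASE WORKLOAD SCENARIO" >&2
    return 1
  fi
  printf '%s-%s-%s-%s-%s-%s-ml-%s.yaml\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# Illustrative values only, not taken from a real deployment.
config_filename acme dev t001 us-east-1 churn training codecommit-sgprov-ssm
# prints: acme-dev-t001-us-east-1-churn-training-ml-codecommit-sgprov-ssm.yaml
```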
Step 3: Validate and Test
CONFIG=globalbank-prod-c001-us-west-2-demand-forecasting-ml-codecommit-sgprov-ssm.yaml
IMAGE=enterprise-1.0.0
# Validate configuration
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act validate-config
# Generate the CloudFormation template
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/templates:/app/templates" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act create-prov-template
# Generate the review report for the template
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/templates:/app/templates" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act create-review-report
# Test deployment: catches naming conflicts and permission issues
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act test-deploy
test-deploy creates an isolated stack with a random suffix. It will fail if there are
naming conflicts with existing resources, misconfigured VPC references, or IAM permission
gaps. Delete the test stack when done:
aws cloudformation delete-stack --stack-name <test-stack-name> --region us-west-2
aws cloudformation wait stack-delete-complete --stack-name <test-stack-name> --region us-west-2
Step 4: Deploy
Once test-deploy passes, follow the full provisioning sequence in User Guide to
deploy the production stack.
Step 5: Cutover
Cutover is outside the scope of the ML Provisioner. The MLOps team is responsible for:
Migrating pipeline definitions and model registry entries to the new stack
Updating downstream services to consume the new SSM Parameter Store paths
Decommissioning old resources once the new stack is fully validated
Upgrading Tiers
What Each Tier Adds
Tiers are additive. Each higher tier includes everything from the lower tier plus additional resources:
| Resource | Starter | Professional | Enterprise |
|---|---|---|---|
| SageMaker Model Registry | ✓ | ✓ | ✓ |
| CodeCommit Repositories (×2) | ✓ | ✓ | ✓ |
| CodeBuild Projects (×2) | ✓ | ✓ | ✓ |
| CodePipeline Pipelines (×2) | ✓ | ✓ | ✓ |
| IAM Roles (×3) | ✓ | ✓ | ✓ |
| S3 Artifacts Bucket | ✓ | ✓ | ✓ |
| EventBridge Rule (model approval trigger) | | ✓ | ✓ |
| CloudWatch Dashboard | | ✓ | ✓ |
| IAM Managed Policies (×2) | | ✓ | ✓ |
| KMS Key + Alias | | | ✓ |
| IAM Permission Boundary | | | ✓ |
| CloudWatch Log Group (compliance) | | | ✓ |
| CloudWatch Alarms (×2) | | | ✓ |
| SNS Topic + Subscription (alerts) | | | ✓ |
| VPC Endpoints (×4) | | | ✓ |
| Endpoint Security Group | | | ✓ (standalone mode) |
Tier Upgrade Procedure
Upgrading tiers requires deploying a new stack from the higher-tier Docker image. The existing lower-tier stack continues to run in parallel until cutover is complete.
Step 1: Create a new configuration file for the higher tier
Start from the existing lower-tier config as a base. Copy it and update the filename to reflect the new tier context:
cp ml/configs/globalbank-prod-c001-us-west-2-demand-forecasting-ml-codecommit-sgprov-ssm.yaml \
ml/configs/globalbank-prod-c001-us-west-2-demand-forecasting-ent-ml-codecommit-sgprov-ssm.yaml
Step 2: Update the new config with tier-specific fields
The higher-tier Docker image requires additional config fields. For enterprise tier, add:
vpc_integration:
  mode: sg-provisioner # or standalone
  vpc_source: ssm # or direct
  vpc_parameter_store_path: /vpc/globalbank-prod/VpcId
  subnet_parameter_store_path: /vpc/globalbank-prod/SubnetIds
  sg_parameter_store_path: /sg/globalbank-prod/SecurityGroupId # sg-provisioner mode only
  route_table_ids: [] # required for S3 Gateway endpoint
compliance:
  log_retention_days: 90
alerts:
  alerts_email: mlops-alerts@globalbank.com
See Configuration Reference for the full field reference per tier.
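Before running validate-config, a quick local check can confirm that the copied file actually gained the enterprise blocks listed above. This is a rough sketch that assumes `vpc_integration`, `compliance`, and `alerts` are top-level keys; `missing_enterprise_blocks` is a hypothetical helper, and it is no substitute for the provisioner's own validation.

```shell
# Report which enterprise blocks are absent from a config file.
# Assumption: the three blocks are top-level YAML keys (start at column 0).
missing_enterprise_blocks() {
  config_file=$1
  missing=""
  for key in vpc_integration compliance alerts; do
    grep -q "^${key}:" "$config_file" || missing="$missing $key"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

For example, `missing_enterprise_blocks ml/configs/<new-config>.yaml` prints `ok` when all three blocks are present, or a `missing:` line naming the absent ones.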
Step 3: Set a different workload value to avoid naming collisions
The ml_name is derived from config fields including workload. Since both the old and
new stacks will coexist in the same AWS account and region, they must have different
ml_name values to avoid resource naming conflicts. Use workload as the differentiator:
# Lower-tier config (existing)
workload: demand-forecasting
# Higher-tier config (new)
workload: demand-forecasting-ent
This produces distinct ml_name values:
globalbank-prod-c001-us-west-2-demand-forecasting-ml (existing stack)
globalbank-prod-c001-us-west-2-demand-forecasting-ent-ml (new stack)
And distinct SSM paths:
/ml/globalbank-prod-c001-us-west-2-demand-forecasting-ml/...
/ml/globalbank-prod-c001-us-west-2-demand-forecasting-ent-ml/...
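Because only the ml_name segment of the prefix changes, downstream consumers can derive the new path from the old one mechanically. The sketch below assumes parameter names under the prefix are the same in both stacks, which this guide does not confirm; `remap_ssm_path` and the `example-param` suffix are hypothetical.

```shell
# Rewrite an SSM path under the old ml_name to the same path under the new one.
# Assumption: parameter names below /ml/<ml_name>/ are unchanged between stacks.
remap_ssm_path() {
  old_name=$1 new_name=$2 path=$3
  case "$path" in
    "/ml/$old_name/"*)
      printf '/ml/%s/%s\n' "$new_name" "${path#"/ml/$old_name/"}" ;;
    *)
      echo "not under /ml/$old_name/: $path" >&2
      return 1 ;;
  esac
}

# Hypothetical parameter name, for illustration only.
remap_ssm_path \
  globalbank-prod-c001-us-west-2-demand-forecasting-ml \
  globalbank-prod-c001-us-west-2-demand-forecasting-ent-ml \
  /ml/globalbank-prod-c001-us-west-2-demand-forecasting-ml/example-param
```

The trailing slash in the match keeps a path that already belongs to the `-ent-ml` stack from being rewritten a second time.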
Step 4: Validate and test
CONFIG=globalbank-prod-c001-us-west-2-demand-forecasting-ent-ml-codecommit-sgprov-ssm.yaml
IMAGE=enterprise-1.0.0
# Validate configuration
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act validate-config
# Test deployment
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${IMAGE} -con ${CONFIG} -act test-deploy
Step 5: Deploy the higher-tier stack
Follow the full provisioning sequence in User Guide using the new config file and the higher-tier Docker image.
Avoiding Naming Collisions
The ml_name is constructed from:
{company_prefix}-{environment}-{tenant_id}-{region}-{use_case}-{workload}-ml
Tier is not part of the name. Two stacks with identical config fields but different tiers
will have the same ml_name and collide. Always set a distinct workload value in the
higher-tier config when running stacks in parallel.
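A collision of this kind is easy to catch before any deployment by computing both names and comparing them. The following sketch rebuilds the ml_name from the pattern above; `ml_name` and `assert_distinct` are hypothetical helpers and the field values are illustrative.

```shell
# Compose an ml_name from its six fields, following the pattern in this guide.
ml_name() {
  printf '%s-%s-%s-%s-%s-%s-ml\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Succeed only when the two names differ; tier is deliberately not an input.
assert_distinct() {
  if [ "$1" = "$2" ]; then
    echo "collision: both stacks resolve to $1" >&2
    return 1
  fi
  echo "distinct: $1 vs $2"
}

# Illustrative values: only the workload field differs between the two configs.
lower=$(ml_name acme dev t001 us-east-1 churn training)
higher=$(ml_name acme dev t001 us-east-1 churn training-ent)
assert_distinct "$lower" "$higher"
```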
After Cutover
Once the higher-tier stack is fully validated and all downstream services have been updated to consume its SSM paths, decommission the lower-tier stack:
OLD_CONFIG=globalbank-prod-c001-us-west-2-demand-forecasting-ml-codecommit-sgprov-ssm.yaml
OLD_IMAGE=starter-1.0.0
# Delete the lower-tier stack
docker run --rm \
  -v ~/.aws:/home/mluser/.aws:ro \
  -v "$(pwd)/ml/configs:/app/configs:ro" \
  -v "$(pwd)/ml/reports:/app/reports" \
  ml-provisioner:${OLD_IMAGE} -con ${OLD_CONFIG} -act delete-product --force
Warning: Verify no downstream services are still consuming the old stack's SSM parameters before running delete-product. See Update Procedures for the full implications of stack deletion.
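One way to support that verification is to list the parameter names under the old stack's prefix and search downstream configuration for them. This is a sketch using the standard AWS CLI, assuming the `/ml/<ml_name>/` prefix shown earlier; `list_old_parameters` is a hypothetical helper. It shows what exists, not who reads it.

```shell
# List SSM parameter names under the old stack's prefix.
# Assumption: the stack publishes its parameters under /ml/<ml_name>/.
list_old_parameters() {
  name=$1 region=$2
  aws ssm get-parameters-by-path --path "/ml/$name/" --recursive \
    --region "$region" --query 'Parameters[].Name' --output text
}
```

For example, `list_old_parameters globalbank-prod-c001-us-west-2-demand-forecasting-ml us-west-2` prints the names to search for in downstream services before the old stack is deleted.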
Region Migration
Deploying to a different AWS region is a new deployment, not a migration. The region is
part of the ml_name and all resource names, so there are no naming collisions with an
existing stack in another region.
Create a new configuration file for the target region, follow the full provisioning sequence in User Guide, and manage cutover independently.