Configuration Guide


Overview

This guide is your starting point for deploying the ML Provisioner. It walks you through selecting the correct configuration file for your environment and verifying that all prerequisite infrastructure is in place before you run a single Docker command.

Use this guide when:

  • You are deploying the ML Provisioner for the first time

  • You are onboarding a new client or environment

  • You are unsure which of the 20 configuration files matches your setup

What you will end up with:

  • The correct config file identified and populated for your scenario

  • Confirmation that all prerequisite infrastructure is in place

  • Readiness to run the full deployment sequence in USER_GUIDE.md

The decision tree has 4 steps:

  1. Select your tier (starter / professional / enterprise)

  2. Verify prerequisites and select your config file

  3. Populate the config file with your values

  4. Validate and deploy

If you already know your config file and prerequisites are in place, skip to Step 3 - Populate your config file.


Step 1 - Which tier have you purchased?


Step 2A - Starter

No VPC integration. No prerequisite infrastructure required.

What you get: SageMaker Model Registry, CodeCommit repositories (model-build, model-deploy) or S3 source, CodeBuild projects (build, deploy), CodePipeline pipelines (build-pipeline, deploy-pipeline), IAM roles, SSM Parameter Store outputs. Ideal for small teams and proof-of-concept projects.

What is your source control?

| Answer | Config file |
|---|---|
| CodeCommit (no workload) | {prefix}-codecommit.yaml |
| CodeCommit (with workload) | {prefix}-codecommit-workload.yaml |
| S3 (no workload) | {prefix}-s3.yaml |
| S3 (with workload) | {prefix}-s3-workload.yaml |

{prefix} = {company}-{env}-{tenant}-{region}-{use_case}-ml

Workload variant: use this when you need multiple ML products for the same use case in the same environment (e.g. realtime vs batch). The workload field is appended to ml_name to keep resources unique.
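
The naming rule above can be sketched as a small shell function. This is an illustration only, not part of the ML Provisioner: the function name config_filename and its argument order are hypothetical, and it assumes {company} in the prefix is the lowercase company_prefix value.

```shell
# Hypothetical helper: build a Starter/Professional config filename.
# Arguments: company env tenant region use_case source_control [workload]
#   source_control: codecommit | s3
#   workload:       pass "yes" to select the workload variant (default: no)
config_filename() {
  prefix="$1-$2-$3-$4-$5-ml"
  case "$6" in
    codecommit|s3) ;;
    *) echo "unknown source control: $6" >&2; return 1 ;;
  esac
  if [ "${7:-no}" = "yes" ]; then
    printf '%s-%s-workload.yaml\n' "$prefix" "$6"
  else
    printf '%s-%s.yaml\n' "$prefix" "$6"
  fi
}

# Sample values taken from the Starter example config in this guide
config_filename techcorp prod a001 us-west-2 customer-churn codecommit
config_filename techcorp prod a001 us-west-2 customer-churn s3 yes
```

The same function covers the Professional tier, which uses the same four filenames.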

Image: ml-provisioner:starter

➑️ Step 3 β€” Populate your config file


Step 2B - Professional

No VPC integration. No prerequisite infrastructure required.

What you get: All Starter resources plus S3 artifacts bucket, EventBridge rule for automated pipeline triggers, CloudWatch dashboard, and IAM managed policies (build, deploy). Production-ready with enhanced monitoring and event-driven automation. Ideal for production workloads that don't require VPC isolation.

What is your source control?

| Answer | Config file |
|---|---|
| CodeCommit (no workload) | {prefix}-codecommit.yaml |
| CodeCommit (with workload) | {prefix}-codecommit-workload.yaml |
| S3 (no workload) | {prefix}-s3.yaml |
| S3 (with workload) | {prefix}-s3-workload.yaml |

{prefix} = {company}-{env}-{tenant}-{region}-{use_case}-ml

Image: ml-provisioner:professional

➑️ Step 3 β€” Populate your config file


Step 2C - Enterprise Prerequisites

What you get: All Professional resources plus KMS encryption, VPC endpoints (SageMaker API, SageMaker Runtime, S3, STS), EC2 Security Group (standalone mode), CloudWatch compliance log group, metric filters, CloudWatch alarms, SNS topic and subscription for security alerts, and IAM permission boundary policy. Designed for regulated industries and enterprise workloads requiring VPC isolation, encryption at rest, and compliance monitoring.

Enterprise tier requires a VPC. Before selecting a config file, verify the following.

Prerequisite Check 1 - VPC

Is your VPC deployed and available?

VPC_NAME={your-vpc-name}       # e.g. globalbank-prod-c001-us-west-2-vpc
VPC_STACK_NAME=${VPC_NAME}-stack
AWS_REGION={your-aws-region}   # e.g. us-west-2

aws cloudformation describe-stacks \
  --stack-name ${VPC_STACK_NAME} \
  --region ${AWS_REGION} \
  --query 'Stacks[0].StackStatus' \
  --output text

| Answer | Action |
|---|---|
| ✅ Yes: VPC exists and is available | Continue to Prerequisite Check 2 |
| ❌ No: VPC does not exist | STOP. Deploy your VPC using vpc-provisioner before continuing. |
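
If you want to script this check, the status string returned by describe-stacks can be classified as in the sketch below. The function name stack_is_available is hypothetical, and treating UPDATE_COMPLETE as "available" alongside CREATE_COMPLETE is an assumption; this guide does not list which statuses the provisioner accepts.

```shell
# Hypothetical helper: succeed when a CloudFormation stack status
# looks usable. Assumption: CREATE_COMPLETE and UPDATE_COMPLETE both
# count as "available"; every other status is treated as not ready.
stack_is_available() {
  case "$1" in
    CREATE_COMPLETE|UPDATE_COMPLETE) return 0 ;;
    *) return 1 ;;
  esac
}

# In practice, capture the output of the describe-stacks command above:
#   status=$(aws cloudformation describe-stacks ... --output text)
status="CREATE_COMPLETE"   # sample value for illustration
if stack_is_available "$status"; then
  echo "VPC stack ready: continue to Prerequisite Check 2"
else
  echo "STOP: deploy the VPC with vpc-provisioner first"
fi
```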

Prerequisite Check 2 - VPC Source

How will you supply VPC ID and subnet IDs to the config?

| Answer | vpc_source value | What to verify |
|---|---|---|
| From SSM Parameter Store (vpc-provisioner populated them) | parameter-store | Verify the SSM paths exist (see below) |
| Hardcoded directly in the config file | direct | You have the VPC ID and subnet IDs ready |

If using SSM (parameter-store): verify the paths exist:

VPC_NAME={your-vpc-name}       # e.g. globalbank-prod-c001-us-west-2-vpc
VPC_STACK_NAME=${VPC_NAME}-stack
AWS_REGION={your-aws-region}   # e.g. us-west-2

aws ssm get-parameters-by-path \
  --path /vpc/${VPC_NAME}/ \
  --region ${AWS_REGION} \
  --query 'Parameters[*].Name' \
  --output table

Expected output must include:

  • /vpc/${VPC_NAME}/VPCId

  • /vpc/${VPC_NAME}/PrivateSubnetIds

| Answer | Action |
|---|---|
| ✅ Both parameters present | Continue to Prerequisite Check 3 |
| ❌ Parameters missing | STOP. Run vpc-provisioner to populate SSM, or switch to vpc_source: direct. |
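
The "both parameters present" test can also be automated. The sketch below assumes you have the parameter names as a newline-separated list; with --output text the AWS CLI prints them tab-separated on one line, so pipe that output through tr '\t' '\n' first. The function name has_vpc_params is hypothetical.

```shell
# Hypothetical helper: confirm that both required SSM parameter names
# appear in a newline-separated list of names.
# Arguments: vpc_name names
has_vpc_params() {
  for key in VPCId PrivateSubnetIds; do
    # -x matches whole lines only, -F treats the path as a fixed string
    printf '%s\n' "$2" | grep -qxF "/vpc/$1/$key" || return 1
  done
}

# Sample data for illustration; replace with real CLI output
vpc_name="globalbank-prod-c001-us-west-2-vpc"
names="/vpc/$vpc_name/VPCId
/vpc/$vpc_name/PrivateSubnetIds"

if has_vpc_params "$vpc_name" "$names"; then
  echo "Both parameters present: continue to Prerequisite Check 3"
else
  echo "STOP: parameters missing"
fi
```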

Prerequisite Check 3 - VPC Mode (SG Provisioner)

Who manages the Security Group for your VPC endpoints?

| Answer | vpc_mode value | Action |
|---|---|---|
| SG Provisioner manages it | sg-provisioner | Verify SG stack exists (see below) |
| ML Provisioner creates its own | standalone | No check needed; continue to Step 2D |

If using sg-provisioner: verify the SG stack and SSM params exist:

SGPROV_NAME={your-sgprov-name}          # e.g. globalbank-prod-c001-us-west-2-sg
SGPROV_STACK_NAME=${SGPROV_NAME}-stack  # e.g. globalbank-prod-c001-us-west-2-sg-stack
AWS_REGION={your-aws-region}            # e.g. us-west-2

aws cloudformation describe-stacks \
  --stack-name ${SGPROV_STACK_NAME} \
  --region ${AWS_REGION} \
  --query 'Stacks[0].StackStatus' \
  --output text

aws ssm get-parameters-by-path \
  --path /sg/${SGPROV_NAME}/ \
  --region ${AWS_REGION} \
  --query 'Parameters[*].Name' \
  --output table

| Answer | Action |
|---|---|
| ✅ SG stack is CREATE_COMPLETE and SSM params exist | Continue to Step 2D |
| ❌ SG stack does not exist or SSM params missing | STOP. Deploy your SG stack using sg-provisioner before continuing. |
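
Prerequisite Check 3 reduces to a small decision rule, sketched below. The function name sg_check_action is hypothetical. This guide does not name the individual SSM parameters under /sg/, so the sketch assumes that at least one parameter returned means "SSM params exist".

```shell
# Hypothetical helper: decide the next action for Prerequisite Check 3.
# Arguments: vpc_mode [stack_status] [ssm_param_count]
# Assumption: one or more parameters under /sg/<name>/ counts as
# "SSM params exist"; the guide does not list the expected names.
sg_check_action() {
  case "$1" in
    standalone)
      echo "continue" ;;
    sg-provisioner)
      if [ "$2" = "CREATE_COMPLETE" ] && [ "${3:-0}" -gt 0 ]; then
        echo "continue"
      else
        echo "stop"
      fi ;;
    *)
      echo "unknown vpc_mode: $1" >&2
      return 1 ;;
  esac
}

sg_check_action standalone
sg_check_action sg-provisioner CREATE_COMPLETE 3
sg_check_action sg-provisioner ROLLBACK_COMPLETE 0
```

A result of "continue" means proceed to Step 2D; "stop" means deploy the SG stack with sg-provisioner first.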


Step 2D - Enterprise Config Selection

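Answer the following five questions, then find your config file in the table below. The short names sgprov and ssm are used in config filenames; the corresponding config values are sg-provisioner and parameter-store.

| Question | Options |
|---|---|
| Source control | codecommit / s3 |
| VPC mode | sgprov (sg-provisioner) / standalone |
| VPC source | ssm (parameter-store) / direct |
| Route table IDs needed? | no / yes (only for direct vpc_source) |
| Workload variant? | no / yes |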

Enterprise Config Matrix

| Source control | VPC mode | VPC source | RTB (route table IDs) | Workload | Config file suffix |
|---|---|---|---|---|---|
| codecommit | sgprov | ssm | n/a | no | -codecommit-sgprov-ssm.yaml |
| codecommit | sgprov | direct | n/a | no | -codecommit-sgprov-direct.yaml |
| codecommit | standalone | ssm | n/a | no | -codecommit-standalone-ssm.yaml |
| codecommit | standalone | ssm | n/a | yes | -codecommit-standalone-ssm-workload.yaml |
| codecommit | standalone | direct | no | no | -codecommit-standalone-direct.yaml |
| codecommit | standalone | direct | yes | no | -codecommit-standalone-direct-rtb.yaml |
| s3 | sgprov | ssm | n/a | no | -s3-sgprov-ssm.yaml |
| s3 | sgprov | direct | n/a | no | -s3-sgprov-direct.yaml |
| s3 | standalone | ssm | n/a | no | -s3-standalone-ssm.yaml |
| s3 | standalone | ssm | n/a | yes | -s3-standalone-ssm-workload.yaml |
| s3 | standalone | direct | no | no | -s3-standalone-direct.yaml |
| s3 | standalone | direct | yes | no | -s3-standalone-direct-rtb.yaml |

RTB (route_table_ids): only relevant for vpc_source: direct. Set to yes if you want the S3 Gateway VPC endpoint route associations configured automatically at deploy time. If no, your networking team manages route table associations manually.

Workload variant: the matrix offers the workload variant only for standalone mode with vpc_source: parameter-store (ssm in the filenames). Combining RTB and workload is not currently supported.

Full config filename = {company}-{env}-{tenant}-{region}-{use_case}-ml + suffix above.
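
The matrix lookup can be sketched as a function that returns the suffix or fails for a combination the matrix does not list. The function name enterprise_suffix and its argument order are hypothetical; the accepted combinations are taken from the twelve rows in the matrix.

```shell
# Hypothetical helper: map the five Step 2D answers to a config suffix.
# Arguments: source_control vpc_mode vpc_source [rtb] [workload]
#   source_control: codecommit | s3
#   vpc_mode:       sgprov | standalone
#   vpc_source:     ssm | direct
#   rtb, workload:  yes | no (default: no)
# Returns non-zero for combinations not listed in the matrix.
enterprise_suffix() {
  case "$1" in codecommit|s3) ;; *) return 1 ;; esac
  case "$2" in sgprov|standalone) ;; *) return 1 ;; esac
  case "$3" in ssm|direct) ;; *) return 1 ;; esac
  suffix="-$1-$2-$3"
  if [ "${4:-no}" = "yes" ]; then
    # The matrix lists -rtb only for standalone + direct
    [ "$2" = "standalone" ] && [ "$3" = "direct" ] || return 1
    suffix="$suffix-rtb"
  fi
  if [ "${5:-no}" = "yes" ]; then
    # The matrix lists -workload only for standalone + ssm
    [ "$2" = "standalone" ] && [ "$3" = "ssm" ] || return 1
    suffix="$suffix-workload"
  fi
  printf '%s.yaml\n' "$suffix"
}

enterprise_suffix codecommit standalone direct yes no
enterprise_suffix s3 standalone ssm no yes
```

Prepend the {company}-{env}-{tenant}-{region}-{use_case}-ml prefix to the result to get the full filename.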

Image: ml-provisioner:enterprise

➑️ Step 3 β€” Populate your config file


Step 3 - Populate your config file

Copy the matching example config from ml/configs/ and update the fields with your values. Full example configs per tier are shown below.

Starter Example

client:
  company_name: TechCorp
  company_prefix: techcorp
  account_id: "123456789012"
  tenant_id: "a001"

environment:
  env: prod
  region: us-west-2

ml_product:
  use_case: customer-churn
  tier: starter
  source_control: codecommit   # or s3
  product_name_override: ""    # leave empty to auto-generate ml_name
  workload: ""                 # leave empty unless using workload variant

tags:
  cost_center: ML Platform
  project: Customer Churn Prediction
  owner: ml-engineering-team

Professional Example

client:
  company_name: Edge Analytics Corp
  company_prefix: edge
  account_id: "123456789012"
  tenant_id: "b001"

environment:
  env: prod
  region: us-west-2

ml_product:
  use_case: fraud-detection
  tier: professional
  source_control: codecommit   # or s3
  product_name_override: ""
  workload: ""                 # leave empty unless using workload variant
  log_retention_days: 90       # optional - minimum 90

tags:
  cost_center: Fraud Operations
  project: Real-time Fraud Detection System
  owner: fraud-ml-engineering-team

Enterprise Example (standalone + SSM)

client:
  company_name: Global Bank
  company_prefix: globalbank
  account_id: "123456789012"
  tenant_id: "c001"

environment:
  env: prod
  region: us-west-2

ml_product:
  use_case: demand-forecasting
  tier: enterprise
  source_control: codecommit   # or s3
  alerts_email: ml-alerts@globalbank.com
  product_name_override: ""
  workload: ""                 # leave empty unless using workload variant
  log_retention_days: 365      # optional - minimum 90, increase for compliance (PCI-DSS/SOC2: 365, HIPAA: 2190)
  vpc_integration:
    mode: standalone           # or sg-provisioner
    vpc_source: parameter-store
    vpc_parameter_store_path: /vpc/globalbank-prod-c001-us-west-2-vpc/VPCId
    subnet_parameter_store_path: /vpc/globalbank-prod-c001-us-west-2-vpc/PrivateSubnetIds

tags:
  cost_center: Risk Management
  project: Demand Forecasting Platform
  owner: ml-platform-team
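
The comments on log_retention_days in the Professional and Enterprise examples state a minimum of 90. A pre-flight check for that rule might look like the sketch below. The function name retention_ok is hypothetical, and the check covers only the 90-day minimum stated in this guide, not any other limits the provisioner or CloudWatch may enforce.

```shell
# Hypothetical helper: succeed when a log_retention_days value is a
# whole number of at least 90 (the minimum stated in the examples).
retention_ok() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 90 ]
}

for days in 30 90 365; do
  if retention_ok "$days"; then
    echo "$days: ok"
  else
    echo "$days: below the 90-day minimum"
  fi
done
```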

Field Reference

All Tiers

| Field | Location | Description |
|---|---|---|
| company_name | client.company_name | Your company display name |
| company_prefix | client.company_prefix | Short lowercase prefix used in resource names |
| account_id | client.account_id | AWS account ID (12 digits) |
| tenant_id | client.tenant_id | Tenant identifier (e.g. c001) |
| env | environment.env | Environment name (e.g. prod, dev) |
| region | environment.region | AWS region (e.g. us-west-2) |
| use_case | ml_product.use_case | ML use case name; used in resource naming only |
| tier | ml_product.tier | Must match your purchased tier |
| source_control | ml_product.source_control | codecommit or s3 |
| workload | ml_product.workload | Leave empty "" unless using workload variant |
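
A few of these constraints are easy to check before deployment. The sketch below validates three fields using only the rules stated in the table; the function name check_common_fields and its argument order are hypothetical, and the values are passed as plain arguments rather than parsed from the YAML file.

```shell
# Hypothetical helper: sanity-check three common config fields.
# Arguments: account_id tier source_control
check_common_fields() {
  # account_id must be exactly 12 digits
  printf '%s\n' "$1" | grep -Eq '^[0-9]{12}$' || {
    echo "account_id must be 12 digits" >&2; return 1; }
  # tier must be one of the three tiers described in this guide
  case "$2" in
    starter|professional|enterprise) ;;
    *) echo "unknown tier: $2" >&2; return 1 ;;
  esac
  # source_control must be codecommit or s3
  case "$3" in
    codecommit|s3) ;;
    *) echo "unknown source_control: $3" >&2; return 1 ;;
  esac
  return 0
}

check_common_fields 123456789012 starter codecommit && echo "fields look valid"
```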

Enterprise Only - alerts

| Field | Location | Description |
|---|---|---|
| alerts_email | ml_product.alerts_email | Email for SNS security alerts; enterprise tier only |

Enterprise Only - vpc_source: parameter-store

| Field | Description |
|---|---|
| vpc_parameter_store_path | SSM path for VPC ID (e.g. /vpc/{name}/VPCId) |
| subnet_parameter_store_path | SSM path for subnet IDs (e.g. /vpc/{name}/PrivateSubnetIds) |

Enterprise Only - vpc_source: direct

| Field | Description |
|---|---|
| vpc_id | VPC ID (e.g. vpc-0abc1234) |
| subnet_ids | List of private subnet IDs |
| route_table_ids | List of route table IDs; leave [] if not needed |
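
When hardcoding IDs with vpc_source: direct, a format check can catch copy-and-paste mistakes early. The sketch below assumes the standard AWS resource ID shape: a resource prefix, a hyphen, then 8 or 17 lowercase hexadecimal characters. The function name looks_like_aws_id is hypothetical, and the check does not confirm that the resource exists in your account.

```shell
# Hypothetical helper: check that a value has the shape of an AWS
# resource ID (prefix, hyphen, then 8 or 17 lowercase hex characters).
# Arguments: prefix value   (prefix is e.g. vpc, subnet, rtb)
# This is a format check only; it does not query AWS.
looks_like_aws_id() {
  printf '%s\n' "$2" | grep -Eq "^$1-([0-9a-f]{8}|[0-9a-f]{17})\$"
}

if looks_like_aws_id vpc vpc-0abc1234; then
  echo "vpc_id format ok"
fi
if ! looks_like_aws_id subnet vpc-0abc1234; then
  echo "not a subnet ID"
fi
```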

S3 Source Control Only

| Field | Description |
|---|---|
| s3_prefix | S3 bucket/prefix path for pipeline source artifacts |


Step 4 - Validate and deploy

Once your config is populated, refer to USER_GUIDE.md for the complete 12-step deployment sequence.


Quick Reference - Config File by Scenario

Starter (ml-provisioner:starter)

| Scenario | Config file |
|---|---|
| CodeCommit | {prefix}-codecommit.yaml |
| CodeCommit + workload | {prefix}-codecommit-workload.yaml |
| S3 | {prefix}-s3.yaml |
| S3 + workload | {prefix}-s3-workload.yaml |

Professional (ml-provisioner:professional)

| Scenario | Config file |
|---|---|
| CodeCommit | {prefix}-codecommit.yaml |
| CodeCommit + workload | {prefix}-codecommit-workload.yaml |
| S3 | {prefix}-s3.yaml |
| S3 + workload | {prefix}-s3-workload.yaml |

Enterprise (ml-provisioner:enterprise)

| Scenario | Config file |
|---|---|
| CodeCommit + sgprov + SSM | {prefix}-codecommit-sgprov-ssm.yaml |
| CodeCommit + sgprov + direct | {prefix}-codecommit-sgprov-direct.yaml |
| CodeCommit + standalone + SSM | {prefix}-codecommit-standalone-ssm.yaml |
| CodeCommit + standalone + SSM + workload | {prefix}-codecommit-standalone-ssm-workload.yaml |
| CodeCommit + standalone + direct | {prefix}-codecommit-standalone-direct.yaml |
| CodeCommit + standalone + direct + rtb | {prefix}-codecommit-standalone-direct-rtb.yaml |
| S3 + sgprov + SSM | {prefix}-s3-sgprov-ssm.yaml |
| S3 + sgprov + direct | {prefix}-s3-sgprov-direct.yaml |
| S3 + standalone + SSM | {prefix}-s3-standalone-ssm.yaml |
| S3 + standalone + SSM + workload | {prefix}-s3-standalone-ssm-workload.yaml |
| S3 + standalone + direct | {prefix}-s3-standalone-direct.yaml |
| S3 + standalone + direct + rtb | {prefix}-s3-standalone-direct-rtb.yaml |

{prefix} = {company}-{env}-{tenant}-{region}-{use_case}-ml