Jenkins shared library and CDK constructs for AWS infrastructure. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ECS Cluster Deployment
Deploy production-ready ECS clusters with AWS CDK.
Features
- EC2 Capacity Provider with managed scaling (replaces custom SchedulableContainers metric)
- Mixed Instances Policy for Spot support (replaces Autospotting)
- Launch Templates with IMDSv2 and gp3 EBS volumes
- Instance Draining via lifecycle hooks for graceful task migration
- Optional Fargate capacity providers for serverless workloads
- Internal/External ALBs with HTTPS support
- Container Insights for monitoring
- Automatic instance refresh via max instance lifetime
Quick Start
Minimal Jenkinsfile - Using CloudFormation Imports
Minimal props: for VPC networking, only vpcStackName is required. All VPC details are imported automatically from the VPC stack's exports.
@Library(["spicy-automation@main"]) _
spicyECSCluster(
jenkinsAwsCredentialsId: "aws-credentials",
region: "ca-central-1",
stackName: "my-ecs-cluster",
vpcStackName: "my-vpc", // Auto-imports ALL VPC details (VPC ID, CIDR, subnets, AZs)
ownerTag: "MyTeam",
productTag: "my-product",
componentTag: "ecs-cluster",
environment: "dev"
)
What auto-imports from VPC stack:
- VPC ID from ${vpcStackName}-VPCID
- VPC CIDR from ${vpcStackName}-VPCCIDR
- Number of AZs from ${vpcStackName}-NumberOfAZs
- Private subnet IDs from ${vpcStackName}-PrivateSubnetA1ID, ${vpcStackName}-PrivateSubnetB1ID, etc.
- Public subnet IDs from ${vpcStackName}-PublicSubnetAID, ${vpcStackName}-PublicSubnetBID, etc. (if createExternalLoadBalancer: true)
- Availability zones auto-derived from region and number of AZs
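The naming pattern in the list above can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, and the assumption that subnets continue with letters C, D, and so on after A and B is inferred from the "etc." in the list, not confirmed by the source.

```shell
# Hypothetical helpers: print the export names the cluster stack would look up
# for a given VPC stack. Only the naming pattern comes from the README.

# Print the first N availability-zone letters (assumed sequence A, B, C, ...).
az_letters() {
  local count="$1" i=0 letter
  for letter in A B C D E F; do
    [ "$i" -lt "$count" ] || break
    echo "$letter"
    i=$((i + 1))
  done
}

# Args: VPC stack name, number of AZs, "true" to include public subnets.
vpc_export_names() {
  local vpc_stack="$1" num_azs="$2" with_public="${3:-false}" letter
  echo "${vpc_stack}-VPCID"
  echo "${vpc_stack}-VPCCIDR"
  echo "${vpc_stack}-NumberOfAZs"
  for letter in $(az_letters "$num_azs"); do
    echo "${vpc_stack}-PrivateSubnet${letter}1ID"
  done
  if [ "$with_public" = "true" ]; then
    for letter in $(az_letters "$num_azs"); do
      echo "${vpc_stack}-PublicSubnet${letter}ID"
    done
  fi
}

# Example: a two-AZ VPC stack with createExternalLoadBalancer: true.
vpc_export_names "my-vpc" 2 true
```

Comparing this output with the result of `aws cloudformation list-exports` is one way to check that the VPC stack exposes every name the cluster stack expects.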
Production Jenkinsfile with All Options
@Library(["spicy-automation@main"]) _
spicyECSCluster(
// AWS Configuration
jenkinsAwsCredentialsId: "aws-credentials",
region: "ca-central-1",
accountId: "123456789012",
stackName: "prod-ecs-cluster",
// VPC Configuration - only vpcStackName required, all VPC details auto-import
vpcStackName: "production-vpc",
// VPC ID, CIDR, subnets, AZs, and numberOfAzs all auto-import from VPC stack exports
// Tags
ownerTag: "Platform",
productTag: "spicy",
componentTag: "ecs-cluster",
environment: "prod",
// Instance Configuration
instanceType: "m5a.xlarge",
additionalInstanceTypes: "m5.xlarge,m5d.xlarge,m5n.xlarge",
keyName: "my-keypair",
ebsVolumeSize: 100,
// Scaling
minClusterSize: 3,
maxClusterSize: 10,
targetCapacityPercent: 100,
// Spot Configuration (for cost savings)
spotEnabled: true,
onDemandPercentage: 50, // 50% On-Demand, 50% Spot
spotAllocationStrategy: "capacity-optimized",
// Load Balancers
createExternalLoadBalancer: true,
createInternalLoadBalancer: true,
certificateArn: "arn:aws:acm:ca-central-1:123456789012:certificate/xxx",
// Fargate (optional hybrid - enables both FARGATE and FARGATE_SPOT)
enableFargate: false,
// Timeouts
drainingTimeout: 900, // 15 minutes for task draining
maxInstanceLifetime: 604800, // 7 days for instance refresh
// Container Insights
containerInsights: true,
// Approval for production
approvers: "admin,platform-team"
)
Parameters Reference
Required Parameters
| Parameter | Description | Example |
|---|---|---|
| jenkinsAwsCredentialsId | Jenkins credential ID for AWS | "aws-credentials" |
| region | AWS region | "ca-central-1" |
| stackName | CloudFormation stack name | "my-ecs-cluster" |
| vpcStackName | VPC stack name - required. All VPC details (VPC ID, CIDR, subnets, AZs) auto-import from VPC stack exports | "my-vpc" |
| ownerTag | Owner tag value | "MyTeam" |
| productTag | Product tag value | "my-product" |
Instance Configuration
| Parameter | Default | Description |
|---|---|---|
| instanceType | m5a.large | Primary EC2 instance type |
| additionalInstanceTypes | - | Additional types for Spot diversity |
| keyName | - | EC2 key pair for SSH access |
| ebsVolumeSize | 100 | EBS volume size in GB |
| containerInsights | true | Enable Container Insights |
Scaling Configuration
| Parameter | Default | Description |
|---|---|---|
| minClusterSize | 2 | Minimum number of instances |
| maxClusterSize | 4 | Maximum number of instances |
| targetCapacityPercent | 100 | Target utilization for managed scaling |
Spot Configuration
| Parameter | Default | Description |
|---|---|---|
| spotEnabled | false | Enable Spot instances |
| onDemandPercentage | 100 | Percentage of On-Demand (rest is Spot) |
| spotAllocationStrategy | capacity-optimized | Spot allocation strategy |
Spot Allocation Strategies:
- capacity-optimized: Best for interruption avoidance (recommended)
- lowest-price: Best for cost, higher interruption risk
- capacity-optimized-prioritized: Prioritizes instance types you specify
Load Balancer Configuration
| Parameter | Default | Description |
|---|---|---|
| createExternalLoadBalancer | false | Create internet-facing ALB (public subnets auto-imported from VPC stack if enabled) |
| createInternalLoadBalancer | false | Create internal ALB |
| certificateArn | - | ACM certificate for HTTPS |
Fargate Configuration
| Parameter | Default | Description |
|---|---|---|
| enableFargate | false | Enable Fargate capacity providers (adds both FARGATE and FARGATE_SPOT) |
Lifecycle Configuration
| Parameter | Default | Description |
|---|---|---|
| drainingTimeout | 900 | Seconds to wait for task draining |
| maxInstanceLifetime | 604800 | Max instance age (7 days) |
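Both lifecycle values are plain seconds, so the two defaults in the table above come from simple arithmetic. The helpers below are hypothetical conveniences, not part of the library. The range check reflects the limits Amazon EC2 Auto Scaling documents for maximum instance lifetime (86,400 to 31,536,000 seconds); whether this library applies any further validation is not stated in the source.

```shell
# Hypothetical helpers for the second-based lifecycle parameters.

days_to_seconds() {
  echo $(( $1 * 86400 ))
}

minutes_to_seconds() {
  echo $(( $1 * 60 ))
}

# Succeeds when the value is inside the range EC2 Auto Scaling documents
# for maximum instance lifetime (1 day to 365 days).
valid_max_instance_lifetime() {
  [ "$1" -ge 86400 ] && [ "$1" -le 31536000 ]
}

days_to_seconds 7        # prints 604800 (the maxInstanceLifetime default)
minutes_to_seconds 15    # prints 900 (the drainingTimeout default)
```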
Environment-Specific Configuration
Development/Sandbox
spicyECSCluster(
// ... base config ...
environment: "dev",
minClusterSize: 1,
maxClusterSize: 2,
spotEnabled: true,
onDemandPercentage: 0 // 100% Spot for max savings
)
Staging
spicyECSCluster(
// ... base config ...
environment: "staging",
minClusterSize: 2,
maxClusterSize: 4,
spotEnabled: true,
onDemandPercentage: 20 // 80% Spot
)
Production
spicyECSCluster(
// ... base config ...
environment: "prod",
minClusterSize: 3,
maxClusterSize: 10,
spotEnabled: true,
onDemandPercentage: 50, // 50% On-Demand baseline
approvers: "admin,platform-team"
)
Stack Outputs
The stack exports these values for use by ECS services:
| Output | Export Name | Description |
|---|---|---|
| ClusterName | {stackName}-cluster-name | ECS cluster name |
| ClusterArn | {stackName}-cluster-arn | ECS cluster ARN |
| VPC | {stackName}-VPC | VPC ID |
| ECSHostSecurityGroup | {stackName}-ecs-host-security-group | EC2 security group |
| AutoScalingGroupName | {stackName}-auto-scaling-group | ASG name |
| ExternalLoadBalancerDNS | {stackName}-internet-facing-url | External ALB DNS |
| ExternalLoadBalancerArn | {stackName}-internet-facing-arn | External ALB ARN |
| ExternalHTTPListenerArn | {stackName}-internet-facing-http-listener | HTTP listener ARN |
| ExternalHTTPSListenerArn | {stackName}-internet-facing-https-listener | HTTPS listener ARN |
| InternalLoadBalancerDNS | {stackName}-internal-url | Internal ALB DNS |
| InternalLoadBalancerArn | {stackName}-internal-arn | Internal ALB ARN |
| InternalHTTPListenerArn | {stackName}-internal-http-listener | HTTP listener ARN |
| InternalHTTPSListenerArn | {stackName}-internal-https-listener | HTTPS listener ARN |
| LogsBucketName | {stackName}-logs-s3-bucket | ALB access logs bucket |
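As a rough sketch of how a service pipeline might read one of these exports from a shell step, the helper below builds the export name and queries it with the AWS CLI. The function names are hypothetical; the suffixes come from the table above, and the lookup assumes the CLI is configured with credentials for the same account and region as the cluster stack.

```shell
# Hypothetical helpers: build an export name from the table above and look it up.

# Args: cluster stack name, export suffix (for example "cluster-name").
cluster_export_name() {
  echo "${1}-${2}"
}

# Prints the value of a single CloudFormation export, or nothing if absent.
get_export_value() {
  local name="$1"
  aws cloudformation list-exports \
    --query "Exports[?Name=='${name}'].Value" \
    --output text
}

# Pure example (no AWS call): prints my-ecs-cluster-cluster-name
cluster_export_name "my-ecs-cluster" "cluster-name"

# With credentials available, the two can be combined:
#   get_export_value "$(cluster_export_name my-ecs-cluster cluster-name)"
```

Inside a CloudFormation or CDK service stack, importing the export directly is usually preferable to a CLI lookup, because the import makes the dependency between the stacks explicit.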
How It Works
Capacity Providers (Replaces Custom Scaling)
The cluster uses ECS Managed Scaling via Capacity Providers:
┌─────────────────────────────────────────────────────────┐
│ ECS Cluster │
├─────────────────────────────────────────────────────────┤
│ Capacity Providers: │
│ ┌─────────────────────────────────────────────────┐ │
│ │ EC2 Capacity Provider │ │
│ │ - Managed Scaling: ON │ │
│ │ - Target Capacity: 100% │ │
│ │ - Min Scaling Step: 1 │ │
│ │ - Max Scaling Step: 10000 │ │
│ └─────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ FARGATE (optional) │ │
│ └─────────────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ FARGATE_SPOT (optional) │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
This replaces the legacy SchedulableContainers Lambda metric with AWS-native scaling.
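To make the effect of targetCapacityPercent concrete, here is a simplified model of the managed scaling arithmetic. AWS describes the capacity provider's reservation metric as the ratio of instances needed to instances running, multiplied by 100; the sketch solves that ratio for the running count and clamps it to the cluster size limits. It deliberately ignores scaling steps, warm-up periods, and how ECS derives the "needed" figure, and the function name is hypothetical.

```shell
# Simplified model (assumption): running = ceil(needed * 100 / target),
# clamped to [minClusterSize, maxClusterSize].
# Args: instances needed, targetCapacityPercent, minClusterSize, maxClusterSize.
desired_instances() {
  local needed="$1" target="${2:-100}" min="${3:-2}" max="${4:-4}" n
  n=$(( (needed * 100 + target - 1) / target ))   # integer ceiling
  if [ "$n" -lt "$min" ]; then n="$min"; fi
  if [ "$n" -gt "$max" ]; then n="$max"; fi
  echo "$n"
}

desired_instances 3 100 2 4     # prints 3: target 100% keeps no spare capacity
desired_instances 4 80 2 10     # prints 5: target 80% keeps about 20% headroom
desired_instances 1 100 2 4     # prints 2: never below minClusterSize
desired_instances 20 100 2 4    # prints 4: never above maxClusterSize
```

Under this model, lowering targetCapacityPercent trades cost for faster task placement, since spare instances are already running when new tasks arrive.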
Mixed Instances Policy (Replaces Autospotting)
When spotEnabled: true:
┌─────────────────────────────────────────────────────────┐
│ Auto Scaling Group │
├─────────────────────────────────────────────────────────┤
│ Mixed Instances Policy: │
│ ┌─────────────────────────────────────────────────┐ │
│ │ On-Demand Base Capacity: 0 │ │
│ │ On-Demand % Above Base: 50% │ │
│ │ Spot Allocation: capacity-optimized │ │
│ └─────────────────────────────────────────────────┘ │
│ │
│ Instance Type Overrides: │
│ - m5a.large (primary) │
│ - m5.large │
│ - m5d.large │
│ - m5n.large │
└─────────────────────────────────────────────────────────┘
Benefits over Autospotting:
- No Lambda to maintain
- Faster response (no polling delay)
- Better capacity data (AWS-native)
- Simpler architecture
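The split shown in the Mixed Instances Policy diagram above can be worked through with a little arithmetic. The sketch below assumes that fractional results are rounded up in favour of On-Demand; that rounding rule is an assumption of this example and should be checked against the EC2 Auto Scaling documentation before relying on exact counts. The function name is hypothetical.

```shell
# Args: total instances, onDemandPercentage, On-Demand base capacity (default 0).
capacity_split() {
  local total="$1" pct="$2" base="${3:-0}" above od_above on_demand
  if [ "$base" -gt "$total" ]; then base="$total"; fi
  above=$(( total - base ))
  od_above=$(( (above * pct + 99) / 100 ))   # assumed: round up toward On-Demand
  on_demand=$(( base + od_above ))
  echo "on_demand=${on_demand} spot=$(( total - on_demand ))"
}

capacity_split 10 50    # prints on_demand=5 spot=5
capacity_split 3 50     # prints on_demand=2 spot=1
capacity_split 4 0      # prints on_demand=0 spot=4
capacity_split 4 100    # prints on_demand=4 spot=0
```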
Instance Draining
Two-layer draining for zero-downtime:
- Spot Interruption Draining (native ECS): with ECS_ENABLE_SPOT_INSTANCE_DRAINING=true, the ECS agent drains tasks on the 2-minute Spot termination notice.
- Lifecycle Hook Draining (Lambda):
  - ASG sends termination event to SNS
  - Lambda sets instance to DRAINING
  - Waits for running tasks to migrate
  - Completes lifecycle action
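The lifecycle hook sequence can be approximated with the AWS CLI, which may help when reproducing a stuck drain by hand. This is not the stack's Lambda code: the function name, argument order, and polling interval are assumptions for illustration, and only the three steps (set DRAINING, wait, complete the lifecycle action) come from the description above.

```shell
# Args: cluster, container instance ARN, ASG name, lifecycle hook name,
#       EC2 instance ID, timeout seconds (default 900), poll interval (default 15).
drain_instance() {
  local cluster="$1" container_instance="$2" asg_name="$3" hook_name="$4"
  local instance_id="$5" timeout="${6:-900}" interval="${7:-15}"
  local waited=0 running

  # 1. Mark the container instance DRAINING so ECS stops placing tasks on it.
  aws ecs update-container-instances-state \
    --cluster "$cluster" \
    --container-instances "$container_instance" \
    --status DRAINING >/dev/null || return 1

  # 2. Poll until no tasks are running or the draining timeout is reached.
  while [ "$waited" -lt "$timeout" ]; do
    running="$(aws ecs describe-container-instances \
      --cluster "$cluster" \
      --container-instances "$container_instance" \
      --query 'containerInstances[0].runningTasksCount' \
      --output text)" || return 1
    [ "$running" = "0" ] && break
    sleep "$interval"
    waited=$(( waited + interval ))
  done

  # 3. Tell the ASG it may proceed with terminating the instance.
  aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name "$hook_name" \
    --auto-scaling-group-name "$asg_name" \
    --lifecycle-action-result CONTINUE \
    --instance-id "$instance_id" >/dev/null
}
```

The default timeout of 900 seconds mirrors the drainingTimeout default, so a manual run waits about as long as the automated path would.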
Launch Template Features
- IMDSv2 Required: Enhanced metadata security
- gp3 EBS Volumes: Better performance, lower cost than gp2
- Encrypted Volumes: EBS encryption enabled
- SSM Agent: Pre-installed for Session Manager access
Migrating from Legacy Automation
Parameter Mapping
| Legacy (Ansible) | New (CDK) |
|---|---|
| stackName | stackName |
| instanceType | instanceType |
| minClusterSize | minClusterSize |
| maxClusterSize | maxClusterSize |
| spotEnabled | spotEnabled |
| minOnDemandPercentage | onDemandPercentage |
| largestContainerCPUReservation | (not needed - managed scaling) |
| largestContainerMemoryReservation | (not needed - managed scaling) |
| clusterScaleUpAdjustment | (not needed - managed scaling) |
| clusterScaleDownAdjustment | (not needed - managed scaling) |
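When converting many legacy variable files, the mapping above could be encoded in a small lookup such as the one below. The helper is hypothetical and covers only the parameters listed in the table; anything else is reported as unknown so it can be reviewed by hand.

```shell
# Prints the new parameter name for a legacy one.
# Prints an empty line for parameters made obsolete by managed scaling.
# Returns 1 for names that are not in the mapping table.
map_legacy_param() {
  case "$1" in
    stackName|instanceType|minClusterSize|maxClusterSize|spotEnabled)
      echo "$1" ;;
    minOnDemandPercentage)
      echo "onDemandPercentage" ;;
    largestContainerCPUReservation|largestContainerMemoryReservation|clusterScaleUpAdjustment|clusterScaleDownAdjustment)
      echo "" ;;
    *)
      return 1 ;;
  esac
}

map_legacy_param minOnDemandPercentage    # prints onDemandPercentage
map_legacy_param instanceType             # prints instanceType
```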
Removed Features
These legacy features are no longer needed:
- SchedulableContainers Lambda: Replaced by Capacity Provider managed scaling
- Autospotting: Replaced by Mixed Instances Policy
- Launch Configurations: Replaced by Launch Templates
- gp2 volumes: Upgraded to gp3
- IMDSv1: Now requires IMDSv2
Troubleshooting
Instances Not Joining Cluster
Check the ECS agent logs:
docker logs ecs-agent
cat /var/log/ecs/ecs-agent.log
Verify cluster name in user data:
cat /etc/ecs/ecs.config
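A quick way to act on that check is to compare the ECS_CLUSTER value in the agent config with the cluster name the stack exports. The helper names below are hypothetical; ECS_CLUSTER is the standard ECS agent setting, and the expected name is whatever the {stackName}-cluster-name export holds.

```shell
# Prints the value of the last ECS_CLUSTER= line in the agent config, if any.
ecs_config_cluster() {
  local config_file="${1:-/etc/ecs/ecs.config}"
  sed -n 's/^ECS_CLUSTER=//p' "$config_file" | tail -n 1
}

# Args: expected cluster name, optional config path.
# Returns 0 on a match, 1 otherwise.
check_cluster_name() {
  local expected="$1" config_file="${2:-/etc/ecs/ecs.config}" actual
  actual="$(ecs_config_cluster "$config_file")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: instance is configured for cluster '${expected}'"
  else
    echo "MISMATCH: config has '${actual}', expected '${expected}'" >&2
    return 1
  fi
}

# Example, run on the instance:
#   check_cluster_name "my-ecs-cluster-name-from-export"
```

If the config looks right but the instance still does not register, the agent's local introspection endpoint (`curl -s http://localhost:51678/v1/metadata`) reports the cluster the agent actually joined.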
Tasks Not Draining
Check Lambda logs in CloudWatch:
/aws/lambda/{stackName}-DrainingLambda
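For a live view of that log group, something like the following may be convenient. It assumes AWS CLI v2 (which provides `aws logs tail`) and that the log group follows exactly the pattern shown above; the function names are hypothetical.

```shell
# Builds the draining Lambda's log group name from the cluster stack name.
draining_log_group() {
  echo "/aws/lambda/${1}-DrainingLambda"
}

# Args: cluster stack name, optional look-back window (default 1h).
# Streams recent log events until interrupted.
tail_draining_logs() {
  aws logs tail "$(draining_log_group "$1")" --since "${2:-1h}" --follow
}

# Pure example (no AWS call): prints /aws/lambda/my-ecs-cluster-DrainingLambda
draining_log_group "my-ecs-cluster"
```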
Spot Interruptions
Monitor with CloudWatch metrics:
- AWS/EC2Spot → InterruptionRate
- AWS/ECS → CPUReservation, MemoryReservation
Consider increasing onDemandPercentage for critical workloads.
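When debugging on an instance itself, the 2-minute Spot notice can also be observed through the instance metadata service. Because the launch template requires IMDSv2, a session token has to be fetched first. The sketch below is a manual check under those assumptions, not something the stack installs, and the function name is hypothetical.

```shell
# Returns 0 if a Spot interruption notice is present, 1 if not,
# and 2 if the metadata service could not be reached.
spot_interruption_pending() {
  local base="http://169.254.169.254/latest" token code

  # IMDSv2: request a short-lived session token first.
  token="$(curl -s -X PUT "${base}/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 2

  # The instance-action path answers 200 only once an interruption is scheduled.
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "X-aws-ec2-metadata-token: ${token}" \
    "${base}/meta-data/spot/instance-action")" || return 2

  [ "$code" = "200" ]
}

# Example, run on a Spot instance:
#   if spot_interruption_pending; then echo "interruption scheduled"; fi
```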
Cost Optimization Tips
- Use Spot in non-prod: onDemandPercentage: 0
- Multiple instance types: Better Spot availability
- Right-size instances: Match to your container sizes
- Enable Fargate Spot: For batch/background tasks
- Set max instance lifetime: Force instance refresh for patches