spicy-automation/docs/plans/2026-02-14-dual-stack-vpc-egress-optimization-design.md
Ryan Wilson 2e02b02023 Add dual-stack IPv6 VPC with egress optimization and VPC endpoints
- Add Amazon-provided IPv6 /56 CIDR block with auto-carved /64 per subnet
- Add Egress-Only Internet Gateway for free IPv6 outbound from private subnets
- Add IPv6 routes: public subnets via IGW, private subnets via EOIGW
- Add IPv6 NACL entries for subnet tier 2
- Add DynamoDB gateway endpoint (free, alongside existing S3)
- Add 6 interface endpoints: ECR, ECR Docker, CloudWatch Logs, STS,
  Secrets Manager, SSM with shared security group
- Add enableIpv6 prop (default true) and interfaceEndpoints config
- Update VPC stack with context params for all new features
- Include design doc and implementation plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 17:18:18 -08:00

Dual-Stack VPC with Egress Optimization

Date: 2026-02-14
Status: Approved

Problem

The VPC is IPv4-only. All private subnet egress routes through NAT Gateways, incurring:

  • $0.045/hr per gateway (~$128/mo for 4 AZs)
  • $0.045/GB data processing for all traffic through NAT

Most of this traffic is AWS-to-AWS (ECR pulls, CloudWatch logs, S3) and could bypass NAT entirely.
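
The two charges above combine into a simple monthly estimate. The sketch below is illustrative only: it reuses the rates quoted in this section, the function name is hypothetical, and the 720-hour month is an assumption (with it, four gateways come to about $130, close to the ~$128 quoted above).

```typescript
// Rates copied from the Problem section; real pricing varies by region.
const NAT_HOURLY_USD = 0.045;
const NAT_PER_GB_USD = 0.045;

/**
 * Estimated monthly NAT Gateway cost: a fixed hourly charge per gateway
 * plus a per-GB charge on everything routed through NAT.
 * hoursPerMonth defaults to 720 (30 days), which is an assumption.
 */
function natMonthlyCostUsd(
  gatewayCount: number,
  gbProcessed: number,
  hoursPerMonth: number = 720,
): number {
  const hourly = gatewayCount * NAT_HOURLY_USD * hoursPerMonth;
  const processing = gbProcessed * NAT_PER_GB_USD;
  return hourly + processing;
}
```

The hourly term is fixed as long as the gateways exist, so only the per-GB term shrinks when traffic is moved off NAT.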

Solution

Add IPv6 dual-stack support and VPC endpoints to the SpicyVpc construct.

Traffic flow after changes

Private subnet egress paths (in priority order):
1. VPC Endpoints   → AWS services (S3, ECR, DynamoDB, etc.): free (gateway) or ~$0.01/hr per endpoint per AZ plus ~$0.01/GB (interface)
2. EOIGW (IPv6)    → IPv6-capable internet destinations: no gateway charge
3. NAT Gateway     → IPv4-only internet destinations: $0.045/GB (fallback only)

Inbound to private subnets:
- ALB (public subnet) → private subnet via target group (unchanged)
- No direct inbound access from the internet (private IPv4 addresses; the EOIGW is outbound-only for IPv6)
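
The egress ordering above can be modeled as a small decision function. This is a conceptual sketch, not how the VPC actually decides: in practice the path follows from DNS resolution (private DNS for endpoints, AAAA records for IPv6) and route table matches. The type names, the function name, and the service identifiers in the set are all illustrative assumptions.

```typescript
type EgressPath = "vpc-endpoint" | "eoigw" | "nat-gateway";

interface Destination {
  /** Short AWS service name when the destination is an AWS API (illustrative identifiers). */
  awsService?: string;
  /** True when the destination is reachable over IPv6. */
  ipv6Capable: boolean;
}

// Services assumed to have an endpoint in this design (2 gateway + 6 interface).
const ENDPOINT_SERVICES: ReadonlySet<string> = new Set<string>([
  "s3", "dynamodb", "ecr.api", "ecr.dkr", "logs", "sts", "secretsmanager", "ssm",
]);

/** Picks the cheapest available path, in the priority order listed above. */
function selectEgressPath(
  dest: Destination,
  endpointServices: ReadonlySet<string> = ENDPOINT_SERVICES,
): EgressPath {
  if (dest.awsService !== undefined && endpointServices.has(dest.awsService)) {
    return "vpc-endpoint";
  }
  if (dest.ipv6Capable) {
    return "eoigw";
  }
  return "nat-gateway"; // IPv4-only fallback
}
```

An AWS service without an endpoint and without IPv6 support still falls through to NAT, which is why the gateways are kept.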

Security model

Private subnets remain private. The Egress-Only Internet Gateway is stateful and, for IPv6, behaves like an outbound-only firewall:

  • Allows outbound IPv6 connections and their return traffic
  • Drops all unsolicited inbound IPv6 traffic
  • No address translation (unlike NAT) — instances use globally routable IPv6 but are not reachable from the internet

The ALB in public subnets remains the only ingress path to private services.
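
To make the stateful behavior concrete, here is a toy model of an outbound-only gateway. It is a conceptual illustration under simplifying assumptions (flows keyed by address pair only, no ports, no timeouts) and says nothing about how AWS implements the EOIGW; the class and method names are hypothetical.

```typescript
/** Toy model of a stateful, outbound-only gateway. Not AWS's implementation. */
class EgressOnlyGatewayModel {
  // Flows that were opened from inside the VPC, keyed by "inside|outside".
  private readonly flows = new Set<string>();

  private static key(insideAddr: string, outsideAddr: string): string {
    return insideAddr + "|" + outsideAddr;
  }

  /** An instance opens a connection outward; the flow is recorded and allowed. */
  sendOutbound(insideAddr: string, outsideAddr: string): boolean {
    this.flows.add(EgressOnlyGatewayModel.key(insideAddr, outsideAddr));
    return true;
  }

  /** Inbound traffic passes only if it belongs to a flow opened from inside. */
  allowInbound(outsideAddr: string, insideAddr: string): boolean {
    return this.flows.has(EgressOnlyGatewayModel.key(insideAddr, outsideAddr));
  }
}
```

Return traffic for a recorded flow is accepted, while the same external host contacting a different instance, or any host that was never contacted, is dropped.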

Changes to spicy-vpc.ts

New props

/** Enable IPv6 dual-stack (adds Amazon-provided IPv6 CIDR, EOIGW, IPv6 routes) @default true */
readonly enableIpv6?: boolean;

/** VPC Interface Endpoints to create. @default full ECS set when createVpcEndpoints is true */
readonly vpcInterfaceEndpoints?: VpcInterfaceEndpointConfig;
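
The defaults in the doc comments above could be resolved along these lines. This is a sketch under several assumptions: the shape of VpcInterfaceEndpointConfig is not defined in this document, so the per-service boolean flags below are hypothetical, as are the helper names, the treatment of createVpcEndpoints as an optional boolean that defaults to off, and the rule that an explicit config enables only the services it sets to true.

```typescript
// Hypothetical shape: the design names VpcInterfaceEndpointConfig but does not define it.
type EndpointName =
  | "ecrApi" | "ecrDocker" | "cloudWatchLogs" | "sts" | "secretsManager" | "ssm";
type VpcInterfaceEndpointConfig = { readonly [K in EndpointName]?: boolean };

// Subset of the construct props relevant to this section.
interface SpicyVpcEndpointProps {
  readonly enableIpv6?: boolean;
  readonly createVpcEndpoints?: boolean; // assumed existing prop, assumed default false
  readonly vpcInterfaceEndpoints?: VpcInterfaceEndpointConfig;
}

const FULL_ECS_SET: readonly EndpointName[] = [
  "ecrApi", "ecrDocker", "cloudWatchLogs", "sts", "secretsManager", "ssm",
];

/** enableIpv6 defaults to true, per the doc comment. */
function resolveEnableIpv6(props: SpicyVpcEndpointProps): boolean {
  return props.enableIpv6 ?? true;
}

/** Full ECS set when endpoints are enabled and no explicit config is given. */
function resolveInterfaceEndpoints(props: SpicyVpcEndpointProps): EndpointName[] {
  if (props.createVpcEndpoints !== true) {
    return []; // assumption: no interface endpoints unless explicitly enabled
  }
  const cfg = props.vpcInterfaceEndpoints;
  if (cfg === undefined) {
    return FULL_ECS_SET.slice();
  }
  // Assumption: an explicit config enables only the services set to true.
  return FULL_ECS_SET.filter((name) => cfg[name] === true);
}
```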

Infrastructure additions

  1. VPC IPv6 CIDR: CfnVPCCidrBlock with amazonProvidedIpv6CidrBlock: true (assigns a /56)
  2. Subnet IPv6 CIDRs: each subnet gets a /64 carved from the VPC's /56, auto-calculated by AZ index
  3. Egress-Only Internet Gateway: CfnEgressOnlyInternetGateway attached to the VPC
  4. Public subnet IPv6 route: ::/0 via IGW (the IGW already exists, it just needs the route)
  5. Private subnet IPv6 routes: ::/0 via EOIGW on all private route tables
  6. DynamoDB Gateway Endpoint: free, added alongside the existing S3 endpoint
  7. Interface Endpoints: ECR API, ECR Docker, CloudWatch Logs, STS, Secrets Manager, SSM
    • All six share one security group allowing HTTPS (443) from the VPC CIDR
    • Deployed into private subnet 1 across all AZs
    • Private DNS enabled (transparent to applications)
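
The /64 carving in step 2 is plain bit arithmetic: a /56 leaves 8 bits before the /64 boundary, so a VPC block holds 256 subnets, selected by the low byte of the fourth hextet. The sketch below shows that arithmetic on a literal prefix; the function names are hypothetical and the addresses use the 2001:db8::/32 documentation range. In a deployed stack the Amazon-provided block is only known at deploy time, so the real construct would more likely delegate this to CloudFormation's Fn::Cidr intrinsic than compute it in TypeScript.

```typescript
/** Expands an IPv6 address (with at most one "::") into its 8 hextets. */
function expandIpv6(addr: string): number[] {
  const parse = (s: string): number[] =>
    s === "" ? [] : s.split(":").map((h) => parseInt(h, 16));
  const gap = addr.indexOf("::");
  if (gap === -1) {
    const all = parse(addr);
    if (all.length !== 8) throw new Error("invalid IPv6 address: " + addr);
    return all;
  }
  const left = parse(addr.slice(0, gap));
  const right = parse(addr.slice(gap + 2));
  const zeros: number[] = [];
  for (let i = left.length + right.length; i < 8; i++) zeros.push(0);
  return left.concat(zeros, right);
}

/** Returns the index-th /64 (0..255) inside a /56 VPC block. */
function carveSubnet64(vpcCidr56: string, index: number): string {
  const parts = vpcCidr56.split("/");
  if (parts.length !== 2 || parts[1] !== "56") {
    throw new Error("expected a /56 block, got: " + vpcCidr56);
  }
  if (!Number.isInteger(index) || index < 0 || index > 255) {
    throw new Error("subnet index must be an integer in 0..255");
  }
  const h = expandIpv6(parts[0]);
  // Keep the high byte fixed by the /56; the low byte selects the subnet.
  const fourth = (h[3] & 0xff00) | index;
  return [h[0], h[1], h[2], fourth].map((x) => x.toString(16)).join(":") + "::/64";
}
```

For example, index 3 inside 2001:db8:1234:5600::/56 yields 2001:db8:1234:5603::/64. How subnet tier and AZ index map onto that single index is left to the construct.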

What does NOT change

  • NAT Gateways (kept as IPv4 fallback, unchanged config)
  • Subnet IPv4 CIDRs
  • All existing CloudFormation logical IDs
  • ALB, ECS cluster, and ECS service constructs
  • Cross-stack export names and values

New outputs

  • EgressOnlyInternetGatewayId — for cross-stack reference if needed
  • VpcIpv6CidrBlock — the assigned /56

Cost impact (4 AZ deployment)

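| Component | Before | After |
|---|---|---|
| NAT Gateway hourly | $128/mo | $128/mo (unchanged) |
| NAT data processing | $0.045/GB (all traffic) | $0.045/GB (IPv4-only internet traffic only) |
| S3 endpoint | Free | Free |
| DynamoDB endpoint | N/A | Free |
| Interface endpoints (6) | N/A | ~$175/mo hourly (6 endpoints × 4 AZs × $0.01/hr), plus ~$0.01/GB processed |
| IPv6 egress | N/A | Free (no EOIGW hourly or processing charge) |

Net savings depend on traffic volume. Interface endpoints are billed per endpoint per AZ, so deploying six of them across all four AZs costs about $175/mo rather than the ~$43/mo a single AZ would cost. Each GB moved off NAT saves $0.045 when it goes through a gateway endpoint or the EOIGW, and about $0.035 when it goes through an interface endpoint. At 1 TB/mo that is roughly $35 to $45 in NAT processing fees eliminated, well short of the endpoint hourly charges; break-even is roughly 4 to 5 TB/mo moved off NAT, and savings scale linearly above that. The IPv6 egress savings are pure upside.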
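
The break-even point is sensitive to how interface endpoints are billed, so it helps to keep the rates as parameters. The sketch below is a minimal model, assuming an hourly charge per endpoint per AZ and a per-GB processing charge on endpoint traffic; the interface, the function names, and the 730-hour month are assumptions for illustration, and the rates should be checked against current regional pricing.

```typescript
interface EndpointCostInputs {
  endpointCount: number;          // interface endpoints (6 in this design)
  azCount: number;                // AZs each endpoint is deployed into
  hourlyUsdPerEndpointAz: number; // assumed rate, e.g. 0.01
  endpointPerGbUsd: number;       // assumed rate, e.g. 0.01
  natPerGbUsd: number;            // 0.045 in this document
  hoursPerMonth: number;          // e.g. 730
}

/** Fixed monthly charge for the interface endpoints. */
function interfaceEndpointMonthlyUsd(i: EndpointCostInputs): number {
  return i.endpointCount * i.azCount * i.hourlyUsdPerEndpointAz * i.hoursPerMonth;
}

/**
 * GB per month that must move from NAT to interface endpoints before the
 * per-GB savings cover the fixed endpoint charge.
 */
function breakEvenGbPerMonth(i: EndpointCostInputs): number {
  const savedPerGb = i.natPerGbUsd - i.endpointPerGbUsd;
  if (savedPerGb <= 0) {
    return Infinity; // endpoints never pay for themselves at these rates
  }
  return interfaceEndpointMonthlyUsd(i) / savedPerGb;
}
```

With six endpoints in four AZs at $0.01/hr and $0.01/GB, the fixed charge is about $175/mo and break-even is about 5 TB/mo. With a single AZ and no per-GB endpoint charge, the same functions give about $44/mo and just under 1 TB/mo.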

Decisions

  • IPv6 on by default — new stacks get dual-stack automatically; existing stacks can set enableIpv6: false
  • Keep NAT Gateways — managed HA, zero ops; IPv6 + endpoints dramatically reduce their data processing costs
  • Full ECS endpoint set — ECR, CloudWatch Logs, STS, Secrets Manager, SSM cover the common ECS workload needs