spicy-automation/docs/plans/2026-02-14-dual-stack-vpc-egress-optimization-design.md
Ryan Wilson 2e02b02023 Add dual-stack IPv6 VPC with egress optimization and VPC endpoints
- Add Amazon-provided IPv6 /56 CIDR block with auto-carved /64 per subnet
- Add Egress-Only Internet Gateway for free IPv6 outbound from private subnets
- Add IPv6 routes: public subnets via IGW, private subnets via EOIGW
- Add IPv6 NACL entries for subnet tier 2
- Add DynamoDB gateway endpoint (free, alongside existing S3)
- Add 6 interface endpoints: ECR, ECR Docker, CloudWatch Logs, STS,
  Secrets Manager, SSM with shared security group
- Add enableIpv6 prop (default true) and interfaceEndpoints config
- Update VPC stack with context params for all new features
- Include design doc and implementation plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 17:18:18 -08:00

# Dual-Stack VPC with Egress Optimization
**Date:** 2026-02-14
**Status:** Approved
## Problem
The VPC is IPv4-only. All private subnet egress routes through NAT Gateways, incurring:
- $0.045/hr per gateway (~$128/mo for 4 AZs)
- $0.045/GB data processing for all traffic through NAT
Most of this traffic is AWS-to-AWS (ECR pulls, CloudWatch logs, S3) and could bypass NAT entirely.
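The NAT baseline above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming 730 hours per month and the rates quoted in this section; the function name and return shape are illustrative and not part of `SpicyVpc`.

```typescript
// Assumption: 730 hours per month (365 * 24 / 12).
const NAT_HOURS_PER_MONTH = 730;

interface NatMonthlyCost {
  hourly: number;      // fixed charge for keeping the gateways running
  processing: number;  // per-GB charge for traffic routed through NAT
  total: number;
}

// Rates default to the figures quoted in the Problem section.
function natMonthlyCost(
  gateways: number,
  gbProcessed: number,
  hourlyRate = 0.045,
  perGbRate = 0.045,
): NatMonthlyCost {
  const hourly = gateways * hourlyRate * NAT_HOURS_PER_MONTH;
  const processing = gbProcessed * perGbRate;
  return { hourly, processing, total: hourly + processing };
}
```

With 4 gateways and 1,000 GB processed, this gives roughly $131 in hourly charges plus $45 in processing. The hourly figure is close to the ~$128/mo quoted above; the exact value depends on the hours-per-month convention.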
## Solution
Add IPv6 dual-stack support and VPC endpoints to the `SpicyVpc` construct.
### Traffic flow after changes
```
Private subnet egress paths (in priority order):
1. VPC Endpoints → AWS services (S3, ECR, DynamoDB, etc.) — free or ~$0.01/hr
2. EOIGW (IPv6) → IPv6-capable internet destinations — free
3. NAT Gateway → IPv4-only internet destinations — $0.045/GB (fallback only)
Inbound to private subnets:
- ALB (public subnet) → private subnet via target group (unchanged)
- No direct internet access (private IPs + EOIGW is outbound-only)
```
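The priority order above can be modeled as a small decision function. This is only a sketch of the reasoning: in the real VPC the choice is made by private DNS and route tables, not by application code, and the `Destination` fields here are hypothetical.

```typescript
type EgressPath = "vpc-endpoint" | "eoigw" | "nat-gateway";

// Hypothetical description of a destination, for illustration only.
interface Destination {
  /** The destination is an AWS service that has an endpoint in this VPC. */
  hasVpcEndpoint: boolean;
  /** The destination is reachable over IPv6 (it publishes an AAAA record). */
  reachableOverIpv6: boolean;
}

// Mirrors the priority order in the diagram: endpoints first, then IPv6
// via the EOIGW, with the NAT Gateway as the IPv4-only fallback.
function selectEgressPath(dest: Destination): EgressPath {
  if (dest.hasVpcEndpoint) return "vpc-endpoint";
  if (dest.reachableOverIpv6) return "eoigw";
  return "nat-gateway";
}
```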
### Security model
Private subnets remain private. The Egress-Only Internet Gateway is stateful and outbound-only for IPv6:
- Allows outbound IPv6 connections and their return traffic
- Drops all unsolicited inbound IPv6 traffic
- No address translation (unlike NAT) — instances use globally routable IPv6 but are not reachable from the internet
The ALB in public subnets remains the only ingress path to private services.
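The stateful, outbound-only behavior described above can be illustrated with a toy connection tracker. This is a conceptual model only, not AWS code; the class and method names are invented for the example, and real gateways track full protocol state rather than address pairs.

```typescript
interface Flow {
  src: string;
  dst: string;
}

// Toy model of a stateful egress-only gateway.
class EgressOnlyGatewayModel {
  // Flows initiated from inside the VPC, keyed as "inside->outside".
  private readonly tracked = new Set<string>();

  /** Outbound packets are always forwarded, and the flow is remembered. */
  outbound(flow: Flow): boolean {
    this.tracked.add(`${flow.src}->${flow.dst}`);
    return true;
  }

  /** Inbound packets pass only if they answer a tracked outbound flow. */
  inbound(flow: Flow): boolean {
    return this.tracked.has(`${flow.dst}->${flow.src}`);
  }
}
```

An inbound packet from an address the instance never contacted is dropped, while the reply to an outbound connection is let through, even though the instance's IPv6 address is globally routable.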
## Changes to `spicy-vpc.ts`
### New props
```typescript
/** Enable IPv6 dual-stack (adds Amazon-provided IPv6 CIDR, EOIGW, IPv6 routes) @default true */
readonly enableIpv6?: boolean;
/** VPC Interface Endpoints to create. @default full ECS set when createVpcEndpoints is true */
readonly vpcInterfaceEndpoints?: VpcInterfaceEndpointConfig;
```
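One way the defaults for these props might resolve is sketched below. The field names inside `VpcInterfaceEndpointConfig`, the `SpicyVpcNetworkProps` wrapper, and the `resolveNetworkProps` helper are all hypothetical, since this document names the config type but does not define it. `createVpcEndpoints` is treated as required here because its real default is not stated.

```typescript
// Hypothetical field names: one optional flag per interface endpoint.
interface VpcInterfaceEndpointConfig {
  readonly ecrApi?: boolean;
  readonly ecrDocker?: boolean;
  readonly cloudWatchLogs?: boolean;
  readonly sts?: boolean;
  readonly secretsManager?: boolean;
  readonly ssm?: boolean;
}

// Hypothetical subset of the construct props, for illustration.
interface SpicyVpcNetworkProps {
  readonly enableIpv6?: boolean;
  readonly createVpcEndpoints: boolean;
  readonly vpcInterfaceEndpoints?: VpcInterfaceEndpointConfig;
}

// AWS service-name suffixes, as in com.amazonaws.<region>.<suffix>.
const SERVICE_SUFFIX: Record<keyof VpcInterfaceEndpointConfig, string> = {
  ecrApi: "ecr.api",
  ecrDocker: "ecr.dkr",
  cloudWatchLogs: "logs",
  sts: "sts",
  secretsManager: "secretsmanager",
  ssm: "ssm",
};

function resolveNetworkProps(props: SpicyVpcNetworkProps) {
  const enableIpv6 = props.enableIpv6 ?? true; // documented default
  if (!props.createVpcEndpoints) {
    return { enableIpv6, endpointServices: [] as string[] };
  }
  const config = props.vpcInterfaceEndpoints ?? {};
  const keys = Object.keys(SERVICE_SUFFIX) as (keyof VpcInterfaceEndpointConfig)[];
  // An omitted flag counts as enabled, so an empty config yields the full set.
  const endpointServices = keys
    .filter((key) => config[key] ?? true)
    .map((key) => SERVICE_SUFFIX[key]);
  return { enableIpv6, endpointServices };
}
```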
### Infrastructure additions
1. **VPC IPv6 CIDR**: `CfnVPCCidrBlock` with `amazonProvidedIpv6CidrBlock: true` (assigns a /56)
2. **Subnet IPv6 CIDRs**: Each subnet gets a /64 carved from the VPC's /56, auto-calculated by AZ index
3. **Egress-Only Internet Gateway**: `CfnEgressOnlyInternetGateway` attached to the VPC
4. **Public subnet IPv6 route**: `::/0` via IGW (the IGW already exists; only the route is new)
5. **Private subnet IPv6 routes**: `::/0` via EOIGW on all private route tables
6. **DynamoDB Gateway Endpoint**: Free, added alongside existing S3 endpoint
7. **Interface Endpoints**: ECR API, ECR Docker, CloudWatch Logs, STS, Secrets Manager, SSM
   - Each gets a security group allowing HTTPS (443) from the VPC CIDR
   - Deployed into private subnet 1 across all AZs
   - Private DNS enabled (transparent to applications)
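The /64 carving in step 2 comes down to setting the low byte of the fourth address group, since a /56 leaves 8 bits for 256 possible /64 subnets. The sketch below shows that arithmetic in plain TypeScript using a documentation prefix. The function names are illustrative, and how the construct maps a subnet and AZ to an index is an assumption left to the caller.

```typescript
// Expand an IPv6 address (with optional "::") into eight 16-bit groups.
function expandIpv6(addr: string): number[] {
  const [head, tail] = addr.split("::");
  const headGroups = head ? head.split(":") : [];
  const tailGroups = tail ? tail.split(":") : [];
  const missing = addr.includes("::")
    ? 8 - headGroups.length - tailGroups.length
    : 0;
  const groups = [...headGroups, ...Array(missing).fill("0"), ...tailGroups];
  if (groups.length !== 8) throw new Error(`invalid IPv6 address: ${addr}`);
  return groups.map((group) => parseInt(group, 16));
}

// Return the index-th /64 inside a /56 block (index 0..255).
function carveSubnet64(vpcCidr: string, index: number): string {
  const [addr, prefix] = vpcCidr.split("/");
  if (prefix !== "56") throw new Error("expected a /56 block");
  if (!Number.isInteger(index) || index < 0 || index > 255) {
    throw new RangeError("index must be an integer in 0..255");
  }
  const groups = expandIpv6(addr);
  // A /56 fixes groups 0-2 plus the high byte of group 3.
  // The low byte of group 3 is the subnet ID that selects one /64.
  groups[3] = (groups[3] & 0xff00) | index;
  const network = groups.slice(0, 4).map((g) => g.toString(16)).join(":");
  return `${network}::/64`;
}
```

For example, `carveSubnet64("2001:db8:1234:ab00::/56", 2)` yields `2001:db8:1234:ab02::/64`. In a CloudFormation template the same split is usually expressed with the `Fn::Cidr` intrinsic rather than computed by hand.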
### What does NOT change
- NAT Gateways (kept as IPv4 fallback, unchanged config)
- Subnet IPv4 CIDRs
- All existing CloudFormation logical IDs
- ALB, ECS cluster, and ECS service constructs
- Cross-stack export names and values
### New outputs
- `EgressOnlyInternetGatewayId` — for cross-stack reference if needed
- `VpcIpv6CidrBlock` — the assigned /56
## Cost impact (4 AZ deployment)
| Component | Before | After |
|-----------|--------|-------|
| NAT Gateway hourly | $128/mo | $128/mo (unchanged) |
| NAT data processing | $0.045/GB (all traffic) | Near zero (IPv4-only fallback) |
| S3 endpoint | Free | Free |
| DynamoDB endpoint | N/A | Free |
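| Interface endpoints (6) | N/A | ~$43/mo per AZ (~$175/mo across 4 AZs) |
| IPv6 egress | N/A | Free |
**Net savings** depend on traffic volume. Interface endpoints are billed per endpoint per AZ, so six endpoints at ~$0.01/hr cost ~$43/mo in one AZ and ~$175/mo when deployed across all 4 AZs. At 1 TB/mo through NAT, ~$45 in processing fees is eliminated, which roughly covers a single AZ's endpoints; with endpoints in all 4 AZs, break-even is closer to 4 TB/mo. Above the break-even point, savings scale linearly with traffic. IPv6 egress through the EOIGW avoids NAT processing fees entirely, so those savings are pure upside.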
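The break-even reasoning can be checked with a short calculation. This sketch assumes 730 hours per month, the ~$0.01/hr endpoint rate and $0.045/GB NAT rate quoted in this document, and that interface endpoints are billed per endpoint per AZ. The function names are illustrative, and per-GB endpoint processing fees are left as a parameter that defaults to zero, so current AWS pricing should be checked before relying on the result.

```typescript
// Assumption: 730 hours per month (365 * 24 / 12).
const ENDPOINT_HOURS_PER_MONTH = 730;

// Fixed monthly charge for interface endpoints, billed per endpoint per AZ.
function interfaceEndpointMonthlyCost(
  endpoints: number,
  azs: number,
  hourlyRate = 0.01,
): number {
  return endpoints * azs * hourlyRate * ENDPOINT_HOURS_PER_MONTH;
}

// GB per month that must move off NAT before the endpoints pay for themselves.
function breakEvenGb(
  endpointMonthlyCost: number,
  natPerGb = 0.045,
  endpointPerGb = 0,
): number {
  const savedPerGb = natPerGb - endpointPerGb;
  if (savedPerGb <= 0) throw new Error("endpoints never break even at these rates");
  return endpointMonthlyCost / savedPerGb;
}
```

Under these assumptions, six endpoints in one AZ come to about $43.80/mo (break-even near 1 TB), and six endpoints across four AZs come to about $175/mo (break-even near 3.9 TB).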
## Decisions
- **IPv6 on by default** — new stacks get dual-stack automatically; existing stacks can set `enableIpv6: false`
- **Keep NAT Gateways** — managed HA, zero ops; IPv6 + endpoints dramatically reduce their data processing costs
- **Full ECS endpoint set** — ECR, CloudWatch Logs, STS, Secrets Manager, SSM cover the common ECS workload needs