Infrastructure as Code (IaC) isn’t just for enterprise teams. As a startup, you can build production-ready AWS infrastructure in a weekend using Terraform’s reusable modules. This pragmatic approach helps you scale fast while maintaining reliability and cost control.
Why Startups Need IaC From Day One
Many startups delay infrastructure automation, thinking it’s premature optimization. This is a costly mistake. IaC provides:
- Reproducible environments – Dev, staging, and prod are identical
- Version control – Infrastructure changes are tracked and reviewable
- Cost optimization – Every resource is declared in code, so changes and drift show up in terraform plan before they reach the bill
- Team scaling – New developers can spin up environments instantly
- Disaster recovery – Rebuild your entire stack with one command
Weekend Roadmap
Day 1: Foundation & VPC
- Set up Terraform workspace
- Build networking foundation
- Configure security groups
Day 2: Application Infrastructure
- Deploy ECS Fargate cluster
- Set up RDS database
- Configure monitoring and alerts
Day 1: Building the Foundation
Project Structure
Start with a clean, modular structure:
terraform/
├── environments/
│ ├── dev/
│ ├── staging/
│ └── prod/
├── modules/
│ ├── vpc/
│ ├── ecs/
│ └── rds/
├── shared/
│ └── variables.tf
└── README.md
VPC Module (modules/vpc/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_cidr" {
description = "CIDR block for VPC"
type = string
default = "10.0.0.0/16"
}
variable "availability_zones" {
description = "Availability zones"
type = list(string)
default = ["us-east-1a", "us-east-1b"]
}
# VPC
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${var.environment}-vpc"
Environment = var.environment
}
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.environment}-igw"
Environment = var.environment
}
}
# Public Subnets
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = {
Name = "${var.environment}-public-${count.index + 1}"
Environment = var.environment
Type = "Public"
}
}
# Private Subnets
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index + 100)
availability_zone = var.availability_zones[count.index]
tags = {
Name = "${var.environment}-private-${count.index + 1}"
Environment = var.environment
Type = "Private"
}
}
# NAT Gateways
resource "aws_eip" "nat" {
count = length(aws_subnet.public)
domain = "vpc"
depends_on = [aws_internet_gateway.main]
tags = {
Name = "${var.environment}-nat-eip-${count.index + 1}"
Environment = var.environment
}
}
resource "aws_nat_gateway" "main" {
count = length(aws_subnet.public)
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = {
Name = "${var.environment}-nat-${count.index + 1}"
Environment = var.environment
}
depends_on = [aws_internet_gateway.main]
}
# Route Tables
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = {
Name = "${var.environment}-public-rt"
Environment = var.environment
}
}
resource "aws_route_table" "private" {
count = length(aws_nat_gateway.main)
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[count.index].id
}
tags = {
Name = "${var.environment}-private-rt-${count.index + 1}"
Environment = var.environment
}
}
# Route Table Associations
resource "aws_route_table_association" "public" {
count = length(aws_subnet.public)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
resource "aws_route_table_association" "private" {
count = length(aws_subnet.private)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[count.index].id
}
# Security Group for ALB
resource "aws_security_group" "alb" {
name_prefix = "${var.environment}-alb-"
vpc_id = aws_vpc.main.id
ingress {
description = "HTTP"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "HTTPS"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-alb-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
# Security Group for ECS
resource "aws_security_group" "ecs" {
name_prefix = "${var.environment}-ecs-"
vpc_id = aws_vpc.main.id
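ingress {
description = "App traffic from ALB"
# Must match app_port in the ECS module (default 3000): the ALB forwards to the container port
from_port = 3000
to_port = 3000
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}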
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "${var.environment}-ecs-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
# Outputs
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.main.id
}
output "public_subnet_ids" {
description = "IDs of the public subnets"
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
description = "IDs of the private subnets"
value = aws_subnet.private[*].id
}
output "alb_security_group_id" {
description = "ID of the ALB security group"
value = aws_security_group.alb.id
}
output "ecs_security_group_id" {
description = "ID of the ECS security group"
value = aws_security_group.ecs.id
}
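The subnet ranges in this module come from cidrsubnet(var.vpc_cidr, 8, index): adding 8 bits to a /16 yields /24 blocks, and the index picks which one. A minimal sketch to check the layout for the default CIDR, assuming two availability zones; the subnet_preview output and the locals are illustrative names, not part of the module:

```hcl
# Illustrative only: run in an empty directory with `terraform apply`,
# or paste the expressions into `terraform console`.
locals {
  vpc_cidr = "10.0.0.0/16"
  az_count = 2
}

output "subnet_preview" {
  value = {
    # /16 + 8 new bits = /24; the third argument selects which /24
    public  = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 8, i)]
    private = [for i in range(local.az_count) : cidrsubnet(local.vpc_cidr, 8, i + 100)]
  }
}

# public  = ["10.0.0.0/24", "10.0.1.0/24"]
# private = ["10.0.100.0/24", "10.0.101.0/24"]
```

The offset of 100 keeps private ranges well away from public ones, so you can add availability zones later without renumbering existing subnets.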
Development Environment (environments/dev/main.tf)
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = var.aws_region
}
# Variables
variable "aws_region" {
description = "AWS region"
type = string
default = "us-east-1"
}
variable "environment" {
description = "Environment name"
type = string
default = "dev"
}
# VPC Module
module "vpc" {
source = "../../modules/vpc"
environment = var.environment
vpc_cidr = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
}
# Outputs
output "vpc_id" {
value = module.vpc.vpc_id
}
Day 2: Application Infrastructure
ECS Fargate Module (modules/ecs/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_id" {
description = "VPC ID"
type = string
}
variable "private_subnet_ids" {
description = "Private subnet IDs"
type = list(string)
}
variable "public_subnet_ids" {
description = "Public subnet IDs"
type = list(string)
}
variable "ecs_security_group_id" {
description = "ECS security group ID"
type = string
}
variable "alb_security_group_id" {
description = "ALB security group ID"
type = string
}
variable "app_name" {
description = "Application name"
type = string
default = "webapp"
}
variable "app_port" {
description = "Application port"
type = number
default = 3000
}
variable "desired_count" {
description = "Desired number of tasks"
type = number
default = 2
}
variable "cpu" {
description = "CPU units"
type = number
default = 256
}
variable "memory" {
description = "Memory in MB"
type = number
default = 512
}
# ECS Cluster
resource "aws_ecs_cluster" "main" {
name = "${var.environment}-cluster"
setting {
name = "containerInsights"
value = "enabled"
}
tags = {
Name = "${var.environment}-cluster"
Environment = var.environment
}
}
# ECS Task Definition
resource "aws_ecs_task_definition" "app" {
family = "${var.environment}-${var.app_name}"
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
task_role_arn = aws_iam_role.ecs_task_role.arn
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = var.cpu
memory = var.memory
container_definitions = jsonencode([
{
name = var.app_name
image = "nginx:latest" # Placeholder that listens on port 80; replace with your app image, which must listen on var.app_port
portMappings = [
{
containerPort = var.app_port
protocol = "tcp"
}
]
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = aws_cloudwatch_log_group.app.name
awslogs-region = data.aws_region.current.name
awslogs-stream-prefix = "ecs"
}
}
environment = [
{
name = "ENVIRONMENT"
value = var.environment
}
]
}
])
tags = {
Name = "${var.environment}-${var.app_name}"
Environment = var.environment
}
}
# Application Load Balancer
resource "aws_lb" "main" {
name = "${var.environment}-alb"
internal = false
load_balancer_type = "application"
security_groups = [var.alb_security_group_id]
subnets = var.public_subnet_ids
enable_deletion_protection = false
tags = {
Name = "${var.environment}-alb"
Environment = var.environment
}
}
resource "aws_lb_target_group" "app" {
name = "${var.environment}-${var.app_name}-tg"
port = var.app_port
protocol = "HTTP"
vpc_id = var.vpc_id
target_type = "ip"
health_check {
enabled = true
healthy_threshold = 3
interval = 30
matcher = "200"
path = "/"
port = "traffic-port"
protocol = "HTTP"
timeout = 5
unhealthy_threshold = 2
}
tags = {
Name = "${var.environment}-${var.app_name}-tg"
Environment = var.environment
}
}
resource "aws_lb_listener" "front_end" {
load_balancer_arn = aws_lb.main.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.app.arn
}
}
# ECS Service
resource "aws_ecs_service" "main" {
name = "${var.environment}-${var.app_name}"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.app.arn
desired_count = var.desired_count
launch_type = "FARGATE"
network_configuration {
security_groups = [var.ecs_security_group_id]
subnets = var.private_subnet_ids
assign_public_ip = false
}
load_balancer {
target_group_arn = aws_lb_target_group.app.arn
container_name = var.app_name
container_port = var.app_port
}
depends_on = [aws_lb_listener.front_end]
tags = {
Name = "${var.environment}-${var.app_name}"
Environment = var.environment
}
}
# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "app" {
name = "/ecs/${var.environment}/${var.app_name}"
retention_in_days = 30
tags = {
Name = "${var.environment}-${var.app_name}-logs"
Environment = var.environment
}
}
# IAM Roles
resource "aws_iam_role" "ecs_task_execution_role" {
name = "${var.environment}-ecsTaskExecutionRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}
]
})
tags = {
Name = "${var.environment}-ecsTaskExecutionRole"
Environment = var.environment
}
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role" {
role = aws_iam_role.ecs_task_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_iam_role" "ecs_task_role" {
name = "${var.environment}-ecsTaskRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}
]
})
tags = {
Name = "${var.environment}-ecsTaskRole"
Environment = var.environment
}
}
# Data Sources
data "aws_region" "current" {}
# Outputs
output "cluster_id" {
description = "ECS cluster ID"
value = aws_ecs_cluster.main.id
}
output "alb_dns_name" {
description = "ALB DNS name"
value = aws_lb.main.dns_name
}
output "alb_zone_id" {
description = "ALB zone ID"
value = aws_lb.main.zone_id
}
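The ALB security group already allows port 443, but the module only defines an HTTP listener. A minimal sketch of an HTTPS listener you could add to modules/ecs/main.tf, assuming you already have an ACM certificate; the certificate_arn variable is hypothetical and not part of the original module:

```hcl
# Hypothetical input: ARN of an existing ACM certificate in the same region as the ALB.
variable "certificate_arn" {
  description = "ARN of the ACM certificate used by the HTTPS listener"
  type        = string
}

# Terminates TLS at the load balancer and forwards to the same target group
# as the HTTP listener.
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```

Once this works, you may want to change the port 80 listener's default action from forward to a redirect so that plain HTTP requests land on HTTPS.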
RDS Module (modules/rds/main.tf)
variable "environment" {
description = "Environment name"
type = string
}
variable "vpc_id" {
description = "VPC ID"
type = string
}
variable "private_subnet_ids" {
description = "Private subnet IDs"
type = list(string)
}
variable "allowed_security_groups" {
description = "Security groups allowed to access RDS"
type = list(string)
default = []
}
variable "db_name" {
description = "Database name"
type = string
default = "appdb"
}
variable "db_username" {
description = "Database username"
type = string
default = "dbadmin"
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
variable "instance_class" {
description = "RDS instance class"
type = string
default = "db.t3.micro"
}
variable "allocated_storage" {
description = "Allocated storage in GB"
type = number
default = 20
}
variable "backup_retention_period" {
description = "Backup retention period in days"
type = number
default = 7
}
# Security Group for RDS
resource "aws_security_group" "rds" {
name_prefix = "${var.environment}-rds-"
vpc_id = var.vpc_id
ingress {
description = "MySQL/Aurora"
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = var.allowed_security_groups
}
tags = {
Name = "${var.environment}-rds-sg"
Environment = var.environment
}
lifecycle {
create_before_destroy = true
}
}
# DB Subnet Group
resource "aws_db_subnet_group" "default" {
name = "${var.environment}-db-subnet-group"
subnet_ids = var.private_subnet_ids
tags = {
Name = "${var.environment}-db-subnet-group"
Environment = var.environment
}
}
# RDS Instance
resource "aws_db_instance" "default" {
identifier = "${var.environment}-database"
engine = "mysql"
engine_version = "8.0"
instance_class = var.instance_class
allocated_storage = var.allocated_storage
max_allocated_storage = var.allocated_storage * 2
db_name = var.db_name
username = var.db_username
password = var.db_password
vpc_security_group_ids = [aws_security_group.rds.id]
db_subnet_group_name = aws_db_subnet_group.default.name
backup_retention_period = var.backup_retention_period
backup_window = "03:00-04:00"
maintenance_window = "sun:04:00-sun:05:00"
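# Dev-friendly teardown settings. For prod, set skip_final_snapshot = false
# (with a final_snapshot_identifier) and deletion_protection = true.
skip_final_snapshot = true
deletion_protection = false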
performance_insights_enabled = false
monitoring_interval = 0
tags = {
Name = "${var.environment}-database"
Environment = var.environment
}
}
# Outputs
output "rds_hostname" {
description = "RDS instance hostname"
value = aws_db_instance.default.address
sensitive = true
}
output "rds_port" {
description = "RDS instance port"
value = aws_db_instance.default.port
}
output "rds_username" {
description = "RDS instance root username"
value = aws_db_instance.default.username
sensitive = true
}
Complete Environment Configuration
Update environments/dev/main.tf:
# Add to existing dev environment
module "ecs" {
source = "../../modules/ecs"
environment = var.environment
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
public_subnet_ids = module.vpc.public_subnet_ids
ecs_security_group_id = module.vpc.ecs_security_group_id
alb_security_group_id = module.vpc.alb_security_group_id
app_name = "myapp"
desired_count = 1 # Lower for dev
cpu = 256
memory = 512
}
module "rds" {
source = "../../modules/rds"
environment = var.environment
vpc_id = module.vpc.vpc_id
private_subnet_ids = module.vpc.private_subnet_ids
allowed_security_groups = [module.vpc.ecs_security_group_id]
db_password = var.db_password
instance_class = "db.t3.micro"
backup_retention_period = 1 # Minimal for dev
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
output "alb_dns_name" {
value = module.ecs.alb_dns_name
}
CloudWatch Monitoring and Alerts
Add monitoring to your modules:
# CloudWatch Alarms for ECS
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
alarm_name = "${var.environment}-high-cpu"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = "2"
metric_name = "CPUUtilization"
namespace = "AWS/ECS"
period = "120"
statistic = "Average"
threshold = "80"
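alarm_description = "Average ECS service CPU above 80% for two consecutive 2-minute periods"
alarm_actions = [aws_sns_topic.alerts.arn] # Without this, the alarm changes state but notifies no one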
dimensions = {
ServiceName = aws_ecs_service.main.name
ClusterName = aws_ecs_cluster.main.name
}
tags = {
Name = "${var.environment}-high-cpu-alarm"
Environment = var.environment
}
}
# SNS Topic for Alerts
resource "aws_sns_topic" "alerts" {
name = "${var.environment}-alerts"
tags = {
Name = "${var.environment}-alerts"
Environment = var.environment
}
}
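An SNS topic delivers nothing until something subscribes to it, and an alarm only publishes if its alarm_actions lists the topic. A sketch of both pieces, assuming it lives in the ECS module next to the resources above; the alert_email variable and the high_memory alarm are illustrative additions, not part of the original setup:

```hcl
# Hypothetical input: address that should receive alert emails.
variable "alert_email" {
  description = "Email address subscribed to the alerts topic"
  type        = string
}

# AWS sends a confirmation email; no alerts are delivered until it is confirmed.
resource "aws_sns_topic_subscription" "alerts_email" {
  topic_arn = aws_sns_topic.alerts.arn
  protocol  = "email"
  endpoint  = var.alert_email
}

# Second alarm on memory, wired to the topic for both alarm and recovery.
resource "aws_cloudwatch_metric_alarm" "high_memory" {
  alarm_name          = "${var.environment}-high-memory"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "MemoryUtilization"
  namespace           = "AWS/ECS"
  period              = 120
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "Average ECS service memory above 80% for two consecutive periods"
  alarm_actions       = [aws_sns_topic.alerts.arn]
  ok_actions          = [aws_sns_topic.alerts.arn]

  dimensions = {
    ServiceName = aws_ecs_service.main.name
    ClusterName = aws_ecs_cluster.main.name
  }
}
```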
Deployment Commands
Deploy your infrastructure:
# Development
cd environments/dev
terraform init
# Read the password without echoing it, then expose it as TF_VAR_db_password
# so it stays out of shell history and the process list
read -rs TF_VAR_db_password && export TF_VAR_db_password
terraform plan
terraform apply
# Production (copy dev to prod with appropriate sizing)
cd ../prod
terraform init
terraform plan
terraform apply
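"Appropriate sizing" for prod mostly means changing the module inputs. A sketch of what environments/prod/main.tf might contain, assuming the terraform, provider, and variable blocks are copied from dev with the environment default set to "prod"; every value below is a placeholder to tune against your own metrics:

```hcl
module "vpc" {
  source             = "../../modules/vpc"
  environment        = var.environment
  vpc_cidr           = "10.1.0.0/16" # Assumed: a range that does not overlap dev
  availability_zones = ["us-east-1a", "us-east-1b"]
}

module "ecs" {
  source                = "../../modules/ecs"
  environment           = var.environment
  vpc_id                = module.vpc.vpc_id
  private_subnet_ids    = module.vpc.private_subnet_ids
  public_subnet_ids     = module.vpc.public_subnet_ids
  ecs_security_group_id = module.vpc.ecs_security_group_id
  alb_security_group_id = module.vpc.alb_security_group_id
  app_name              = "myapp"
  desired_count         = 2    # At least one task per AZ
  cpu                   = 512  # Valid Fargate pairing with 1024 MB
  memory                = 1024
}

module "rds" {
  source                  = "../../modules/rds"
  environment             = var.environment
  vpc_id                  = module.vpc.vpc_id
  private_subnet_ids      = module.vpc.private_subnet_ids
  allowed_security_groups = [module.vpc.ecs_security_group_id]
  db_password             = var.db_password
  instance_class          = "db.t3.small"
  backup_retention_period = 7
}
```

Keeping the module sources identical across environments is what makes prod a resized copy of dev rather than a separate codebase.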
Cost Optimization Tips
Right-Sizing Resources
- Dev: t3.micro instances, minimal RDS
- Prod: Start small and scale based on metrics
- Use Fargate Spot for non-critical workloads
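Fargate Spot is enabled at the cluster level through capacity providers. A minimal sketch for the ECS module, assuming a split of roughly one on-demand task for every three Spot tasks; the weights are illustrative, not a recommendation:

```hcl
# Registers both capacity providers on the cluster and sets a default mix.
resource "aws_ecs_cluster_capacity_providers" "main" {
  cluster_name       = aws_ecs_cluster.main.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  # Always keep at least one task on regular Fargate.
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    base              = 1
    weight            = 1
  }

  # Place the remaining tasks mostly on Spot, which can be interrupted.
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 3
  }
}
```

The default strategy only applies to services that do not set launch_type, so the aws_ecs_service in this module would need its launch_type = "FARGATE" line removed (or its own capacity_provider_strategy blocks) to pick it up.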
Resource Scheduling
# Auto-scaling for ECS
resource "aws_appautoscaling_target" "ecs_target" {
max_capacity = 10
min_capacity = 2
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "scale_up" {
name = "${var.environment}-scale-up"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs_target.resource_id
scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs_target.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 70.0
}
}
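Target tracking reacts to load; for true scheduling, such as shutting dev down overnight, Application Auto Scaling also supports scheduled actions. A sketch that builds on the ecs_target above, assuming a dev environment and UTC times; the resource names, hours, and capacities are illustrative:

```hcl
# Scale the service to zero tasks on weekday evenings.
resource "aws_appautoscaling_scheduled_action" "scale_in_evening" {
  name               = "${var.environment}-scale-in-evening"
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  schedule           = "cron(0 20 ? * MON-FRI *)" # 20:00 UTC

  scalable_target_action {
    min_capacity = 0
    max_capacity = 0
  }
}

# Bring it back before the workday starts.
resource "aws_appautoscaling_scheduled_action" "scale_out_morning" {
  name               = "${var.environment}-scale-out-morning"
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  schedule           = "cron(0 7 ? * MON-FRI *)" # 07:00 UTC

  scalable_target_action {
    min_capacity = 1
    max_capacity = 2
  }
}
```

This only stops the Fargate tasks; the NAT gateways, ALB, and RDS instance keep billing while the service is scaled to zero.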
Security Best Practices
Secrets Management
# Use AWS Secrets Manager for sensitive data
resource "aws_secretsmanager_secret" "db_password" {
name = "${var.environment}/database/password"
}
resource "aws_secretsmanager_secret_version" "db_password" {
secret_id = aws_secretsmanager_secret.db_password.id
secret_string = var.db_password
}
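Storing the password is half the job; the container still has to read it. One way, sketched below, is to let ECS inject the secret as an environment variable at task start. This assumes the secret resources above live in the ECS module; the DB_PASSWORD name and the container_secrets local are hypothetical:

```hcl
# The task execution role fetches the secret on the container's behalf.
resource "aws_iam_role_policy" "read_db_password" {
  name = "${var.environment}-read-db-password"
  role = aws_iam_role.ecs_task_execution_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["secretsmanager:GetSecretValue"]
        Resource = [aws_secretsmanager_secret.db_password.arn]
      }
    ]
  })
}

# Fragment for the container definition in aws_ecs_task_definition.app.
locals {
  container_secrets = [
    {
      name      = "DB_PASSWORD" # Hypothetical env var name your app reads
      valueFrom = aws_secretsmanager_secret.db_password.arn
    }
  ]
}
```

To use it, add secrets = local.container_secrets next to the environment list inside the jsonencode call. Note that var.db_password is still written to the Terraform state, so the state file itself needs to be treated as sensitive.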
Network Security
- All databases in private subnets
- Security groups with minimal access
- VPC Flow Logs for network monitoring
- WAF for public-facing applications
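Of the items above, VPC Flow Logs are the quickest to add. A minimal sketch for the VPC module that delivers logs to S3, which avoids creating a dedicated IAM role; the bucket and resource names are illustrative:

```hcl
# Bucket that receives the flow log files.
resource "aws_s3_bucket" "flow_logs" {
  bucket_prefix = "${var.environment}-vpc-flow-logs-"

  tags = {
    Name        = "${var.environment}-vpc-flow-logs"
    Environment = var.environment
  }
}

# Captures accepted and rejected traffic for every interface in the VPC.
resource "aws_flow_log" "vpc" {
  vpc_id               = aws_vpc.main.id
  traffic_type         = "ALL"
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn

  tags = {
    Name        = "${var.environment}-vpc-flow-log"
    Environment = var.environment
  }
}
```

When the bucket is in the same account, AWS normally attaches the bucket policy the log delivery service needs, provided the identity running Terraform is allowed to manage bucket policies. Consider adding a lifecycle rule so old logs expire.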
Production Considerations
State Management
Use remote state with S3 and DynamoDB locking:
terraform {
backend "s3" {
bucket = "your-terraform-state-bucket"
key = "environments/prod/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-state-lock"
}
}
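The backend block expects the bucket and lock table to exist already. A sketch of a one-time bootstrap configuration that creates them, assuming you apply it from a separate directory with local state; the names mirror the placeholders in the backend block above:

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "your-terraform-state-bucket"
}

# Versioning lets you recover an earlier state file after a bad apply.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# State files contain secrets, so block every form of public access.
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The S3 backend requires a string hash key named exactly "LockID".
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

After this is applied, add the backend block to each environment and run terraform init to migrate its state.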
Multi-Environment Strategy
- Separate AWS accounts for prod
- Environment-specific variable files
- Automated testing of Terraform changes
- GitOps workflow with pull request reviews
Conclusion
You now have production-ready AWS infrastructure that scales with your startup's growth. The modular Terraform approach provides enterprise-grade reliability while remaining startup-friendly in complexity and cost.
This infrastructure foundation delivers:
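- High availability across multiple AZs for the network and ECS tasks (the RDS instance as written is single-AZ; set multi_az = true for automatic database failover)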
- Scalable containerized applications using ECS Fargate
- Managed database services with automated backups and maintenance
- Comprehensive monitoring with CloudWatch metrics and alerts
- Cost optimization through right-sizing and intelligent auto-scaling
The weekend time investment in infrastructure automation creates a solid foundation that supports rapid scaling as your startup grows, while maintaining the operational reliability your customers expect.
Need help building production-ready AWS infrastructure for your startup? I specialize in Terraform consulting and can set up scalable, cost-optimized cloud infrastructure that grows with your business. Check out my DevOps services and portfolio or contact me directly to discuss your infrastructure needs.
This is part 2 of my “DevOps for Startups” series. Part 1 covered automated React deployment pipelines with GitHub Actions and AWS