Scaling Website Infrastructure Under Load Growth
Scaling is not a matter of swapping a small server for a bigger one. It is an architectural process: optimize first, then scale horizontally, then scale vertically if that is still needed. Doing these steps in the wrong order wastes budget without solving the problem.
Diagnosing Bottlenecks
Before scaling, find out where the bottleneck is:
# CPU-bound?
top -b -n 1 | head -20
# I/O-bound?
iostat -x 1 5
# Memory-bound?
free -m && vmstat 1 5
# Database?
mysql -e "SHOW PROCESSLIST;"
psql -c "SELECT pid, now() - pg_stat_activity.query_start AS duration, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC LIMIT 10;"
# Network?
ss -s
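The readings from the commands above can be turned into a first guess at the bottleneck. The following is a minimal Python sketch; the `Snapshot` fields, the `likely_bottleneck` helper, and every threshold in it are illustrative assumptions rather than established limits, so tune them against your own baseline.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One reading of host metrics (hypothetical structure for this sketch)."""
    cpu_user_sys_pct: float   # user + system CPU time, e.g. from top
    cpu_iowait_pct: float     # %iowait, e.g. from iostat or vmstat
    mem_available_pct: float  # available memory as a share of total, from free
    swap_in_out_per_s: float  # si + so columns from vmstat


def likely_bottleneck(s: Snapshot) -> str:
    """Return a first guess; all thresholds are assumed starting points."""
    # Check memory first: swapping also inflates iowait and can mislead.
    if s.mem_available_pct < 10 or s.swap_in_out_per_s > 0:
        return "memory"
    if s.cpu_iowait_pct > 20:
        return "disk I/O"
    if s.cpu_user_sys_pct > 85:
        return "CPU"
    return "not the host: check database queries and the network"
```

For example, `likely_bottleneck(Snapshot(30, 40, 50, 0))` points at disk I/O, which suggests looking at the database or storage before adding application servers.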
Load testing tools:
# k6
k6 run --vus 100 --duration 30s script.js
# ab (Apache Bench)
ab -n 10000 -c 100 https://mysite.com/
# wrk
wrk -t12 -c400 -d30s https://mysite.com/
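Each of these tools reports some form of latency distribution, and the tail usually says more about user experience than the mean. As a rough illustration, here is a Python sketch that summarizes latencies with nearest-rank percentiles. The function names are hypothetical, and the tools themselves may use a different percentile method.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Mean plus the percentiles most load-testing reports focus on."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }
```

With 95 requests at 10 ms and 5 requests at 500 ms, the mean is 34.5 ms while p99 is 500 ms, so a small share of slow requests is invisible in the median and only partly visible in the mean.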
Scaling Layers
1. Caching (best ROI)
# Nginx: caching static assets
location ~* \.(css|js|jpg|png|gif|ico|woff2)$ {
    expires 1y;
    add_header Cache-Control "public, immutable";
}
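# FastCGI cache for PHP (fastcgi_cache_path must be declared in the http {} context)
fastcgi_cache_path /tmp/nginx-cache levels=1:2 keys_zone=MYAPP:100m inactive=60m;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
# Enable the zone inside the PHP location block (the 10m TTL is an example value)
fastcgi_cache MYAPP;
fastcgi_cache_valid 200 10m;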
# Redis for application caching (configured as the application's cache store)
# Laravel: additionally precompile config, routes, and views
php artisan config:cache
php artisan route:cache
php artisan view:cache
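Application-level caching with Redis typically follows the cache-aside pattern: look in the cache, fall back to the slow source on a miss, then store the result with a TTL. The Python sketch below shows the pattern with an in-memory stand-in for Redis; `TTLCache`, `cached`, and the key format are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for Redis GET / SET with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: dict[str, tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_s: float) -> None:
        self._data[key] = (self._clock() + ttl_s, value)


def cached(cache: TTLCache, key: str, ttl_s: float, load: Callable[[], Any]) -> Any:
    """Cache-aside: return the cached value, or load it, store it, and return it."""
    value = cache.get(key)
    if value is None:  # miss (note: None results are never cached in this sketch)
        value = load()  # slow path, e.g. a database query
        cache.set(key, value, ttl_s)
    return value
```

A call such as `cached(cache, "product:5", 60, load_product)` runs `load_product` at most once per 60 seconds for that key, so repeated requests stop reaching the database.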
2. CDN
// Cloudflare/CloudFront in front of the origin server
// Rule: static assets are cached at the edge; API responses pass through uncached
const cacheControl = isStatic ? 'public, max-age=31536000' : 'no-cache';
3. Database optimization
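-- Inspect the plan of a slow query (note: EXPLAIN ANALYZE actually executes it)
EXPLAIN ANALYZE SELECT * FROM products WHERE category_id = 5 ORDER BY created_at DESC LIMIT 20;
-- Add an index matching the filter and sort order (CONCURRENTLY avoids blocking writes)
CREATE INDEX CONCURRENTLY idx_products_category_created
    ON products (category_id, created_at DESC);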
-- PostgreSQL connection pooling via PgBouncer
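PgBouncer sits between the application and PostgreSQL so that many client connections share a small number of server connections. The idea can be illustrated with a bounded pool. This Python sketch is conceptual only and does not reflect how PgBouncer is implemented; `ConnectionPool` and the `connect` factory are hypothetical stand-ins for a real driver.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator


class ConnectionPool:
    """Bounded pool: at most `size` connections exist; callers wait for a free one."""

    def __init__(self, connect: Callable[[], Any], size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    @contextmanager
    def connection(self, timeout_s: float = 5.0) -> Iterator[Any]:
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout_s)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back
```

Ten sequential requests through a pool of size 2 reuse the same two connections, which is the effect that keeps the database's connection count flat as application instances are added.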
4. Horizontal Scaling
# Kubernetes HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
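The core calculation the autoscaler performs, as described in the Kubernetes documentation, is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured minimum and maximum. The Python sketch below reproduces that formula with the limits from the manifest above. It deliberately ignores stabilization windows, pod readiness, and missing metrics, and the 0.1 tolerance is the documented default, which a cluster may override.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float = 70.0,
    min_replicas: int = 2,
    max_replicas: int = 20,
    tolerance: float = 0.1,
) -> int:
    """ceil(currentReplicas * currentMetric / targetMetric), clamped to [min, max]."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to target: no change
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

With 4 replicas averaging 90% CPU, the sketch yields 6 replicas; at 72% it leaves the count at 4 because the ratio is within tolerance; and 10 replicas at 210% are capped at the maximum of 20.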
5. Service Separation
Monolith → Microservices (gradually):
├── API Gateway (Nginx/Kong)
├── Auth Service (stateless JWT)
├── Content Service
├── Media Service (separate for uploads)
└── Search Service (Elasticsearch)
6. Queues for Heavy Operations
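// Instead of processing synchronously, push heavy work onto a queue
// Laravel Queue + Redis
// Store the upload first ('uploads' is an example directory): the temporary file may be gone by the time a worker runs
$path = $uploadedFile->store('uploads');
dispatch(new ProcessImageJob($path));
dispatch(new SendEmailBatchJob($users));
dispatch(new GeneratePdfReportJob($orderId));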
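What the dispatch calls above buy is that the request returns immediately while a separate worker does the slow part. The Python sketch below shows that mechanism with an in-process queue and a single worker thread. `JobQueue` is a hypothetical stand-in, and a real Redis-backed queue additionally provides persistence, retries, and multiple workers.

```python
import queue
import threading
from typing import Callable, Optional

Job = Callable[[], None]


class JobQueue:
    """In-process stand-in for a Redis-backed queue with one worker thread."""

    def __init__(self) -> None:
        self._jobs: "queue.Queue[Optional[Job]]" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def dispatch(self, job: Job) -> None:
        """Enqueue and return immediately; the caller never waits for the work."""
        self._jobs.put(job)

    def _run(self) -> None:
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel: stop the worker
                break
            try:
                job()
            except Exception as exc:  # a failing job must not kill the worker
                print(f"job failed: {exc!r}")

    def shutdown(self) -> None:
        """Process everything already queued, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()
```

Jobs run in the order they were dispatched, and a job that raises is logged and skipped so the ones behind it still run.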
Architectural Solutions by Load Level
| RPS | Architecture | Infrastructure Cost |
|---|---|---|
| Up to 50 | 1 VPS + Redis + PgBouncer | $20–50/month |
| 50–500 | 2–3 app servers + load balancer + RDS/managed DB | $200–500/month |
| 500–5000 | Kubernetes + CloudFront + ElastiCache + Aurora | $1000–5000/month |
| 5000+ | Multi-regional K8s + DynamoDB/Cassandra | $5000+/month |
Rule: scale what's measured as a bottleneck. Don't scale assumptions.
Typical timelines: an audit and optimization under load (without architectural changes) takes 1–2 weeks; moving to horizontal scaling takes 2–6 weeks.







