Capacity Planning

🔥 Vibe Prompt

"Forecast capacity needs for a service growing 20% MoM. Plan compute, storage, and budget."

Growth Forecasting

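def forecast(current_capacity, monthly_growth, months):
    """Return projected demand for months 0..months under compound growth."""
    return [current_capacity * (1 + monthly_growth) ** m for m in range(months + 1)]

# Example: 100 req/s, growing 20% MoM, against a 1,000 req/s ceiling.
# An 18-month horizon is used because the ceiling is first exceeded in month 13.
for month, capacity in enumerate(forecast(100, 0.20, 18)):
    print(f"Month {month}: {capacity:.0f} req/s")
    if capacity > 1000:
        print(f"  ** Need scale-up at month {month}!")
        break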
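
The crossing month can also be computed directly instead of by iterating. The sketch below solves current × (1 + growth)^m ≥ limit for m; the function name `months_until` is an illustrative choice, not something defined earlier in this chapter.

```python
import math

def months_until(current: float, monthly_growth: float, limit: float) -> int:
    """Smallest whole month m where current * (1 + monthly_growth)**m >= limit."""
    if limit <= current:
        return 0  # already at or above the limit
    return math.ceil(math.log(limit / current) / math.log(1 + monthly_growth))

# 100 req/s growing 20% MoM against a 1,000 req/s ceiling
print(months_until(100, 0.20, 1000))  # 13
```

This assumes the growth rate stays constant, which rarely holds for more than a few quarters, so treat the result as an early-warning estimate rather than a deadline.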

Resource Planning

Metric          Current    6mo     12mo
Requests/sec    100        299     892
CPU cores       8          24      72
Memory (GB)     32         96      288
Storage (TB)    1          3       9
Monthly cost    $1,000     $3,000  $9,000
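
A table like this can be derived from the baseline row and the growth rate. The sketch below assumes every resource scales linearly with request rate, which is a simplification (storage and memory often grow on their own curves); `BASELINE` and `scale_resources` are illustrative names.

```python
import math

# Current values taken from the table above
BASELINE = {
    "Requests/sec": 100,
    "CPU cores": 8,
    "Memory (GB)": 32,
    "Storage (TB)": 1,
    "Monthly cost ($)": 1000,
}

def scale_resources(baseline: dict, monthly_growth: float, months: int) -> dict:
    """Scale every resource by the compound growth factor (linear-scaling assumption)."""
    factor = (1 + monthly_growth) ** months
    return {name: value * factor for name, value in baseline.items()}

for horizon in (6, 12):
    projected = scale_resources(BASELINE, 0.20, horizon)
    print(f"--- {horizon} months (x{1.20 ** horizon:.2f}) ---")
    for name, value in projected.items():
        # Round up: you cannot provision a fraction of a core
        print(f"{name:<18}{math.ceil(value)}")
```

At 20% MoM the multiplier is roughly 3.0x after 6 months and 8.9x after 12, so small errors in the growth estimate compound quickly.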

Auto-scaling

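apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:          # the workload this autoscaler controls
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%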

Load Testing

brew install bombardier
bombardier -c 100 -d 60s https://api.example.com/health
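
When choosing a connection count for a test like this, Little's law (concurrency = throughput × latency) gives a rough ceiling on what the load generator can drive. The sketch below assumes each connection has one request in flight at a time; the function names and the 50 ms latency are illustrative assumptions.

```python
import math

def max_throughput(connections: int, mean_latency_s: float) -> float:
    """Upper bound on req/s when each connection has one request in flight."""
    return connections / mean_latency_s

def connections_needed(target_rps: float, mean_latency_s: float) -> int:
    """Connections required to sustain target_rps at the given mean latency."""
    return math.ceil(target_rps * mean_latency_s)

# 100 connections at an assumed 50 ms mean latency
print(f"{max_throughput(100, 0.05):.0f} req/s ceiling")
# Connections needed to drive the 12-month forecast of 892 req/s
print(connections_needed(892, 0.05), "connections")
```

If the measured throughput sits well below this ceiling, the service (not the load generator) is the bottleneck, which is the result a capacity test is looking for.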

Cost Optimization

| Strategy | Savings |
|----------|---------|
| Reserved instances | 30-60% |
| Spot instances | 60-90% |
| Rightsizing | 20-40% |
| Auto-scaling | 30-50% |

Best Practices

Plan for 2-3x headroom above forecast peak demand
Load test before major launches
Use predictive scaling
Right-size before committing to reserved capacity, so commitments match actual usage

Capacity Planning Process

Capacity planning ensures you have enough resources to meet current and future demand.

The Four Steps

| Step | What You Do | Tools |
|------|-------------|-------|
| Measure | Collect current resource usage | Prometheus, CloudWatch |
| Forecast | Predict future demand | Linear regression, ML |
| Plan | Determine needed resources | Spreadsheets, calculators |
| Execute | Provision resources | Terraform, Kubernetes HPA |

Key Metrics

| Resource | What to Measure | Alert Threshold |
|----------|-----------------|-----------------|
| CPU | Utilization percentage | > 80% for 5m |
| Memory | Used / Total | > 80% for 5m |
| Disk | Used / Total | > 85% |
| Network | Bandwidth usage | > 70% of max |
| Database | Connection count, query latency | > 80% of max connections |
| API Rate | Requests per second | > 70% of estimated capacity |
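
A threshold such as "> 80% for 5m" fires only when the condition holds for the whole window, not on a single spike. The sketch below shows one way to evaluate that, assuming one sample per minute; `sustained_breach` is an illustrative helper, not part of Prometheus or CloudWatch.

```python
def sustained_breach(samples: list[float], threshold: float, window: int) -> bool:
    """True if the most recent `window` samples all exceed `threshold`."""
    if len(samples) < window:
        return False  # not enough data to cover the window yet
    return all(value > threshold for value in samples[-window:])

# One CPU sample per minute, so window=5 corresponds to "for 5m"
cpu_percent = [70, 85, 86, 90, 88, 91]
print(sustained_breach(cpu_percent, threshold=80, window=5))  # True

# A single dip below the threshold resets the condition
print(sustained_breach([85, 86, 79, 88, 91], threshold=80, window=5))  # False
```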

Forecasting Models

Simple Linear Growth

import numpy as np

def forecast_capacity(historical_data: list[float], days_ahead: int = 90) -> np.ndarray:
    """Fit a straight line to daily history and extrapolate days_ahead days."""
    days = np.arange(len(historical_data))
    values = np.array(historical_data)
    
    # Linear regression
    slope, intercept = np.polyfit(days, values, 1)
    
    # Forecast
    future_days = np.arange(len(historical_data), len(historical_data) + days_ahead)
    forecast = slope * future_days + intercept
    
    return forecast

# Example: daily peak request rate (req/s) over the last 30 days
daily_peak_requests = [
    85000, 82000, 91000, 88000, 95000, 93000, 89000,
    92000, 96000, 94000, 98000, 101000, 97000, 99000,
    102000, 105000, 100000, 103000, 108000, 106000,
    109000, 112000, 107000, 110000, 115000, 113000,
    116000, 118000, 114000, 117000
]

forecast_90 = forecast_capacity(daily_peak_requests, 90)
print(f"Current peak: {daily_peak_requests[-1]:,} req/s")
print(f"Forecast 90d: {forecast_90[-1]:,.0f} req/s")
print(f"Growth: {(forecast_90[-1] / daily_peak_requests[-1] - 1) * 100:.1f}%")

Auto-Scaling Configuration

# Kubernetes HPA — Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
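
To reason about how a manifest like this behaves, it helps to see the core scaling rule the HPA applies: desired = ceil(current replicas × current metric / target metric), clamped to the min/max bounds. The sketch below is a simplified model of that rule only. It ignores the `behavior` policies, stabilization windows, and pod readiness handling, and with several metrics the HPA takes the largest proposal; the function name is illustrative.

```python
import math

def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int = 3,
    max_replicas: int = 20,
    tolerance: float = 0.1,  # Kubernetes' default tolerance; configurable cluster-wide
) -> int:
    """Simplified HPA rule for a single utilization metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))

print(desired_replicas(5, 90, 70))    # 7: CPU at 90% vs a 70% target
print(desired_replicas(5, 72, 70))    # 5: within tolerance, unchanged
print(desired_replicas(10, 200, 70))  # 20: capped at maxReplicas
```

The cap matters for capacity planning: once the autoscaler sits at `maxReplicas`, further growth shows up as latency, so the maximum should be revisited whenever the forecast changes.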

Cost Optimization

| Strategy | Savings | Implementation |
|----------|---------|----------------|
| Right-sizing | 20-40% | Match instance type to workload |
| Reserved instances | 30-60% | 1-3 year commitments |
| Spot instances | 60-90% | Fault-tolerant, batch workloads |
| Auto-scaling | 20-50% | Scale down during low traffic |
| Storage tiering | 40-60% | Move old data to cold storage |
| Delete unused resources | 5-15% | Find and remove orphaned resources |
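
The percentages above apply only to the portion of spend each strategy covers, so the overall saving is usually smaller than the headline figure. The sketch below estimates a blended bill when a steady baseline is moved to reserved pricing and the rest stays on demand. The 60% coverage and 40% discount are illustrative assumptions, not quoted prices.

```python
def blended_monthly_cost(
    on_demand_cost: float,
    reserved_fraction: float,
    reserved_discount: float,
) -> float:
    """Monthly cost when `reserved_fraction` of spend gets `reserved_discount` off."""
    reserved_part = on_demand_cost * reserved_fraction * (1 - reserved_discount)
    on_demand_part = on_demand_cost * (1 - reserved_fraction)
    return reserved_part + on_demand_part

baseline = 9000.0  # an all-on-demand monthly bill, for illustration
blended = blended_monthly_cost(baseline, reserved_fraction=0.6, reserved_discount=0.4)
print(f"Blended cost: ${blended:,.0f}")
print(f"Overall saving: {(1 - blended / baseline) * 100:.0f}%")
```

Here a 40% discount on 60% of the bill yields a 24% overall saving, which is why the coverage fraction deserves as much attention as the discount rate.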

Right-Sizing Script

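#!/bin/bash
# rightsize.sh: Identify over-provisioned EC2 instances
# Requires the AWS CLI and GNU date (on macOS, install coreutils and use gdate).

START=$(date -u -d "14 days ago" +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u +%Y-%m-%dT%H:%M:%SZ)

echo "Running instances averaging < 10% CPU over the last 14 days:"

# List running instances as "<instance-id> <instance-type>" lines
aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].[InstanceId,InstanceType]' \
  --output text |
while read -r id type; do
  # Mean of the daily CPU averages for this instance ("None" if no datapoints)
  avg=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$id" \
    --period 86400 \
    --start-time "$START" \
    --end-time "$END" \
    --statistics Average \
    --query 'avg(Datapoints[].Average)' \
    --output text)
  # Report only instances that have data and average below 10%
  if [ "$avg" != "None" ] && awk -v a="$avg" 'BEGIN { exit !(a < 10) }'; then
    printf '%s\t%s\t%.1f%%\n' "$id" "$type" "$avg"
  fi
done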
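
Once utilization data has been exported, the same right-sizing check is easy to run offline. The sketch below flags instances whose mean CPU falls below a threshold; the instance IDs and sample values are made up for illustration, and `underutilized` is a hypothetical helper.

```python
from statistics import mean

def underutilized(
    cpu_by_instance: dict[str, list[float]],
    threshold: float = 10.0,
) -> dict[str, float]:
    """Return {instance_id: mean CPU %} for instances averaging below `threshold`."""
    flagged = {}
    for instance_id, samples in cpu_by_instance.items():
        if not samples:
            continue  # no data: cannot judge this instance
        average = mean(samples)
        if average < threshold:
            flagged[instance_id] = average
    return flagged

# Hypothetical daily CPU averages per instance
usage = {
    "i-aaa": [5, 7, 6],
    "i-bbb": [40, 55],
    "i-ccc": [],
}
for instance_id, average in underutilized(usage).items():
    print(f"{instance_id}: {average:.1f}% average CPU, candidate for a smaller type")
```

Average CPU alone can mislead: an instance may be memory-bound or bursty, so a flagged instance is a candidate for review rather than an automatic downsize.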

Summary

Capacity planning measures current usage, forecasts future demand, and provisions resources accordingly. Auto-scaling and right-sizing optimize costs while maintaining performance.

Key takeaways:

Four steps: measure → forecast → plan → execute
Track CPU, memory, disk, network, database, API rate
Linear regression forecasts future resource needs
Kubernetes HPA auto-scales based on CPU/memory utilization
Right-sizing: match instance type to actual workload
Reserved instances save 30-60%, spot saves 60-90%
Auto-scaling handles traffic spikes automatically
Regular audits prevent wasted resources

What's Next: Chaos Engineering

The next chapter covers chaos engineering.
