Capacity Planning

🔥 Vibe Prompt

"Forecast capacity needs for a service growing 20% MoM. Plan compute, storage, and budget."

Growth Forecasting

def forecast(current_load, monthly_growth, months):
    """Return projected load for month 0 (today) through month `months`."""
    return [current_load * (1 + monthly_growth) ** m for m in range(months + 1)]

# Example: 100 req/s today, growing 20% MoM, against a 1,000 req/s capacity limit
CAPACITY_LIMIT = 1000
for month, load in enumerate(forecast(100, 0.20, 18)):
    print(f"Month {month}: {load:.0f} req/s")
    if load > CAPACITY_LIMIT:
        print(f"  ** Need scale-up by month {month}!")
        break
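The month in which compound growth crosses a capacity limit can also be computed directly rather than by looping over a forecast. The sketch below assumes a constant monthly growth rate; the function name `months_until_limit` is illustrative and not part of the original example.

```python
import math

def months_until_limit(current_load: float, monthly_growth: float, limit: float) -> int:
    """Smallest whole month n where current_load * (1 + growth)**n exceeds limit."""
    if current_load > limit:
        return 0
    if monthly_growth <= 0:
        raise ValueError("load never exceeds the limit without positive growth")
    # Closed-form estimate from load * (1 + g)**n = limit, then guard against
    # floating-point rounding at exact boundaries.
    n = max(0, math.ceil(math.log(limit / current_load) / math.log(1 + monthly_growth)))
    while n > 0 and current_load * (1 + monthly_growth) ** (n - 1) > limit:
        n -= 1
    while current_load * (1 + monthly_growth) ** n <= limit:
        n += 1
    return n

if __name__ == "__main__":
    # 100 req/s growing 20% MoM against a 1,000 req/s limit
    print(months_until_limit(100, 0.20, 1000))
```

At 20% MoM, load grows roughly ninefold in a year, so a tenfold limit is crossed shortly after the twelve-month mark.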

Resource Planning

Metric          Current    6mo     12mo
Requests/sec    100        299     892
CPU cores       8          24      72
Memory (GB)     32         96      288
Storage (TB)    1          3       9
Monthly cost    $1,000     $3,000  $9,000
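One way to fill in a table like this is to derive per-request unit costs from the "Current" column and scale them to a target load. The sketch below assumes every resource scales linearly with request rate, which is a simplification (storage in particular often grows with data retained rather than with traffic). The names `BASELINE` and `resources_for` are illustrative.

```python
# Baseline taken from the "Current" column: resources serving 100 req/s today.
BASELINE_RPS = 100
BASELINE = {
    "cpu_cores": 8,
    "memory_gb": 32,
    "storage_tb": 1,
    "monthly_cost_usd": 1000,
}

def resources_for(target_rps: float) -> dict:
    """Scale each baseline resource by target_rps / BASELINE_RPS (linear assumption)."""
    factor = target_rps / BASELINE_RPS
    return {name: amount * factor for name, amount in BASELINE.items()}

if __name__ == "__main__":
    # 500 req/s is an arbitrary target chosen for illustration
    for name, amount in resources_for(500).items():
        print(f"{name}: {amount:g}")
```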

Auto-scaling

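apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  # The Deployment to scale; an HPA is not valid without a scaleTargetRef
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70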
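To see how these settings turn into a replica count, the sketch below applies the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. It is a simplified model that ignores the controller's tolerance band and pod readiness handling. The function name is illustrative.

```python
import math

def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float = 70,
    min_replicas: int = 3,
    max_replicas: int = 20,
) -> int:
    """Simplified HPA rule: scale proportionally to utilization, then clamp."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

if __name__ == "__main__":
    # 5 pods averaging 90% CPU against a 70% target
    print(desired_replicas(5, 90))
```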

Load Testing

brew install bombardier
bombardier -c 100 -d 60s https://api.example.com/health
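When reading load-test results, Little's Law gives a quick sanity check: with a fixed number of connections that each send a new request as soon as the previous one completes, throughput is roughly concurrency divided by mean latency. The sketch below assumes that closed-loop behavior; the 50 ms latency is a made-up figure for illustration.

```python
def closed_loop_throughput(connections: int, mean_latency_ms: float) -> float:
    """Little's Law: throughput (req/s) = concurrency / mean latency (s)."""
    if mean_latency_ms <= 0:
        raise ValueError("mean_latency_ms must be positive")
    return connections * 1000 / mean_latency_ms

if __name__ == "__main__":
    # 100 connections (the -c value above) at a hypothetical 50 ms mean latency
    print(f"{closed_loop_throughput(100, 50):.0f} req/s")
```

If the measured rate is well below this estimate, the load generator or the network may be the bottleneck rather than the service.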

Cost Optimization

| Strategy | Savings |
|----------|---------|
| Reserved instances | 30-60% |
| Spot instances | 60-90% |
| Rightsizing | 20-40% |
| Auto-scaling | 30-50% |

Best Practices

  • Plan for 2-3x headroom above forecast peak load
  • Load test before major launches
  • Use predictive scaling
  • Right-size before applying other cost optimizations
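The headroom guideline translates into a simple calculation: multiply the forecast peak by the headroom factor to get a provisioning target. A minimal sketch follows, with illustrative function names and an arbitrary 500 req/s peak.

```python
def provisioning_range(forecast_peak: float, low: float = 2.0, high: float = 3.0) -> tuple[float, float]:
    """Capacity to provision for a given forecast peak at 2x and 3x headroom."""
    return forecast_peak * low, forecast_peak * high

def peak_utilization(headroom: float) -> float:
    """Fraction of provisioned capacity in use when load hits the forecast peak."""
    return 1 / headroom

if __name__ == "__main__":
    low, high = provisioning_range(500)
    print(f"Provision {low:.0f}-{high:.0f} req/s for a 500 req/s peak")
    print(f"Utilization at peak with 2x headroom: {peak_utilization(2.0):.0%}")
```

With 2x headroom the system runs at about 50% utilization at forecast peak, which leaves room for forecast error and traffic spikes.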

Capacity Planning Process

Capacity planning ensures you have enough resources to meet current and future demand.

The Four Steps

| Step | What You Do | Tools |
|------|-------------|-------|
| Measure | Collect current resource usage | Prometheus, CloudWatch |
| Forecast | Predict future demand | Linear regression, ML |
| Plan | Determine needed resources | Spreadsheets, calculators |
| Execute | Provision resources | Terraform, Kubernetes HPA |

Key Metrics

| Resource | What to Measure | Alert Threshold |
|----------|----------------|----------------|
| CPU | Utilization percentage | > 80% for 5m |
| Memory | Used / Total | > 80% for 5m |
| Disk | Used / Total | > 85% |
| Network | Bandwidth usage | > 70% of max |
| Database | Connection count, query latency | > 80% of max connections |
| API Rate | Requests per second | > 70% of estimated capacity |
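The thresholds above can be expressed as a small check over a window of samples. The sketch below treats a threshold as breached only when every sample in the window exceeds it, which approximates the "for 5m" condition when the window holds five one-minute samples. In practice this logic would usually live in the monitoring system's alert rules; the names here are illustrative.

```python
# Percent-of-capacity thresholds taken from the Key Metrics table
THRESHOLDS = {"cpu": 80.0, "memory": 80.0, "disk": 85.0, "network": 70.0}

def breached(resource: str, samples: list[float]) -> bool:
    """True when every sample in the window exceeds the resource's threshold."""
    return bool(samples) and all(s > THRESHOLDS[resource] for s in samples)

if __name__ == "__main__":
    # Five hypothetical one-minute CPU samples
    print(breached("cpu", [85, 90, 82, 88, 91]))
```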

Forecasting Models

Simple Linear Growth

import numpy as np

def forecast_capacity(historical_data: list[float], days_ahead: int = 90) -> np.ndarray:
    """Fit a straight line to the history and extrapolate it days_ahead days."""
    days = np.arange(len(historical_data))
    values = np.array(historical_data)

    # Least-squares linear fit: values ≈ slope * day + intercept
    slope, intercept = np.polyfit(days, values, 1)

    # Evaluate the fitted line on the days after the last observation
    future_days = np.arange(len(historical_data), len(historical_data) + days_ahead)
    return slope * future_days + intercept

# Example: Daily peak requests over the last 30 days
daily_peak_requests = [
    85000, 82000, 91000, 88000, 95000, 93000, 89000,
    92000, 96000, 94000, 98000, 101000, 97000, 99000,
    102000, 105000, 100000, 103000, 108000, 106000,
    109000, 112000, 107000, 110000, 115000, 113000,
    116000, 118000, 114000, 117000
]

forecast_90 = forecast_capacity(daily_peak_requests, 90)
print(f"Current peak: {daily_peak_requests[-1]:,} req/s")
print(f"Forecast 90d: {forecast_90[-1]:,.0f} req/s")
print(f"Growth: {(forecast_90[-1] / daily_peak_requests[-1] - 1) * 100:.1f}%")
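A linear trend also answers the reverse question: how many days remain until demand reaches a known capacity limit. The sketch below fits the same kind of least-squares line using only the standard library, then solves for the crossing day. The function name and the capacity figures are hypothetical.

```python
import math

def days_until_capacity(history: list[float], capacity: float):
    """Days after the last observation until the fitted line reaches capacity.

    Returns None when the trend is flat or declining.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    # Ordinary least-squares slope and intercept for x = 0, 1, ..., n - 1
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    crossing_day = (capacity - intercept) / slope
    return max(0, math.ceil(crossing_day - (n - 1)))

if __name__ == "__main__":
    # Hypothetical series growing by 10 per day against a limit of 200
    print(days_until_capacity([100, 110, 120, 130], 200))
```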

Auto-Scaling Configuration

# Kubernetes HPA (Horizontal Pod Autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
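The scaleUp policy above lets the replica count grow by at most 100% per 60-second period, so it can double each period. The sketch below estimates how many periods a ramp takes under that policy. It is a simplified model that ignores stabilization windows and metric lag, and the function name is illustrative.

```python
import math

def scale_up_periods(start: int, target: int, max_replicas: int = 20, percent: int = 100) -> int:
    """Policy periods needed to grow from start to target replicas under a Percent policy."""
    if start < 1:
        raise ValueError("start must be at least 1 replica")
    target = min(target, max_replicas)
    replicas, periods = start, 0
    while replicas < target:
        # Each period may add up to `percent` of the current replica count
        allowed = replicas + math.ceil(replicas * percent / 100)
        replicas = min(allowed, target)
        periods += 1
    return periods

if __name__ == "__main__":
    # From minReplicas (3) to maxReplicas (20): 3 -> 6 -> 12 -> 20
    print(scale_up_periods(3, 20))
```

Under these assumptions a full ramp from minimum to maximum takes about three minutes, which is worth comparing against how quickly traffic spikes arrive.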

Cost Optimization

| Strategy | Savings | Implementation |
|----------|---------|----------------|
| Right-sizing | 20-40% | Match instance type to workload |
| Reserved instances | 30-60% | 1-3 year commitments |
| Spot instances | 60-90% | Fault-tolerant, batch workloads |
| Auto-scaling | 20-50% | Scale down during low traffic |
| Storage tiering | 40-60% | Move old data to cold storage |
| Delete unused resources | 5-15% | Find and remove orphaned resources |
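To turn these percentages into a rough dollar range, apply each one only to the portion of spend the strategy affects. The sketch below uses the savings ranges from the table; the spend figure is hypothetical. Savings from different strategies overlap, so the results should not simply be added together.

```python
# Savings ranges (percent) taken from the Cost Optimization table
SAVINGS_PCT = {
    "Right-sizing": (20, 40),
    "Reserved instances": (30, 60),
    "Spot instances": (60, 90),
    "Auto-scaling": (20, 50),
    "Storage tiering": (40, 60),
    "Delete unused resources": (5, 15),
}

def savings_range(affected_spend: float, strategy: str) -> tuple[float, float]:
    """Low and high monthly savings estimate for the spend a strategy applies to."""
    low, high = SAVINGS_PCT[strategy]
    return affected_spend * low / 100, affected_spend * high / 100

if __name__ == "__main__":
    # Hypothetical: $1,000/month of steady compute moved to reserved instances
    low, high = savings_range(1000, "Reserved instances")
    print(f"Estimated savings: ${low:,.0f}-${high:,.0f} per month")
```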

Right-Sizing Script

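#!/bin/bash
# rightsize.sh: identify over-provisioned resources
# Assumes the AWS CLI is configured and GNU date is available (BSD/macOS date lacks -d).

echo "Checking for over-provisioned EC2 instances..."

aws ec2 describe-instances --query 'Reservations[].Instances[?(
  CpuOptions.ThreadsPerCore==`2` &&
  Placement.Tenancy==`default`
)].[InstanceId,InstanceType,State.Name]' --output table

START=$(date -u -d "14 days ago" +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u +%Y-%m-%dT%H:%M:%SZ)

echo ""
echo "Days with average CPU below 10% over the last 14 days, per running instance:"
for id in $(aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].InstanceId' --output text); do
  # Query per instance so low-utilization datapoints can be tied to an instance ID
  low_days=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$id" \
    --period 86400 \
    --start-time "$START" \
    --end-time "$END" \
    --statistics Average \
    --query 'length(Datapoints[?Average<`10`])' \
    --output text)
  echo "$id: $low_days low-utilization day(s)"
done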
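Once per-instance CPU averages have been collected, picking right-sizing candidates is a simple filter. The sketch below assumes the daily averages are already available as a mapping from instance ID to samples; the instance IDs are made up, and the 10% cutoff mirrors the one used in the script.

```python
from statistics import mean

def underutilized(cpu_by_instance: dict[str, list[float]], threshold: float = 10.0) -> list[str]:
    """Return IDs whose mean CPU utilization over the window is below threshold percent."""
    return sorted(
        instance_id
        for instance_id, samples in cpu_by_instance.items()
        if samples and mean(samples) < threshold
    )

if __name__ == "__main__":
    # Hypothetical daily CPU averages (percent) per instance
    samples = {
        "i-aaa": [4.0, 6.0, 5.0],
        "i-bbb": [40.0, 55.0],
        "i-ccc": [],  # no data: not flagged, since absence of data is not low usage
    }
    print(underutilized(samples))
```

Average CPU alone can mislead for memory-bound or bursty workloads, so candidates flagged this way deserve a second look before downsizing.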

Summary

Capacity planning measures current usage, forecasts future demand, and provisions resources accordingly. Auto-scaling and right-sizing optimize costs while maintaining performance.

Key takeaways:

  • Four steps: measure → forecast → plan → execute
  • Track CPU, memory, disk, network, database, API rate
  • Linear regression forecasts future resource needs
  • Kubernetes HPA auto-scales based on CPU/memory utilization
  • Right-sizing: match instance type to actual workload
  • Reserved instances save 30-60%, spot saves 60-90%
  • Auto-scaling handles traffic spikes automatically
  • Regular audits prevent wasted resources

What's Next: Chaos Engineering

The next chapter covers chaos engineering.
