Capacity Planning
🔥 Vibe Prompt
"Forecast capacity needs for a service growing 20% MoM. Plan compute, storage, and budget."
Growth Forecasting
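def forecast(current_capacity, monthly_growth, months):
    """Project demand under compound monthly growth, starting at month 0."""
    return [current_capacity * (1 + monthly_growth) ** m for m in range(months)]

# Example: 100 req/s, growing 20% MoM.
# Forecast 15 months so the 1,000 req/s threshold (first exceeded at month 13)
# falls inside the window.
for month, capacity in enumerate(forecast(100, 0.20, 15)):
    print(f"Month {month}: {capacity:.0f} req/s")
    if capacity > 1000:
        print(f"  ** Need scale-up at month {month}!")
        break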
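The month at which compound growth crosses a threshold can also be computed directly instead of by looping. The sketch below assumes steady compound growth with a positive rate; `months_until` is a hypothetical helper name, not part of the original example.

```python
import math

def months_until(current: float, target: float, monthly_growth: float) -> int:
    """Return the first month index at which demand reaches or exceeds `target`.

    Solves current * (1 + g) ** m >= target for the smallest integer m.
    Assumes monthly_growth > 0.
    """
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    if current >= target:
        return 0
    return math.ceil(math.log(target / current) / math.log(1 + monthly_growth))

# 100 req/s growing 20% MoM reaches 1,000 req/s at month 13
print(months_until(100, 1000, 0.20))
```

Knowing the crossing month in advance helps set the forecast horizon long enough to include it.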
Resource Planning
| Metric | Current | 6mo | 12mo |
|--------|---------|-----|------|
| Requests/sec | 100 | 299 | 892 |
| CPU cores | 8 | 24 | 72 |
| Memory (GB) | 32 | 96 | 288 |
| Storage (TB) | 1 | 3 | 9 |
| Monthly cost | $1,000 | $3,000 | $9,000 |
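The arithmetic behind a projection like this can be sketched as follows. It assumes, as a simplification, that every resource scales in proportion to request rate at 20% MoM; real services rarely scale this evenly, and planning tables usually round the results. The function name and dictionary keys are illustrative.

```python
import math

def scale_resources(baseline: dict, monthly_growth: float, months: int) -> dict:
    """Scale each baseline resource by the compound growth factor.

    Assumption: every resource grows in proportion to request rate.
    """
    factor = (1 + monthly_growth) ** months
    return {name: value * factor for name, value in baseline.items()}

# Current values taken from the "Current" column
baseline = {
    "requests_per_sec": 100,
    "cpu_cores": 8,
    "memory_gb": 32,
    "storage_tb": 1,
    "monthly_cost_usd": 1000,
}

for horizon in (6, 12):
    projected = scale_resources(baseline, 0.20, horizon)
    summary = ", ".join(f"{name}={value:.1f}" for name, value in projected.items())
    print(f"{horizon} months: {summary}")
```

Rounding up to whole cores or standard instance sizes is a separate step applied to these raw figures.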
Auto-scaling
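# Excerpt: metadata and scaleTargetRef are omitted here;
# the full manifest appears under "Auto-Scaling Configuration".
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70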
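To see what a 70% CPU target means in practice, the sketch below applies the replica formula from the Kubernetes HPA documentation, `desired = ceil(current * currentMetric / targetMetric)`, and clamps the result to the configured bounds. It is a simplification: the real controller also applies a tolerance band and special handling for pods that are not ready, both omitted here.

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float = 70,
                     min_replicas: int = 3, max_replicas: int = 20) -> int:
    """Simplified HPA calculation, clamped to minReplicas/maxReplicas."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(4, 90))    # 4 * 90/70 = 5.14, rounded up to 6
print(desired_replicas(4, 30))    # 1.71 rounds up to 2, raised to minReplicas (3)
print(desired_replicas(10, 200))  # 28.57 rounds up to 29, capped at maxReplicas (20)
```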
Load Testing
brew install bombardier
bombardier -c 100 -d 60s https://api.example.com/health
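When reading the results, it helps to know the throughput ceiling that a fixed number of connections imposes. The sketch below uses Little's law and assumes a closed-loop test in which each connection waits for a response before sending its next request; the 50 ms latency is a made-up figure for illustration.

```python
def max_throughput(connections: int, mean_latency_ms: float) -> float:
    """Upper bound on requests/sec for a closed-loop load test.

    Little's law: concurrency = throughput * latency,
    so throughput = concurrency / latency.
    """
    return connections * 1000 / mean_latency_ms

# 100 connections at a hypothetical 50 ms mean latency
print(max_throughput(100, 50))
```

If the measured rate sits at this ceiling, the test is limited by the connection count rather than by the service, and raising `-c` is the next step.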
Cost Optimization
| Strategy | Savings |
|----------|---------|
| Reserved instances | 30-60% |
| Spot instances | 60-90% |
| Right-sizing | 20-40% |
| Auto-scaling | 20-50% |
Best Practices
- Provision 2-3x headroom above forecast peak demand
- Load test before major launches
- Use predictive scaling where traffic follows a regular pattern
- Right-size instances before applying other cost optimizations
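The headroom rule translates into an instance count once per-instance capacity is known. In this sketch the per-instance figure of 150 req/s is hypothetical and would normally come from a load test; 892 req/s is the 12-month forecast used earlier.

```python
import math

def instances_needed(forecast_peak: float, per_instance_capacity: float,
                     headroom: float = 2.0) -> int:
    """Instances required to serve the forecast peak with a headroom multiplier."""
    return math.ceil(forecast_peak * headroom / per_instance_capacity)

# 892 req/s forecast, hypothetical 150 req/s per instance
print(instances_needed(892, 150, headroom=2.0))  # 1784 / 150 = 11.89, so 12
print(instances_needed(892, 150, headroom=3.0))  # 2676 / 150 = 17.84, so 18
```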
Capacity Planning Process
Capacity planning ensures you have enough resources to meet current and future demand.
The Four Steps
| Step | What You Do | Tools |
|------|-------------|-------|
| Measure | Collect current resource usage | Prometheus, CloudWatch |
| Forecast | Predict future demand | Linear regression, ML |
| Plan | Determine needed resources | Spreadsheets, calculators |
| Execute | Provision resources | Terraform, Kubernetes HPA |
Key Metrics
| Resource | What to Measure | Alert Threshold |
|----------|-----------------|-----------------|
| CPU | Utilization percentage | > 80% for 5m |
| Memory | Used / Total | > 80% for 5m |
| Disk | Used / Total | > 85% |
| Network | Bandwidth usage | > 70% of max |
| Database | Connection count, query latency | > 80% of max connections |
| API Rate | Requests per second | > 70% of estimated capacity |
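A threshold such as "> 80% for 5m" fires only when the breach is sustained, not on a single spike. The sketch below shows one way to express that check, assuming one sample per minute; the sample values and the `sustained_breach` name are illustrative, and in practice this logic usually lives in the alerting rules of the monitoring system.

```python
# Percent thresholds taken from the table above
THRESHOLDS = {"cpu": 80.0, "memory": 80.0, "disk": 85.0, "network": 70.0}

def sustained_breach(samples: list, threshold: float, window: int) -> bool:
    """True if each of the last `window` samples exceeds `threshold`."""
    if len(samples) < window:
        return False
    return all(sample > threshold for sample in samples[-window:])

cpu_sustained = [72, 81, 84, 88, 86, 83]  # hypothetical one-minute samples
cpu_spiky = [85, 79, 90, 91, 92]          # one dip below 80% resets the alert

print(sustained_breach(cpu_sustained, THRESHOLDS["cpu"], window=5))  # True
print(sustained_breach(cpu_spiky, THRESHOLDS["cpu"], window=5))      # False
```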
Forecasting Models
Simple Linear Growth
import numpy as np

def forecast_capacity(historical_data: list, days_ahead: int = 90):
    """Simple linear forecast for capacity planning."""
    days = np.arange(len(historical_data))
    values = np.array(historical_data)
    # Fit a straight line to the history (least-squares linear regression)
    slope, intercept = np.polyfit(days, values, 1)
    # Extrapolate the fitted line over the next `days_ahead` days
    future_days = np.arange(len(historical_data), len(historical_data) + days_ahead)
    forecast = slope * future_days + intercept
    return forecast

# Example: daily peak request rate (req/s) over the last 30 days
daily_peak_requests = [
    85000, 82000, 91000, 88000, 95000, 93000, 89000,
    92000, 96000, 94000, 98000, 101000, 97000, 99000,
    102000, 105000, 100000, 103000, 108000, 106000,
    109000, 112000, 107000, 110000, 115000, 113000,
    116000, 118000, 114000, 117000
]

forecast_90 = forecast_capacity(daily_peak_requests, 90)
print(f"Current peak: {daily_peak_requests[-1]:,} req/s")
print(f"Forecast 90d: {forecast_90[-1]:,.0f} req/s")
print(f"Growth: {(forecast_90[-1] / daily_peak_requests[-1] - 1) * 100:.1f}%")
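A common follow-up question is how long until the trend reaches a known capacity limit. The sketch below answers it with the same least-squares line, written in plain Python so it runs without NumPy. The helper names and the short history are illustrative, and a linear trend will underestimate demand that compounds.

```python
import math

def fit_line(values: list) -> tuple:
    """Least-squares slope and intercept for values observed on days 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def days_until_limit(values: list, limit: float):
    """Days after the last observation until the fitted line reaches `limit`.

    Returns 0 if the line is already at or above the limit, and None if the
    trend is flat or declining (the limit is never reached).
    """
    slope, intercept = fit_line(values)
    if slope <= 0:
        return None
    crossing_day = (limit - intercept) / slope
    return max(0, math.ceil(crossing_day - (len(values) - 1)))

history = [100, 110, 120, 130]         # hypothetical daily peaks, +10 per day
print(days_until_limit(history, 200))  # 7
```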
Auto-Scaling Configuration
# Kubernetes HPA: Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
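The `scaleUp` policy above lets the replica count grow by at most 100% per 60-second period, so a large spike takes several periods to absorb. The sketch below models only that rate limit, assuming the metrics keep demanding the target count; it ignores the stabilization window and the rest of the controller's logic.

```python
import math

def scale_up_steps(start: int, target: int, percent: int = 100,
                   max_replicas: int = 20) -> list:
    """Replica count after each period when growth is capped at `percent` per period."""
    if start < 1 or percent <= 0:
        raise ValueError("start must be >= 1 and percent must be positive")
    goal = min(target, max_replicas)
    counts = [start]
    while counts[-1] < goal:
        allowed = math.ceil(counts[-1] * (1 + percent / 100))
        counts.append(min(goal, allowed))
    return counts

# From minReplicas (3) to maxReplicas (20) takes three periods
print(scale_up_steps(3, 20))  # [3, 6, 12, 20]
```

With `periodSeconds: 60`, that is roughly three minutes to go from minimum to maximum, which is worth comparing against how quickly traffic spikes arrive.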
Cost Optimization
| Strategy | Savings | Implementation |
|----------|---------|----------------|
| Right-sizing | 20-40% | Match instance type to workload |
| Reserved instances | 30-60% | 1-3 year commitments |
| Spot instances | 60-90% | Fault-tolerant, batch workloads |
| Auto-scaling | 20-50% | Scale down during low traffic |
| Storage tiering | 40-60% | Move old data to cold storage |
| Delete unused resources | 5-15% | Find and remove orphaned resources |
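To turn these percentages into dollar figures, apply each range to the portion of spend the strategy actually covers. The sketch below is illustrative only: the $9,000 monthly spend is a placeholder, and the ranges cannot simply be summed across strategies because they overlap.

```python
# Savings ranges (percent) for three strategies from the table above
STRATEGIES = {
    "Right-sizing": (20, 40),
    "Reserved instances": (30, 60),
    "Spot instances": (60, 90),
}

def savings_range(covered_spend: float, low_pct: float, high_pct: float) -> tuple:
    """Low and high monthly savings for the spend a strategy applies to."""
    return covered_spend * low_pct / 100, covered_spend * high_pct / 100

covered_spend = 9000  # hypothetical monthly spend eligible for the strategy
for name, (low_pct, high_pct) in STRATEGIES.items():
    low, high = savings_range(covered_spend, low_pct, high_pct)
    print(f"{name}: ${low:,.0f} to ${high:,.0f} per month")
```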
Right-Sizing Script
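#!/bin/bash
# rightsize.sh: Identify over-provisioned EC2 instances
# Requires the AWS CLI and GNU date (the -d flag is not available in BSD/macOS date).

START=$(date -u -d "14 days ago" +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u +%Y-%m-%dT%H:%M:%SZ)

echo "Running EC2 instances:"
aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].[InstanceId,InstanceType,State.Name]' \
  --output table

echo ""
echo "Days in the last 14 with average CPU below 10%, per instance:"
# Query each instance separately; without --dimensions the metric is not per-instance.
for id in $(aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].InstanceId' --output text); do
  echo "== $id"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$id" \
    --period 86400 \
    --start-time "$START" \
    --end-time "$END" \
    --statistics Average \
    --query 'Datapoints[?Average<`10`].[Timestamp,Average]' \
    --output text
done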
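Once daily CPU averages have been collected per instance, choosing candidates is a simple filter. The sketch below assumes the data has already been loaded into a dictionary; the instance IDs, the 10% threshold, and the 14-day minimum are placeholders, and CPU alone is not enough to justify a downsize without also checking memory and I/O.

```python
def rightsizing_candidates(daily_cpu_avg: dict, threshold: float = 10.0,
                           min_days: int = 14) -> list:
    """Instance IDs whose daily average CPU stayed below `threshold` every day."""
    return sorted(
        instance_id
        for instance_id, days in daily_cpu_avg.items()
        if len(days) >= min_days and max(days) < threshold
    )

# Hypothetical data: 14 daily CPU averages (percent) per instance
usage = {
    "i-aaaa": [4.0] * 14,          # idle throughout: candidate
    "i-bbbb": [5.0] * 13 + [45.0], # one busy day: keep
    "i-cccc": [3.0] * 7,           # too little history: skip
}
print(rightsizing_candidates(usage))  # ['i-aaaa']
```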
Summary
Capacity planning measures current usage, forecasts future demand, and provisions resources accordingly. Auto-scaling and right-sizing optimize costs while maintaining performance.
Key takeaways:
- Four steps: measure → forecast → plan → execute
- Track CPU, memory, disk, network, database, API rate
- Linear regression forecasts future resource needs
- Kubernetes HPA auto-scales based on CPU/memory utilization
- Right-sizing: match instance type to actual workload
- Reserved instances save 30-60%, spot saves 60-90%
- Auto-scaling handles traffic spikes automatically
- Regular audits prevent wasted resources
What's Next: Chaos Engineering
The next chapter covers chaos engineering.