Rate Limiting Vulnerabilities
Why Rate Limiting Matters
Imagine a login endpoint with no rate limiting. An attacker can try 10,000 passwords per minute, and given enough time they will find the correct one. This is called a brute-force attack. Without rate limiting, every authentication endpoint becomes a slot machine where the attacker can pull the lever thousands of times per minute.
Rate limiting is the API's first line of defense against automated abuse. It helps mitigate:
- Brute-force attacks: Repeated login attempts
- Credential stuffing: Trying leaked username/password pairs from other breaches
- DDoS attacks: Overwhelming the server with requests (application-layer floods; volumetric attacks also need network-level defenses)
- Web scraping: Automated data extraction
- Inventory hoarding: Bots reserving limited stock
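To put the brute-force scenario in numbers, here is a small back-of-the-envelope calculation. The 10,000 attempts per minute figure comes from the text above; the 6-digit code space and the 5 attempts per minute cap are assumptions chosen only for illustration.

```python
def minutes_to_exhaust(keyspace: int, attempts_per_minute: float) -> float:
    """Worst-case minutes needed to try every candidate exactly once."""
    return keyspace / attempts_per_minute


PIN_SPACE = 10 ** 6  # assumption: a 6-digit numeric code (1,000,000 candidates)

unlimited = minutes_to_exhaust(PIN_SPACE, 10_000)  # rate quoted in the text
limited = minutes_to_exhaust(PIN_SPACE, 5)         # hypothetical 5 attempts/min cap

print(f"No limit:  {unlimited:.0f} minutes")            # No limit:  100 minutes
print(f"5 per min: {limited / (60 * 24):.1f} days")     # 5 per min: 138.9 days
```

A limit does not make guessing impossible, but it stretches the attack from under two hours to months, which gives monitoring and lockout policies time to react.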
Why this matters for your career:
- Missing rate limiting is one of the most common bug bounty findings
- Rate limiting design is a standard interview topic for backend engineers
- Understanding bypass techniques helps you build more robust systems
- Rate limiting is a key component of any production API gateway
What Is Rate Limiting?
Rate limiting controls how many requests a client can make to an API within a specific time window. Think of it like a bouncer at a club who only lets in 50 people per hour — once the limit is reached, everyone else waits.
Rate Limiting Strategies
| Algorithm | How It Works | Best For |
|:----------|:-------------|:---------|
| Token Bucket | Tokens refill at a fixed rate; each request consumes one token. Bursts allowed up to bucket capacity. | APIs with varying traffic patterns |
| Sliding Window Log | Tracks timestamps of each request in a window. Rejects if count exceeds limit. | Precise enforcement needed |
| Sliding Window Counter | Combines current and previous window counts for smooth rate limiting. | Production APIs (Redis) |
| Fixed Window | Resets counter at each window boundary. Simple but allows bursts at boundaries. | Non-critical endpoints |
| Leaky Bucket | Processes requests at a constant rate. Queues excess requests. | Stable throughput required |
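As an illustration of the first row, here is a minimal token bucket sketch. The class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example, not part of any particular library; the clock argument exists only so the behavior can be shown without real waiting.

```python
import time


class TokenBucket:
    """Minimal token bucket: refills continuously, one token per request."""

    def __init__(self, capacity: int, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.clock = clock                # injectable for deterministic demos
        self.tokens = float(capacity)     # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.last_refill
        # Add tokens earned since the last call, capped at capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Demo with a fake clock: a burst of 3 is allowed, the 4th request is rejected
fake_now = [0.0]
bucket = TokenBucket(capacity=3, refill_rate=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]
fake_now[0] += 2.0  # two seconds later, two tokens have been refilled
after_wait = [bucket.allow() for _ in range(3)]
print(burst)       # [True, True, True, False]
print(after_wait)  # [True, True, False]
```

The capacity controls how large a burst is tolerated, while the refill rate sets the sustained throughput; tuning them separately is what makes this algorithm suit uneven traffic.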
Client IP vs User-Based Rate Limiting
| Key | Pros | Cons |
|:----|:-----|:-----|
| Client IP | Simple, no auth required | Shared IPs (NAT, office VPNs) punished; attackers rotate IPs |
| User ID (JWT sub) | Per-user fairness | Requires authentication |
| API Key | Per-application accountability | Attacker can rotate API keys |
| Hybrid (IP + User ID) | Best protection: blocks IP rotation and enforces per-user limits | Most complex to implement |
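The hybrid row can be sketched as a limiter that charges each request against two budgets and rejects it if either is exhausted. HybridLimiter and its parameters are hypothetical names for this example; it keeps state in memory and uses a simple timestamp log, so treat it as an illustration of the keying idea rather than a production design. On an unauthenticated login endpoint, the targeted username could stand in for the user ID.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class HybridLimiter:
    """Allow a request only if both the per-IP and per-user budgets have room."""

    def __init__(self, ip_limit: int, user_limit: int, window_seconds: float,
                 clock=time.monotonic):
        self.ip_limit = ip_limit
        self.user_limit = user_limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def _count(self, key: str, now: float) -> int:
        q = self.hits[key]
        while q and q[0] <= now - self.window:  # drop entries outside the window
            q.popleft()
        return len(q)

    def allow(self, client_ip: str, user_id: Optional[str] = None) -> bool:
        now = self.clock()
        checks = [(f"ip:{client_ip}", self.ip_limit)]
        if user_id is not None:
            checks.append((f"user:{user_id}", self.user_limit))
        if any(self._count(key, now) >= limit for key, limit in checks):
            return False
        for key, _ in checks:  # record the request against every budget
            self.hits[key].append(now)
        return True


# Rotating IPs does not help once the per-user budget is spent
limiter = HybridLimiter(ip_limit=10, user_limit=2, window_seconds=60)
results = [limiter.allow(f"198.51.100.{i}", "alice") for i in range(1, 4)]
print(results)  # [True, True, False]
```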
How to Test Rate Limiting
Step 1: Basic Rate Limit Check
Send a burst of requests and observe when the API starts returning 429 (Too Many Requests):
```python
import requests

BASE_URL = "https://api.target.com/login"


def test_rate_limit(total: int = 50) -> None:
    """Send a burst of rapid requests and count 429 responses."""
    data = {"username": "test", "password": "wrong"}
    rate_limited_count = 0
    accepted_count = 0  # processed normally (200 or 401), i.e. not throttled

    for i in range(total):
        # json= sets the Content-Type header; timeout avoids hanging forever
        r = requests.post(BASE_URL, json=data, timeout=10)
        if r.status_code == 429:
            rate_limited_count += 1
        elif r.status_code in (200, 401):
            accepted_count += 1

        # Print every 10th request to track progress
        if i % 10 == 9:
            print(f"Request {i + 1}/{total}: {accepted_count} accepted, "
                  f"{rate_limited_count} rate-limited")

    print(f"\nResult: {accepted_count} accepted, {rate_limited_count} rate-limited")
    # Rough heuristic: a strict limiter should reject most of a rapid burst
    if rate_limited_count == 0:
        print("[!] No rate limiting detected: high risk!")
    elif rate_limited_count < total * 0.8:
        print("[!] Weak rate limiting: attacker can still brute force slowly")
    else:
        print("[+] Strong rate limiting in place")


if __name__ == "__main__":
    test_rate_limit()
```
Step 2: Bypass Attempts
Try these common bypass techniques:
```bash
# 1. Spoof the client IP with different X-Forwarded-For headers
curl -X POST https://api.target.com/login -H "X-Forwarded-For: 1.1.1.1"
curl -X POST https://api.target.com/login -H "X-Forwarded-For: 2.2.2.2"
curl -X POST https://api.target.com/login -H "X-Forwarded-For: 3.3.3.3"

# 2. Slow drip: stay under the limit
# Instead of 100 req/min, try 1 req/sec for 100 seconds
for i in $(seq 1 100); do
  # Discard the body but print the status code so a 429 is visible
  curl -s -o /dev/null -w "%{http_code}\n" https://api.target.com/search
  sleep 1
done

# 3. Use multiple endpoints (each may have separate limits)
curl https://api.target.com/login
curl https://api.target.com/api/auth
curl https://api.target.com/v2/authenticate

# 4. Check if the limit resets by waiting (watch for 429 turning back into 200)
for i in $(seq 1 30); do
  curl -s -o /dev/null -w "%{http_code}\n" https://api.target.com/api/resource
  sleep 2
done
```
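The first bypass check can also be automated. The sketch below is one possible approach: it sends requests until a 429 appears, then retries once with a spoofed X-Forwarded-For header to see whether the limiter can be fooled. The function names, the spoofed address, and the request body are assumptions for illustration; the network call is isolated in make_requests_sender so the decision logic can be exercised offline. Only run it against systems you are authorized to test.

```python
from typing import Callable, Optional

SPOOFED_IP = "203.0.113.7"  # arbitrary documentation-range address


def xff_bypass_suspected(send: Callable[[Optional[str]], int],
                         max_probe: int = 50) -> Optional[bool]:
    """Return True if a spoofed X-Forwarded-For header escapes throttling.

    send(xff) performs one request (without the header when xff is None)
    and returns the HTTP status code. Returns None if no 429 was ever seen.
    """
    for _ in range(max_probe):
        if send(None) == 429:
            break
    else:
        return None  # never throttled, so there is nothing to bypass
    # We are throttled now; does a fake client IP get us a fresh budget?
    return send(SPOOFED_IP) != 429


def make_requests_sender(url: str) -> Callable[[Optional[str]], int]:
    """Build a send() function backed by the requests library."""
    import requests  # imported lazily so the logic above works without it

    def send(xff: Optional[str]) -> int:
        headers = {"X-Forwarded-For": xff} if xff else {}
        body = {"username": "test", "password": "wrong"}
        return requests.post(url, json=body, headers=headers, timeout=10).status_code

    return send
```

A call such as xff_bypass_suspected(make_requests_sender("https://api.target.com/login")) returning True suggests the server trusts a client-supplied header when choosing the rate limit key.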
Step 3: Check Response Headers
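Well-configured APIs return rate limit info in headers. Names and semantics vary: the IETF draft RateLimit-* fields express the reset as seconds remaining, the older X-RateLimit-* convention often uses a Unix timestamp, and Retry-After normally accompanies a 429 response. Examples of each:
```
RateLimit-Limit: 100
RateLimit-Remaining: 42
RateLimit-Reset: 3600

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1700000000

Retry-After: 57
```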
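When testing, it helps to turn these headers into a single "how long until I can retry" number. The helper below is a sketch under stated assumptions: the name seconds_to_wait is made up for this example, it only handles the numeric form of Retry-After (not the HTTP-date form), and it guesses that a very large X-RateLimit-Reset value is a Unix timestamp, which is a heuristic rather than a rule.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str],
                    now: Optional[float] = None) -> Optional[float]:
    """Estimate how many seconds to wait, or None if no usable header exists."""
    now = time.time() if now is None else now
    h = {name.lower(): value for name, value in headers.items()}

    # Retry-After in its delta-seconds form takes priority
    retry_after = h.get("retry-after", "").strip()
    if retry_after.isdigit():
        return float(retry_after)

    # Draft-style RateLimit-Reset: seconds remaining in the window
    if "ratelimit-reset" in h:
        return float(h["ratelimit-reset"])

    # X-RateLimit-Reset: often a Unix timestamp, sometimes delta seconds
    if "x-ratelimit-reset" in h:
        value = float(h["x-ratelimit-reset"])
        # Heuristic (assumption): values this large are epoch timestamps
        return max(0.0, value - now) if value > 1_000_000_000 else value

    return None


print(seconds_to_wait({"Retry-After": "57"}))                               # 57.0
print(seconds_to_wait({"X-RateLimit-Reset": "1700000000"}, now=1699999940))  # 60.0
```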
Implementing Rate Limiting in FastAPI
```python
import time
from collections import defaultdict

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()


# In-memory rate limiter (for demo only; use Redis in production)
class RateLimiter:
    def __init__(self):
        self.requests = defaultdict(list)

    def is_allowed(self, key: str, max_requests: int, window_seconds: int) -> bool:
        now = time.time()
        window_start = now - window_seconds
        # Clean old entries
        self.requests[key] = [t for t in self.requests[key] if t > window_start]
        if len(self.requests[key]) >= max_requests:
            return False
        self.requests[key].append(now)
        return True


rate_limiter = RateLimiter()


@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # Identify client by IP or API key (request.client may be None)
    client_ip = request.client.host if request.client else "unknown"
    api_key = request.headers.get("X-API-Key")
    # Use API key if available, otherwise fall back to IP.
    # NOTE: validate the key in real code; an unchecked header lets an
    # attacker obtain a fresh budget with every made-up key.
    client = api_key or client_ip

    # Different limits for different endpoints
    path = request.url.path
    if "/login" in path:
        group, max_req, window = "login", 5, 60    # 5 requests per minute for login
    elif "/api/" in path:
        group, max_req, window = "api", 100, 60    # 100 requests per minute for API
    else:
        group, max_req, window = "general", 30, 60  # 30 requests per minute for general

    # Separate counter per endpoint group so API traffic cannot exhaust the login budget
    key = f"{group}:{client}"

    if not rate_limiter.is_allowed(key, max_req, window):
        # Return the response directly: an HTTPException raised inside
        # middleware is not converted to a 429 by FastAPI's exception handlers.
        return JSONResponse(
            status_code=429,
            content={"detail": "Too many requests. Please try again later."},
            headers={"Retry-After": str(window)},
        )

    response = await call_next(request)
    response.headers["X-RateLimit-Limit"] = str(max_req)
    response.headers["X-RateLimit-Remaining"] = str(
        max(0, max_req - len(rate_limiter.requests.get(key, [])))
    )
    return response


@app.get("/api/resource")
async def get_resource():
    return {"message": "This endpoint is rate limited"}


@app.post("/login")
async def login():
    return {"message": "Login endpoint, aggressively rate limited"}
```
Production: Redis-Based Rate Limiting
```python
import math
import time
import uuid

import redis

redis_client = redis.Redis(host="localhost", port=6379, decode_responses=True)


def sliding_window_rate_limit(user_id: str, max_requests: int, window: int) -> tuple[bool, int]:
    """
    Sliding window rate limiter using Redis sorted sets.

    Note: the count-then-add sequence is not atomic; under heavy concurrency
    wrap it in a Lua script or MULTI/EXEC transaction.

    Args:
        user_id: Unique identifier for the client
        max_requests: Maximum requests allowed in the window
        window: Time window in seconds
    Returns:
        (allowed, retry_after): allowed is True if the request may proceed;
        retry_after is the number of seconds to wait when rejected (0 otherwise)
    """
    now = time.time()
    key = f"ratelimit:{user_id}"
    window_start = now - window

    # Remove entries outside the window
    redis_client.zremrangebyscore(key, 0, window_start)

    # Count requests in the window
    current_count = redis_client.zcard(key)
    if current_count >= max_requests:
        # Get the oldest entry's timestamp to calculate retry-after
        oldest = redis_client.zrange(key, 0, 0, withscores=True)
        if oldest:
            retry_after = max(1, math.ceil(window - (now - oldest[0][1])))
            return False, retry_after
        return False, window

    # Add current request; the random suffix keeps members unique so two
    # requests with the same timestamp do not overwrite each other
    member = f"{now}:{uuid.uuid4().hex}"
    redis_client.zadd(key, {member: now})
    redis_client.expire(key, window)
    return True, 0
```
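The sorted-set code above is a sliding window log: it stores one entry per request. The sliding window counter listed in the strategies table trades a little precision for constant memory per client by weighting the previous window's count. The sketch below shows that estimate in plain Python rather than Redis; the class name, the in-memory dictionary, and the injectable clock are assumptions for illustration, and old windows are never pruned here.

```python
import time
from collections import defaultdict


class SlidingWindowCounter:
    """Approximate sliding window using two fixed-window counters per key."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.counts = defaultdict(int)  # (key, window_index) -> accepted requests

    def allow(self, key: str) -> bool:
        now = self.clock()
        index = int(now // self.window)
        # Fraction of the current window that has already elapsed (0.0 to 1.0)
        elapsed_fraction = (now % self.window) / self.window
        previous = self.counts.get((key, index - 1), 0)
        current = self.counts.get((key, index), 0)
        # Weight the previous window by the part of it still inside the
        # sliding window that ends at "now"
        estimated = previous * (1 - elapsed_fraction) + current
        if estimated >= self.limit:
            return False
        self.counts[(key, index)] += 1
        return True


# Demo with a fake clock: a full quota at 0:59 leaves nothing available at 1:00
fake_now = [59.0]
limiter = SlidingWindowCounter(limit=10, window_seconds=60, clock=lambda: fake_now[0])
first = sum(limiter.allow("alice") for _ in range(12))
fake_now[0] = 60.0
at_boundary = limiter.allow("alice")
fake_now[0] = 90.0  # half the previous window has slid out
later = sum(limiter.allow("alice") for _ in range(10))
print(first, at_boundary, later)  # 10 False 5
```

In Redis, the two counters would typically be plain integer keys incremented with INCR and given an expiry, which is why this variant is popular for high-traffic APIs.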
Rate Limiting Bypass Techniques
| Bypass Method | How It Works | Detection | Prevention |
|:--------------|:-------------|:----------|:-----------|
| IP rotation | Attacker uses proxies/VPN to rotate IPs | Monitor unique IPs per user in a short window | Use user-based (JWT sub) rate limiting, not just IP |
| Header spoofing | Attacker sets X-Forwarded-For to a fake IP | Validate the header chain: trust only the address appended by your last trusted proxy, not values supplied by the client | Configure the reverse proxy (NGINX) to strip or overwrite untrusted headers |
| Slow drip | Stay just under the rate limit over a long period | Monitor sustained low-rate patterns | Set cumulative daily/weekly limits in addition to per-minute limits |
| Multiple endpoints | Each endpoint may have a separate limit | Correlate request patterns across endpoints | Use a global rate limiter shared across all endpoints |
| Distributed attack | Botnet from many IPs simultaneously | Statistical anomaly detection on user behavior | Use CAPTCHA after suspicious patterns |
| Reset exploitation | Wait for window reset and immediately send burst | Track reset timing patterns | Use sliding window instead of fixed window |
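For the header spoofing row, one common way to validate the chain is to walk X-Forwarded-For from right to left and take the first address that does not belong to a trusted proxy. The sketch below assumes the deployment knows its proxy networks; the function name and the example addresses are hypothetical.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip_from_xff(peer_ip: str, xff_header: Optional[str],
                       trusted_proxies: Iterable[str]) -> str:
    """Pick the rate-limit IP, ignoring X-Forwarded-For unless a trusted proxy sent it."""
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False  # unparsable entries are never trusted
        return any(ip in net for net in trusted)

    # A direct connection from an untrusted peer: the header is attacker-controlled
    if not is_trusted(peer_ip) or not xff_header:
        return peer_ip

    hops = [hop.strip() for hop in xff_header.split(",") if hop.strip()]
    # Entries on the right were appended by our own proxies; the first
    # untrusted one from the right is the address our proxy actually saw
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return peer_ip  # every hop was one of our proxies


# The attacker sent "1.1.1.1"; the trusted proxy appended the real address
print(client_ip_from_xff("10.0.0.5", "1.1.1.1, 198.51.100.23", ["10.0.0.0/8"]))
# 198.51.100.23
```

Taking the left-most entry instead would hand the attacker control of the rate limit key, which is exactly what the X-Forwarded-For bypass relies on.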
Common Pitfalls
| Mistake | Why It Is Dangerous | Fix |
|:--------|:-------------------|:----|
| Rate limiting by IP only | Office NAT, VPN, or cloud IP pools share one limit | Combine IP + user-based identification |
| Fixed window at natural boundaries (per minute) | Requests rush in at the :00 mark, and a client can spend one full quota just before the boundary and another just after | Use sliding window or randomize window start |
| No rate limiting on read endpoints | Attackers can scrape all data via GET endpoints | Rate limit all endpoints, not just writes |
| Exposing the exact reset time of a fixed window | Attacker synchronizes attacks with resets | Use sliding window (no fixed reset) |
| Rate limit too high | 1000 req/min on a login endpoint still allows about 1.44 million guesses per day | Start aggressive: 5 req/min for auth, 30 req/min for API |
| Not counting error responses toward the limit | Failed attempts never consume quota, so the attacker brute-forces without triggering limits | Count all requests, including those that end in 4xx and 5xx |
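The fixed window pitfall is easy to reproduce. The toy limiter below (the class name and the explicit now argument are assumptions for this demonstration) accepts twice its nominal limit when requests straddle a window boundary.

```python
class FixedWindow:
    """Toy fixed window limiter; 'now' is passed in to keep the demo deterministic."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.index = None  # which window the counter currently belongs to
        self.count = 0

    def allow(self, now: float) -> bool:
        index = int(now // self.window)
        if index != self.index:  # crossed a boundary: counter resets to zero
            self.index, self.count = index, 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


limiter = FixedWindow(limit=5, window_seconds=60)
before = sum(limiter.allow(59.0) for _ in range(10))  # last second of window 0
after = sum(limiter.allow(60.0) for _ in range(10))   # first second of window 1
print(before + after)  # 10 accepted within about one second, twice the limit
```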
Summary
Rate limiting is not optional — it is a fundamental security control that every production API must implement.
- What rate limiting is: Controlling request volume per client within a time window
- Why it matters: Prevents brute-force, credential stuffing, scraping, and DDoS
- How to implement: Token bucket for bursts, sliding window for precision, Redis for distributed systems
- How to test: Send bursts looking for 429 responses, try bypass techniques (IP rotation, slow drip, header spoofing)
- How to bypass: IP rotation, header manipulation, slow sustained rates, distributed attacks
- Best practices: Use Redis sliding window, combine IP + user rate limiting, set aggressive limits on auth endpoints
What Is Next: Full API Pentest Report
Now that you understand every major API vulnerability — from SQL injection to JWT attacks to rate limiting bypass — the final chapter ties everything together into a complete penetration testing workflow. You will learn how to chain vulnerabilities for maximum impact, write professional penetration test reports, and communicate findings to development teams and stakeholders.