Probability Analysis of Randomized Algorithms

Why Probability Analysis Matters

Randomized algorithms introduce uncertainty — but we need to reason rigorously about their performance. Probability analysis gives us the mathematical tools to prove that a randomized algorithm will be fast "with high probability" or "in expectation."

Why this matters for your career:

  • Interviewers ask about expected vs. worst-case runtime of randomized algorithms
  • Understanding probability bounds is essential for machine learning theory
  • Data science relies on confidence intervals, p-values, and hypothesis tests
  • Proving algorithm guarantees separates junior from senior engineers

Key Probability Concepts for Algorithms

1. Expected Value

The expected value of a random variable is its average over all possible outcomes:

E[X] = sum(x * P(X = x)) for all x

Example: Expected number of heads in 100 fair coin flips = 100 * 0.5 = 50

For randomized algorithms, we care about E[T(n)] — the expected runtime. Quickselect has E[T(n)] = O(n), even though worst case is O(n^2).

2. Linearity of Expectation

One of the most powerful tools: E[X + Y] = E[X] + E[Y], even when X and Y are NOT independent.

Example: Expected number of fixed points in a random permutation:

  • Let X_i = 1 if element i is at position i
  • E[X_i] = 1/n
  • E[total] = sum(E[X_i]) = n * 1/n = 1

Linearity of expectation makes many seemingly complex analyses trivial.

3. Markov's Inequality

For a non-negative random variable X and any a > 0:

P(X >= a) <= E[X] / a

Example: If E[T] = 100ms, then P(T >= 1000ms) <= 0.1

This is a very weak bound, but it only requires knowing the expected value.

4. Chebyshev's Inequality

For any random variable X with mean μ and variance σ²:

P(|X - μ| >= kσ) <= 1/k²

Example: If μ = 100 and σ = 10, then P(|X - 100| >= 30) <= 1/9 ≈ 0.11

This is stronger than Markov but requires knowing the variance.

5. Chernoff Bounds

Chernoff bounds give exponentially decreasing tail probabilities for sums of independent random variables:

P(X >= (1 + δ)μ) <= exp(-μδ²/3) for 0 < δ < 1

Example: If we flip 1000 fair coins, the probability of getting more than 600 heads is incredibly small:

P(X >= 600) <= exp(-500 * 0.2² / 3) = exp(-6.67) ≈ 0.0013

This is why randomized algorithms with Chernoff bounds are practically deterministic.

Applying These Tools

Analyzing Randomized Quicksort

Expected comparisons:

Let X = total number of comparisons in Quicksort. For each pair (i, j), let X_{ij} = 1 if i and j are compared, 0 otherwise.

E[X] = sum over all pairs of P(i and j are compared)

The probability that two elements are compared in Quicksort is 2/(j - i + 1).

E[X] = sum_{i=1}^{n} sum_{j>i} 2/(j - i + 1) = 2 * sum_{k=2}^{n} (n - k + 1)/k ≈ 2n * ln(n)

So Quicksort performs approximately 2n ln(n) comparisons in expectation — matching the known O(n log n) bound.

Analyzing Karger's Min-Cut

A single run of Karger's algorithm succeeds with probability at least:

P(success) >= 2 / (n * (n - 1))

After t independent runs:

P(failure) <= (1 - 2/(n*(n-1)))^t

Setting t = n*(n-1)/2 * ln(n):

P(failure) <= exp(-ln(n)) = 1/n

So with O(n² log n) runs, the failure probability is at most 1/n.

Comparison of Bounds

| Inequality | Strength | Requirements | Example Bound | |------------|----------|--------------|---------------| | Markov | Weak | E[X] only | P(X >= 1000) <= 0.1 | | Chebyshev | Moderate | E[X] and Var(X) | P(|X-μ| >= 3σ) <= 1/9 | | Chernoff | Very Strong | Independence, E[X] | P(X >= 1.1μ) <= exp(-μδ²/3) | | Union Bound | Simple | Any events | P(A ∪ B) <= P(A) + P(B) | | Hoeffding | Strong | Bounded range | P(|X-μ| >= t) <= 2exp(-2t²/n) |

Practical Guidelines

| Scenario | Use This Tool | |----------|--------------| | Expected runtime | Linearity of expectation | | Worst-case high probability | Chernoff bound | | Algorithm with independent trials | Union bound + Chernoff | | No independence | Markov (weak but always works) | | Known variance | Chebyshev (stronger than Markov) | | Bounded random variables | Hoeffding inequality | | Load balancing | Chernoff for occupancy problems | | Randomized rounding | Probabilistic method |

Common Mistakes

| Mistake | Why It's Wrong | |---------|---------------| | Assuming independence without justification | Many random variables in algorithms are NOT independent | | Using Chernoff on dependent variables | The bound no longer holds | | Forgetting to apply union bound | Multiple bad events can compound | | Misapplying linearity of expectation | Works for dependent variables too, but careful with variance | | Confusing "in expectation" with "with high probability" | They are different guarantees | | Reporting expected but not worst case | Both matter for algorithm analysis |

Summary

Probability analysis provides the mathematical foundation for understanding randomized algorithms. Linearity of expectation makes expected-runtime analysis tractable. Chernoff bounds prove that bad outcomes are exponentially unlikely. Mastering these tools lets you rigorously analyze and design randomized algorithms with confidence.

Key takeaways:

  • Linearity of expectation works even for dependent variables — incredibly useful
  • Markov's inequality requires only E[X] but gives weak bounds
  • Chernoff bounds give exponentially strong concentration for independent sums
  • Union bound lets us combine multiple probabilistic events
  • Expected runtime and "with high probability" are different guarantees
  • Practice: analyze Quickselect, Quicksort, Karger's min-cut with these tools

What's Next: Algorithmic Trading Basics

The next course introduces algorithmic trading: backtesting, moving averages, momentum strategies, and building trading bots.

Extended Example: Balls into Bins

The "balls into bins" problem is a classic probability scenario with many algorithm applications:

Problem: Throw m balls uniformly at random into n bins. What is the maximum load?

  • If m = n (same number of balls and bins): max load ~ log(n) / log(log(n)) with high probability
  • If m > n * log(n): max load ~ m/n (nearly even distribution)

Why this matters: This models hash tables with chaining, load balancing across servers, and randomized routing.

import random
import math

def simulate_balls_into_bins(m, n, trials=1000):
    """Simulate throwing m balls into n bins and return average max load."""
    total_max = 0

    for _ in range(trials):
        bins = [0] * n
        for _ in range(m):
            bin_idx = random.randint(0, n - 1)
            bins[bin_idx] += 1
        total_max += max(bins)

    return total_max / trials

# Test with m = n (balanced case)
m = n = 1000
avg_max = simulate_balls_into_bins(m, n)
expected_max = math.log(n) / math.log(math.log(n))
print(f"m=n={n}: avg max load = {avg_max:.2f}, theoretical ≈ {expected_max:.2f}")
# Output: m=n=1000: avg max load = 6.84, theoretical ≈ 6.37

# Test with m = n * log(n) (overloaded case)
m = int(n * math.log(n))
avg_max = simulate_balls_into_bins(m, n)
expected_max = m / n
print(f"m=n*ln(n)={m}: avg max load = {avg_max:.2f}, theoretical ≈ {expected_max:.2f}")
# Output: m=n*ln(n)=6907: avg max load = 7.89, theoretical ≈ 6.91

This demonstrates Chernoff bounds in action: with m ≈ n log n, the maximum load is tightly concentrated around the mean.

The Probabilistic Method

The probabilistic method is a powerful proof technique: instead of constructing a solution directly, prove that a random solution has positive probability of being correct.

Example: Ramsey numbers R(k, k) > 2^(k/2)

  1. Randomly color the edges of a complete graph on n vertices red or blue
  2. Expected number of monochromatic K_k subgraphs: C(n, k) * 2^(1 - C(k,2))
  3. If this expected value < 1, then there exists a coloring with no monochromatic K_k
  4. Solve for n to get the bound

This technique, pioneered by Paul Erdős, has produced many elegant existence proofs where constructive methods fail.

Steps of the Probabilistic Method

  1. Define a random process that produces a candidate solution
  2. Calculate the expected value of a relevant random variable
  3. If E[X] < 1, there exists an outcome where X = 0 (perfect solution)
  4. If E[X] = c, there exists an outcome where X ≤ c

This non-constructive approach proves existence without explicitly finding the solution.

Summary of Key Formulas

| Tool | Formula | When to Use | |------|---------|-------------| | Linearity of Expectation | E[sum X_i] = sum E[X_i] | Sum of indicator variables | | Markov's Inequality | P(X ≥ a) ≤ E[X]/a | Only know expectation | | Chebyshev's Inequality | P(|X-μ| ≥ kσ) ≤ 1/k² | Know expectation and variance | | Chernoff Bound (additive) | P(|X-μ| ≥ εμ) ≤ 2exp(-ε²μ/3) | Independent, know μ | | Chernoff Bound (multiplicative) | P(X ≥ (1+δ)μ) ≤ exp(-μδ²/3) | Independent, know μ | | Union Bound | P(∪ A_i) ≤ sum P(A_i) | Multiple bad events | | Jensen's Inequality | f(E[X]) ≤ E[f(X)] for convex f | Convex/concave functions |

The right tool depends on what you know about the random variables and what kind of guarantee you need.

Unlock Full Tutorial

This chapter is paid content. Join the project to unlock over 5000 words of deep analysis, including 10+ god-tier Prompts and real Source Code examples!