Quickselect — Find the K-th Smallest Element in Expected O(n) Time

Why Quickselect Matters

Finding the k-th smallest element in an unsorted array is a fundamental problem with many applications: finding medians, computing percentiles, filtering outliers, and database query optimization. Quickselect solves this efficiently without sorting the entire array.

Why this matters for your career:

Quickselect is a classic interview topic (Google, Amazon, Meta)
Understanding randomized algorithms is essential for system design
The median-finding application is used in load balancing and data analysis
Quickselect introduces the important concept of expected vs. worst-case runtime

What Is Quickselect?

Quickselect is a randomized selection algorithm developed by Tony Hoare (the same Hoare who created Quicksort). It finds the k-th smallest element in an unsorted array in expected O(n) time with O(n^2) worst-case time.

How Quickselect Works

Pick a random pivot element from the array
Partition the array so all elements smaller than the pivot come before it, and all larger elements come after
If the pivot is at position k-1 (0-indexed), return it
If k-1 is less than the pivot position, recurse on the left partition
If k-1 is greater, recurse on the right partition

Unlike Quicksort which recurses on both sides, Quickselect only recurses on ONE side. This is what gives it linear expected time.

Implementation

import random

def quickselect(arr, k):
    """
    Find the k-th smallest element in arr (1-indexed, so k=1 = smallest).
    Expected time: O(n).
    """
    if not arr:
        return None
    if k < 1 or k > len(arr):
        raise ValueError("k out of bounds")

    return _quickselect(arr, k - 1)  # convert to 0-indexed


def _quickselect(arr, k_idx):
    """Internal recursive quickselect, 0-indexed k."""
    if len(arr) == 1:
        return arr[0]

    # Randomly choose pivot
    pivot = random.choice(arr)

    # Partition
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]

    pivot_start = len(left)
    pivot_end = len(left) + len(middle)

    if k_idx < pivot_start:
        return _quickselect(left, k_idx)
    elif k_idx < pivot_end:
        return pivot
    else:
        return _quickselect(right, k_idx - pivot_end)

Example Usage

arr = [7, 10, 4, 3, 20, 15]

# Find 3rd smallest element
result = quickselect(arr, 3)
print(f"3rd smallest: {result}")  # Output: 7

# Find the median
k = (len(arr) + 1) // 2
median = quickselect(arr, k)
print(f"Median: {median}")  # Output: 10

# Sorted array for verification: [3, 4, 7, 10, 15, 20]
print(f"Sorted: {sorted(arr)}")

Visualization

For arr = [7, 10, 4, 3, 20, 15] and k = 3:

Choose random pivot, say 10
Partition: left=[7, 4, 3], middle=[10], right=[20, 15]
pivot_start=3, pivot_end=4, k_idx=2 (0-indexed)
k_idx < pivot_start → recurse on left=[7, 4, 3]
Choose pivot 4: left=[3], middle=[4], right=[7]
pivot_start=1, pivot_end=2, k_idx=2 (still looking for index 2)
k_idx >= pivot_end → recurse on right=[7]
len(arr) == 1 → return 7

Time Complexity Analysis

Expected Case: O(n)

At each recursion, the algorithm works on a smaller array. On average, the pivot splits the array into two parts of roughly equal size. The recurrence is:

T(n) = T(n/2) + O(n)

By the Master Theorem, this gives O(n) expected time.

Worst Case: O(n^2)

If we are extremely unlucky and always pick the smallest or largest element as pivot, the recurrence becomes:

T(n) = T(n-1) + O(n)

This gives O(n^2). However, the probability of this happening with random pivot selection is astronomically low.

| Aspect | Expected | Worst Case | |--------|----------|------------| | Time | O(n) | O(n^2) | | Space | O(1) (iterative) | O(n) (recursive) | | Probability of worst case | 1/n^n (essentially zero) | — | | Pivot strategy | Random | Always min or max |

Comparison: Quickselect vs. Sorting

| Approach | Time | Space | Use Case | |----------|------|-------|----------| | Sort then index | O(n log n) | O(n) | Need sorted array anyway | | Quickselect | O(n) expected | O(1) | Only need one element | | Min-heap | O(n log k) | O(k) | Need k smallest elements | | Max-heap | O(n log (n-k)) | O(n-k) | Need k largest elements |

Quickselect is the best choice when you need exactly one element by rank.

Applications

| Application | How Quickselect Helps | |-------------|---------------------| | Median Finding | k = n/2 — used in statistics, load balancing | | Percentile Calculation | k = p*n/100 — used in performance analysis | | Outlier Detection | Find top 1% or bottom 1% without sorting | | Database Query Optimization | Find median for query cost estimation | | Image Processing | Median filter for noise reduction | | Streaming Data | Combined with reservoir sampling |

Iterative Version (Space-Efficient)

def quickselect_iterative(arr, k):
    """Iterative version uses O(1) extra space."""
    if k < 1 or k > len(arr):
        raise ValueError("k out of bounds")

    k_idx = k - 1
    left = 0
    right = len(arr) - 1

    while left <= right:
        # Choose random pivot
        pivot_idx = random.randint(left, right)
        pivot = arr[pivot_idx]

        # Three-way partition
        less = [x for x in arr[left:right+1] if x < pivot]
        equal = [x for x in arr[left:right+1] if x == pivot]
        greater = [x for x in arr[left:right+1] if x > pivot]

        # Reconstruct array section
        arr[left:right+1] = less + equal + greater

        pivot_start = left + len(less)
        pivot_end = left + len(less) + len(equal)

        if k_idx < pivot_start:
            right = pivot_start - 1
        elif k_idx < pivot_end:
            return pivot
        else:
            left = pivot_end
            k_idx -= pivot_end

    return None

The iterative version avoids recursion overhead and stack overflow for large arrays.

Summary

Quickselect is an elegant randomized algorithm for finding the k-th smallest element in expected linear time. It partitions like Quicksort but only recurses on one side. The randomized pivot choice ensures the worst case is astronomically unlikely. Use Quickselect when you need one element by rank — it's much faster than sorting the entire array.

Key takeaways:

Quickselect finds the k-th smallest element in expected O(n) time
Random pivot selection makes worst-case O(n^2) extremely unlikely
Only recurses on one side (unlike Quicksort which does both)
Median finding is the most common application (k = n/2)
Iterative version uses O(1) extra space
Much faster than sorting when you only need one element

What's Next: Randomized Min-Cut

The next chapter presents Karger's randomized algorithm for finding the minimum cut in a graph — a beautiful application of random sampling to graph theory.