title: "Kruskal & Union-Find" description: "Implement Kruskal minimum spanning tree algorithm, and build Union-Find (Disjoint Set) from scratch." order: 1

Kruskal & Union-Find

Union-Find Implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py: return False
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

Kruskal MST

def kruskal(n, edges):
    edges.sort(key=lambda x: x[2])
    uf = UnionFind(n)
    mst = []
    total = 0
    for u, v, w in edges:
        if uf.union(u, v):
            mst.append((u, v, w))
            total += w
    return mst, total

Time Complexity

With path compression + union by rank, Union-Find is nearly constant:

| Operation | Unoptimized | +Path Compression | +Rank | |-----------|:-----------:|:----------------:|:----:| | find() | O(n) | O(log n) | O(alpha(n)) | | union() | O(n) | O(log n) | O(alpha(n)) | | Kruskal total | O(m log m) | O(m log n) | O(m log n) |

alpha(n) inverse Ackermann function, <= 4 for all practical n

Kruskal vs Prim

| Aspect | Kruskal | Prim | |--------|:-------:|:----:| | Type | Edge-based | Node-based | | Best for | Sparse (m~n) | Dense (m~n^2) | | Data structure | Union-Find | Priority Queue | | Time | O(m log m) | O(m log n) | | Needs connectivity | No | Yes |

Real Applications of Union-Find

| Application | Problem | Union-Find Role | |-------------|---------|-----------------| | Social networks | A-B-C friends -> same group | Maintain connected components | | Image processing | Find connected same-color pixels | Merge adjacent pixels | | Maze generation | Open walls until start connects end | Detect when connected | | Circuit testing | Do two pins share a net? | Merge connected nodes | | Database merge | Two records same person? | Merge matching IDs |

Next Chapter: Prim Algorithm

Kruskal is edge-centered. Prim is node-centered - starting from a seed, adding the nearest node each step.

Union-Find (Disjoint Set Union)

Union-Find is a data structure that tracks elements partitioned into disjoint sets. It supports two operations:

Find: Determine which set an element belongs to
Union: Merge two sets into one

Implementation

class UnionFind:
    """Union-Find with path compression and union by rank."""
    
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    
    def find(self, x):
        """Find the root of x with path compression."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        """Union the sets containing x and y."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False  # Already connected
        # Union by rank: attach shorter tree under taller
        if self.rank[root_x] < self.rank[root_y]:
            self.parent[root_x] = root_y
        elif self.rank[root_x] > self.rank[root_y]:
            self.parent[root_y] = root_x
        else:
            self.parent[root_y] = root_x
            self.rank[root_x] += 1
        return True

Kruskal's Algorithm

def kruskal_mst(n, edges):
    """
    Kruskal's MST algorithm.
    
    Args:
        n: number of vertices
        edges: list of (u, v, weight)
    
    Returns:
        (total_weight, mst_edges)
    """
    uf = UnionFind(n)
    edges.sort(key=lambda e: e[2])  # Sort by weight
    
    total_weight = 0
    mst_edges = []
    
    for u, v, w in edges:
        if uf.union(u, v):
            total_weight += w
            mst_edges.append((u, v, w))
            if len(mst_edges) == n - 1:
                break  # MST complete
    
    return total_weight, mst_edges

# Example
edges = [
    (0, 1, 4), (0, 2, 3),
    (1, 2, 1), (1, 3, 2),
    (2, 3, 5),
]
n = 4

weight, mst = kruskal_mst(n, edges)
print(f"MST Weight: {weight}")
print(f"MST Edges: {mst}")

Time Complexity

| Operation | Complexity | |-----------|-----------| | Sort edges | O(E log E) | | Union operations | O(α(V)) each (inverse Ackermann) | | Find operations | O(α(V)) each | | Total | O(E log E + E α(V)) ≈ O(E log E) |

Applications of Union-Find

| Domain | Use Case | |--------|---------| | Network connectivity | Do two computers belong to the same network? | | Image processing | Find connected pixels (segmentation) | | Maze generation | Connect cells randomly until start reaches end | | Social networks | Find friend groups (connected components) | | Database merge | Determine if two records are related |

Comparison: Kruskal vs. Prim

| Aspect | Kruskal | Prim | |--------|--------|------| | Approach | Edge-based: sort all edges, add smallest | Vertex-based: grow from a seed | | Data Structure | Union-Find | Priority queue (heap) | | Best for | Sparse graphs (few edges) | Dense graphs (many edges) | | Time | O(E log E) | O(E log V) with heap | | Implementation | Medium (Union-Find) | Easy (heap + visited set) |

When Greedy Works

Both Kruskal and Prim are greedy algorithms. At each step, they make the locally optimal choice:

Kruskal: pick the cheapest edge that doesn't create a cycle
Prim: pick the cheapest edge connecting visited to unvisited

The reason greedy works for MST is the cut property: the lightest edge crossing any cut belongs to some MST.

Summary

Kruskal's algorithm finds the Minimum Spanning Tree by sorting edges by weight and adding the cheapest edge that doesn't create a cycle. Union-Find efficiently detects cycles. The greedy approach works because of the cut property of MSTs.

Key takeaways:

Kruskal: sort edges, pick cheapest non-cycling edge
Union-Find: find (path compression) + union (by rank)
Time: O(E log E) dominated by sorting
Kruskal is best for sparse graphs
MST cut property: lightest edge crossing any cut is part of some MST
Greedy works for MST because of optimal substructure
Applications: network design, clustering, circuit design

What's Next: Huffman Coding

The next chapter covers Huffman coding — using character frequency to build optimal variable-length prefix codes for data compression.

title: "Kruskal & Union-Find" description: "Implement Kruskal minimum spanning tree algorithm, and build Union-Find (Disjoint Set) from scratch." order: 1

Kruskal & Union-Find

Union-Find Implementation

Kruskal MST

Time Complexity

Kruskal vs Prim

Real Applications of Union-Find

Next Chapter: Prim Algorithm

Union-Find (Disjoint Set Union)

Implementation

Kruskal's Algorithm

Time Complexity

Applications of Union-Find

Comparison: Kruskal vs. Prim

When Greedy Works

Summary

What's Next: Huffman Coding

Member Exclusive Free Tutorial