Graph Theory Basics & NetworkX

What is a Graph?

A graph consists of:

  • Nodes (Vertices): entities (cities, people, web pages)
  • Edges: relationships (roads, friendships, links)

Real-life Examples

| Scenario | Nodes | Edges | |----------|-------|-------| | Map Navigation | Intersections | Roads | | Social Network | People | Friendships | | Network Routing | Servers | Cables | | Recommendation | Users/Items | Purchases |

Graph Types

  • Directed Graph: edges have direction
  • Undirected Graph: edges are bidirectional
  • Weighted Graph: edges have weights (distance, time)
  • Connected vs Disconnected

Install NetworkX

pip install networkx matplotlib numpy

Your First Graph

import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_node("Taipei")
G.add_node("Taichung")
G.add_node("Kaohsiung")

G.add_edge("Taipei", "Taichung", weight=180)
G.add_edge("Taichung", "Kaohsiung", weight=200)

pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=1500)
labels = nx.get_edge_attributes(G, 'weight')
nx.draw_networkx_edge_labels(G, pos, edge_labels=labels)
plt.show()

๐Ÿ”ฅ Vibe Prompt

"Analyze a social network graph: create 50 nodes, compute degree/betweenness centrality, find the top 5 most important nodes, visualize with color intensity."

Common Issues & Solutions

| Problem | Cause | Solution | |---------|-------|----------| | Unexpected results | Wrong parameters | Check defaults and edge cases | | Slow execution | Inefficient algorithm | Use better data structures | | Out of memory | Too much data | Use batch processing | | Hard to debug | No logging | Add detailed logging |

Further Learning

  • Read official documentation
  • Browse open-source examples on GitHub
  • Join community discussions
  • Practice by modifying code and observing results

Performance Considerations

When working with large datasets or complex computations:

  1. Time Complexity: Analyze and optimize Big O
  2. Space Complexity: Balance memory vs speed
  3. Caching: Store computed results to avoid recalculation
  4. Parallelism: Use multi-threading for independent tasks
  5. Profiling: Measure before optimizing - use profilers

Real-World Application

This concept is widely used in:

  • Web development (routing, authentication)
  • Data science (feature engineering, model training)
  • Game development (pathfinding, physics)
  • Mobile apps (state management, caching)

Key Algorithms Summary

| Algorithm | Time Complexity | Space Complexity | Use Case | |-----------|----------------|-----------------|----------| | Dijkstra | O((V+E)logV) | O(V) | Shortest path | | A* | O(E) | O(V) | Pathfinding with heuristic | | BFS | O(V+E) | O(V) | Unweighted shortest path | | DFS | O(V+E) | O(V) | Topological sort, cycle detection |

Choosing the Right Algorithm

  1. Unweighted graph โ†’ BFS
  2. Weighted, non-negative โ†’ Dijkstra
  3. Weighted, any โ†’ Bellman-Ford
  4. Heuristic available โ†’ A*
  5. All pairs โ†’ Floyd-Warshall

Dijkstra's Algorithm

Dijkstra finds the shortest path from a single source node to all other nodes:

import heapq

def dijkstra(graph, start):
    """
    Find shortest distances from start to all nodes using Dijkstra.
    
    Args:
        graph: dict where graph[u] = [(v, weight), ...]
        start: starting node
    
    Returns:
        distances dict, predecessors dict
    """
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    pq = [(0, start)]
    predecessors = {}
    visited = set()

    while pq:
        current_dist, current = heapq.heappop(pq)

        if current in visited:
            continue

        visited.add(current)

        for neighbor, weight in graph[current]:
            distance = current_dist + weight

            if distance < distances[neighbor]:
                distances[neighbor] = distance
                predecessors[neighbor] = current
                heapq.heappush(pq, (distance, neighbor))

    return distances, predecessors

A* Search Algorithm

A* combines actual cost from start (g) with estimated cost to goal (h) for efficient pathfinding:

def astar(graph, start, goal, heuristic):
    """
    Find shortest path from start to goal using A*.
    
    Args:
        graph: adjacency list
        start: starting node
        goal: target node
        heuristic: function h(node, goal) -> estimated cost
    
    Returns:
        path as list of nodes, or None
    """
    open_set = [(0, start)]  # (f_score, node)
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    f_score = {node: float('inf') for node in graph}
    f_score[start] = heuristic(start, goal)
    came_from = {}
    closed_set = set()

    while open_set:
        _, current = heapq.heappop(open_set)

        if current == goal:
            # Reconstruct path
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            path.reverse()
            return path

        if current in closed_set:
            continue
        closed_set.add(current)

        for neighbor, weight in graph[current]:
            tentative_g = g_score[current] + weight

            if tentative_g < g_score[neighbor]:
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g
                f_score[neighbor] = tentative_g + heuristic(neighbor, goal)
                heapq.heappush(open_set, (f_score[neighbor], neighbor))

    return None  # No path found

# Example: Manhattan heuristic for grid
heuristic = lambda a, b: abs(a[0]-b[0]) + abs(a[1]-b[1])

Summary

Graph theory is the foundation of network analysis, pathfinding, and optimization. Understand node/edge/weight concepts, use adjacency lists for efficiency, choose between DFS/BFS based on your traversal needs, and apply Dijkstra or A* for pathfinding depending on whether you have a goal heuristic.

Key takeaways:

  • Graphs = nodes (vertices) + edges (connections) + optional weights
  • Adjacency list: O(V + E) memory โ€” best for sparse graphs
  • Adjacency matrix: O(Vยฒ) memory โ€” best for dense graphs
  • BFS: shortest path in unweighted graphs, level-order traversal
  • DFS: topological sort, cycle detection, maze solving
  • Dijkstra: shortest paths from one source to all nodes (weighted, non-negative)
  • A*: goal-directed pathfinding with heuristic (faster than Dijkstra)
  • NetworkX: Python library for graph creation, analysis, and visualization

What's Next: Algorithm โ€” Dynamic Programming

The next course covers dynamic programming โ€” breaking problems into overlapping subproblems and solving each once for efficient results.

Member Exclusive Free Tutorial

This chapter is free exclusive content for registered members! Please login or register to unlock immediately.

Login / Register Now