Graph Theory Basics & NetworkX

What is a Graph?

A graph consists of:

Nodes (Vertices): entities (cities, people, web pages)
Edges: relationships (roads, friendships, links)

Real-life Examples

| Scenario | Nodes | Edges | |----------|-------|-------| | Map Navigation | Intersections | Roads | | Social Network | People | Friendships | | Network Routing | Servers | Cables | | Recommendation | Users/Items | Purchases |

Graph Types

Directed Graph: edges have direction
Undirected Graph: edges are bidirectional
Weighted Graph: edges have weights (distance, time)
Connected vs Disconnected

Install NetworkX

pip install networkx matplotlib numpy

Your First Graph

import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_node("Taipei")
G.add_node("Taichung")
G.add_node("Kaohsiung")

G.add_edge("Taipei", "Taichung", weight=180)
G.add_edge("Taichung", "Kaohsiung", weight=200)

pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=1500)
labels = nx.get_edge_attributes(G, 'weight')
nx.draw_networkx_edge_labels(G, pos, edge_labels=labels)
plt.show()

🔥 Vibe Prompt

"Analyze a social network graph: create 50 nodes, compute degree/betweenness centrality, find the top 5 most important nodes, visualize with color intensity."

Common Issues & Solutions

| Problem | Cause | Solution | |---------|-------|----------| | Unexpected results | Wrong parameters | Check defaults and edge cases | | Slow execution | Inefficient algorithm | Use better data structures | | Out of memory | Too much data | Use batch processing | | Hard to debug | No logging | Add detailed logging |

Further Learning

Read official documentation
Browse open-source examples on GitHub
Join community discussions
Practice by modifying code and observing results

Performance Considerations

When working with large datasets or complex computations:

Time Complexity: Analyze and optimize Big O
Space Complexity: Balance memory vs speed
Caching: Store computed results to avoid recalculation
Parallelism: Use multi-threading for independent tasks
Profiling: Measure before optimizing - use profilers

Real-World Application

This concept is widely used in:

Web development (routing, authentication)
Data science (feature engineering, model training)
Game development (pathfinding, physics)
Mobile apps (state management, caching)

Key Algorithms Summary

| Algorithm | Time Complexity | Space Complexity | Use Case | |-----------|----------------|-----------------|----------| | Dijkstra | O((V+E)logV) | O(V) | Shortest path | | A* | O(E) | O(V) | Pathfinding with heuristic | | BFS | O(V+E) | O(V) | Unweighted shortest path | | DFS | O(V+E) | O(V) | Topological sort, cycle detection |

Choosing the Right Algorithm

Unweighted graph → BFS
Weighted, non-negative → Dijkstra
Weighted, any → Bellman-Ford
Heuristic available → A*
All pairs → Floyd-Warshall

Dijkstra's Algorithm

Dijkstra finds the shortest path from a single source node to all other nodes:

import heapq

def dijkstra(graph, start):
    """
    Find shortest distances from start to all nodes using Dijkstra.
    
    Args:
        graph: dict where graph[u] = [(v, weight), ...]
        start: starting node
    
    Returns:
        distances dict, predecessors dict
    """
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    pq = [(0, start)]
    predecessors = {}
    visited = set()

    while pq:
        current_dist, current = heapq.heappop(pq)

        if current in visited:
            continue

        visited.add(current)

        for neighbor, weight in graph[current]:
            distance = current_dist + weight

            if distance < distances[neighbor]:
                distances[neighbor] = distance
                predecessors[neighbor] = current
                heapq.heappush(pq, (distance, neighbor))

    return distances, predecessors

A* Search Algorithm

A* combines actual cost from start (g) with estimated cost to goal (h) for efficient pathfinding:

def astar(graph, start, goal, heuristic):
    """
    Find shortest path from start to goal using A*.
    
    Args:
        graph: adjacency list
        start: starting node
        goal: target node
        heuristic: function h(node, goal) -> estimated cost
    
    Returns:
        path as list of nodes, or None
    """
    open_set = [(0, start)]  # (f_score, node)
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    f_score = {node: float('inf') for node in graph}
    f_score[start] = heuristic(start, goal)
    came_from = {}
    closed_set = set()

    while open_set:
        _, current = heapq.heappop(open_set)

        if current == goal:
            # Reconstruct path
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            path.reverse()
            return path

        if current in closed_set:
            continue
        closed_set.add(current)

        for neighbor, weight in graph[current]:
            tentative_g = g_score[current] + weight

            if tentative_g < g_score[neighbor]:
                came_from[neighbor] = current
                g_score[neighbor] = tentative_g
                f_score[neighbor] = tentative_g + heuristic(neighbor, goal)
                heapq.heappush(open_set, (f_score[neighbor], neighbor))

    return None  # No path found

# Example: Manhattan heuristic for grid
heuristic = lambda a, b: abs(a[0]-b[0]) + abs(a[1]-b[1])

Summary

Graph theory is the foundation of network analysis, pathfinding, and optimization. Understand node/edge/weight concepts, use adjacency lists for efficiency, choose between DFS/BFS based on your traversal needs, and apply Dijkstra or A* for pathfinding depending on whether you have a goal heuristic.

Key takeaways:

Graphs = nodes (vertices) + edges (connections) + optional weights
Adjacency list: O(V + E) memory — best for sparse graphs
Adjacency matrix: O(V²) memory — best for dense graphs
BFS: shortest path in unweighted graphs, level-order traversal
DFS: topological sort, cycle detection, maze solving
Dijkstra: shortest paths from one source to all nodes (weighted, non-negative)
A*: goal-directed pathfinding with heuristic (faster than Dijkstra)
NetworkX: Python library for graph creation, analysis, and visualization

What's Next: Algorithm — Dynamic Programming

The next course covers dynamic programming — breaking problems into overlapping subproblems and solving each once for efficient results.