Processes and Threads — Operating System Fundamentals

Why Processes and Threads Matter

Every program you run executes as a process, and every process contains at least one thread. Understanding how the operating system manages these execution units is fundamental to writing efficient, stable, and scalable software. Whether you are debugging a slow server, optimizing a data pipeline, or designing a multi-user application, a working knowledge of processes and threads is essential.

Why this matters for your career:

  • Concurrency and parallelism are common topics in backend engineering interviews
  • Understanding processes vs. threads helps you design performant applications
  • Debugging issues like race conditions, deadlocks, and memory leaks requires this knowledge
  • Multi-threaded programming is expected for systems-level and backend roles

What Are Processes and Threads?

Process

A process is an independent program in execution. Each process has its own memory space, file descriptors, and execution context. Processes are isolated from each other — one process cannot directly access another process's memory.

Key characteristics:

  • Independent memory space (heap, stack, data segments)
  • Own process ID (PID)
  • Own file descriptors and system resources
  • Communication via Inter-Process Communication (IPC): pipes, sockets, shared memory
  • Heavyweight — creating a process requires significant OS overhead
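
A minimal sketch of the characteristics above, assuming a Python interpreter is available at `sys.executable`. The parent starts a child process, and the child reports its own PID back through a pipe (its standard output). The variable names are illustrative only.

```python
import os
import subprocess
import sys

# Code for the child to run. The child is a separate interpreter with its
# own memory, so it cannot see any variable defined in the parent.
child_code = "import os; print(os.getpid())"

# capture_output=True connects the child's stdout to a pipe, a form of IPC.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

parent_pid = os.getpid()
child_pid = int(result.stdout.strip())

print(f"Parent PID: {parent_pid}")
print(f"Child PID:  {child_pid}")
```

The two PIDs differ, and the only way the parent learns the child's PID here is by reading what the child wrote to the pipe.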

Thread

A thread is the smallest unit of execution within a process. Threads share the memory space and resources of the process that owns them. Multiple threads can run concurrently within a single process.

Key characteristics:

  • Shared memory space with other threads in the same process
  • Own stack (local variables) but shared heap
  • Lightweight — creating a thread is much faster than creating a process
  • Communication via shared memory (no IPC needed)
  • Requires synchronization mechanisms (mutexes, semaphores) to avoid conflicts
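
A small sketch of these characteristics, with hypothetical names (`shared_results`, `record`). Every thread appends to the same list and reports the same PID, while each thread's local variable stays private to its own call.

```python
import os
import threading

shared_results = []              # one list, visible to every thread
results_lock = threading.Lock()  # guards access to the shared list

def record(name):
    local_value = name * 2       # local name, private to this thread's call
    with results_lock:
        shared_results.append((name, local_value, os.getpid()))

threads = [threading.Thread(target=record, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for _, _, pid in shared_results}
print(f"Entries written by threads: {len(shared_results)}")
print(f"Distinct PIDs seen: {len(pids)}")
```

All four entries end up in one list without any pipe or socket, and only one PID appears because the threads belong to the same process.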

Comparison Table

| Feature | Process | Thread |
|---------|---------|--------|
| Memory space | Independent | Shared with other threads in the process |
| Creation overhead | High (OS must allocate memory, copy context) | Low (shares existing memory) |
| Isolation | Isolated from other processes | Not isolated: one thread can crash the process |
| Communication | IPC (pipes, sockets, shared memory) | Direct shared memory access |
| Context switch cost | Higher (the address space changes) | Lower (the address space stays the same) |
| Resource ownership | Owns its resources | Shares process resources |
| Failure impact | Only that process fails | Can crash entire process |
| Programming complexity | Lower (no shared state issues) | Higher (race conditions, deadlocks) |

Concurrency vs. Parallelism

Concurrency

Concurrency is about dealing with many things at once. Multiple tasks make progress in overlapping time periods, but only one task executes at any instant on a single-core CPU.

Parallelism

Parallelism is about doing many things at once. Multiple tasks execute simultaneously on multiple CPU cores.

| Concept | Analogy | Hardware Requirement |
|---------|---------|---------------------|
| Concurrency | One person juggling multiple balls | A single core is enough (time-slicing) |
| Parallelism | Multiple people each juggling their own ball | Multiple cores |

You can have concurrency without parallelism (a single-core CPU with multitasking). Executing threads truly simultaneously requires multiple cores.
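
As a rough illustration of concurrency, the sketch below runs four simulated I/O waits first one after another and then in four threads. `time.sleep` stands in for a blocking I/O call; the delay and task count are arbitrary values chosen for the example, and the exact timings will vary by machine.

```python
import threading
import time

DELAY = 0.2   # seconds each simulated I/O call blocks
TASKS = 4

def wait_for_io(delay):
    time.sleep(delay)  # stands in for a blocking I/O call

# Sequential: each wait starts only after the previous one ends.
start = time.perf_counter()
for _ in range(TASKS):
    wait_for_io(DELAY)
sequential_seconds = time.perf_counter() - start

# Concurrent: the waits overlap, even on a single core,
# because a sleeping thread does not need the CPU.
start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io, args=(DELAY,)) for _ in range(TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
concurrent_seconds = time.perf_counter() - start

print(f"Sequential: {sequential_seconds:.2f}s")
print(f"Concurrent: {concurrent_seconds:.2f}s")
```

The concurrent run finishes sooner because the waits overlap in time. That is concurrency; whether any two threads ever execute at the same instant depends on the number of cores.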

Practical Examples

Python: Creating Processes

import multiprocessing
import os

def worker(name):
    print(f"Worker {name} (PID: {os.getpid()})")

if __name__ == "__main__":
    processes = []
    for i in range(4):
        p = multiprocessing.Process(target=worker, args=(i,))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    print("All processes completed")

Python: Creating Threads

import threading
import time

def worker(name, delay):
    print(f"Thread {name} starting")
    time.sleep(delay)
    print(f"Thread {name} finished after {delay}s")

threads = []
for i in range(4):
    t = threading.Thread(target=worker, args=(i, 1))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print("All threads completed")

Thread Synchronization with a Lock

import threading

counter = 0
lock = threading.Lock()

def increment(amount):
    global counter
    for _ in range(amount):
        with lock:  # prevents race condition
            counter += 1

threads = []
for _ in range(10):
    t = threading.Thread(target=increment, args=(100000,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Final counter: {counter} (expected: 1,000,000)")

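Without the lock, the final counter may end up below 1,000,000: `counter += 1` is a read followed by a write, and two threads can interleave those steps so that one update overwrites another. How often this happens depends on the Python version and thread scheduling.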
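
To make the lost update concrete, the sketch below replays one unlucky interleaving by hand in a single thread. The names `read_by_a` and `read_by_b` are illustrative stand-ins for the values two threads would hold after reading the counter.

```python
counter = 0

# Both "threads" read the current value before either one writes.
read_by_a = counter      # thread A reads 0
read_by_b = counter      # thread B reads 0

# Each writes back the value it read, plus one.
counter = read_by_a + 1  # thread A writes 1
counter = read_by_b + 1  # thread B writes 1, overwriting A's update

print(f"Counter after two increments: {counter}")  # prints 1, not 2
```

Holding a lock around the read and the write forces thread B to read only after thread A has written, so the second increment starts from 1.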

When to Use Processes vs. Threads

| Scenario | Best Choice | Reason |
|----------|-------------|--------|
| CPU-intensive computation (video encoding) | Processes | Avoids the GIL in Python, better CPU utilization |
| I/O-bound tasks (web scraping, file I/O) | Threads | Lightweight, good for waiting on I/O |
| Need memory isolation and safety | Processes | One process crash doesn't affect others |
| High-frequency shared data access | Threads | Shared memory avoids IPC overhead |
| Multiple users on a server | Processes (or both) | Isolation between users is important |
| Microservices architecture | Processes | Each service runs as an independent process |

Common Pitfalls

| Pitfall | Description | Solution |
|---------|-------------|----------|
| Race condition | Multiple threads access shared data without synchronization | Use locks, mutexes, or atomic operations |
| Deadlock | Two threads each wait for a lock held by the other | Always acquire locks in the same order |
| Thread starvation | A high-priority thread prevents lower-priority ones from running | Use fair scheduling or priority aging |
| Memory leak in process | Process allocates memory but never frees it | Monitor memory, restart processes periodically |
| Fork bomb | Uncontrolled process creation exhausts system resources | Set ulimit on max processes |
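
The "same order" rule for avoiding deadlock can be sketched as follows. The `Account` class and `transfer` function are hypothetical, made up for this example; the point is that both threads lock the account with the smaller id first, whichever direction the money moves.

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(source, target, amount):
    # Lock the account with the smaller id first. Two opposite transfers
    # then request the locks in the same order, so neither can hold one
    # lock while waiting forever for the other.
    first, second = sorted((source, target), key=lambda acct: acct.account_id)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount

def run_transfers(source, target, count):
    for _ in range(count):
        transfer(source, target, 1)

a = Account(1, 1000)
b = Account(2, 1000)

# Two threads transfer in opposite directions, the classic deadlock setup.
t1 = threading.Thread(target=run_transfers, args=(a, b, 1000))
t2 = threading.Thread(target=run_transfers, args=(b, a, 1000))
t1.start()
t2.start()
t1.join()
t2.join()

print(f"Balances: {a.balance}, {b.balance}")
```

If each thread instead locked its own `source` first, thread 1 could hold `a.lock` while thread 2 holds `b.lock`, and both would wait indefinitely.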

Summary

Processes and threads are the fundamental execution units in any operating system. Processes are isolated, heavyweight, and secure. Threads are lightweight, shared-memory, and efficient but require careful synchronization. Choose processes for isolation and CPU-bound work; choose threads for I/O-bound work and shared-state applications.

Key takeaways:

  • Processes have independent memory; threads share memory within a process
  • Creating processes is slower than creating threads
  • Concurrency ≠ parallelism (concurrency is about structure, parallelism is about execution)
  • Thread synchronization (locks, mutexes) prevents race conditions
  • Python's GIL limits threads for CPU-bound work — use multiprocessing instead
  • Use processes for isolation and security; use threads for efficiency and shared state

What's Next: Memory Management

The next chapter covers memory management — stack vs. heap, virtual memory, paging, and how modern operating systems manage RAM efficiently.

Real-World Scenarios

| Scenario | Process or Thread? | Why |
|----------|-------------------|-----|
| Web server handling 1000 concurrent requests | Thread pool (or async I/O) | Each request is lightweight I/O work |
| Running multiple unrelated applications (Chrome + VS Code) | Processes | Need isolation and security |
| Video transcoding service | Processes | CPU-bound, benefits from multi-core parallelism |
| Database server handling queries | Threads (or processes) | Both approaches used (PostgreSQL uses processes, MySQL uses threads) |
| Browser tabs | Processes (modern browsers) | One crash doesn't lose all tabs |
| Real-time data processing pipeline | Processes | Each stage runs independently, communicates via queues |

Understanding when to use each is a key skill that separates junior from senior developers.
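
For the thread pool row in the table, here is a minimal sketch using the standard library's `concurrent.futures.ThreadPoolExecutor`. The `handle_request` function and its simulated delay are assumptions for illustration, not a real web server.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def handle_request(request_id):
    time.sleep(0.05)  # simulated I/O wait (database query, network call)
    return f"response-{request_id}"

# A fixed pool of 8 worker threads serves 20 requests. Threads are reused,
# so the cost of creating them is paid once rather than per request.
with ThreadPoolExecutor(max_workers=8) as pool:
    # map() returns results in the order of the inputs.
    responses = list(pool.map(handle_request, range(20)))

print(responses[:3])
```

A pool caps the number of threads regardless of how many requests arrive, which keeps memory use and scheduling overhead bounded.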

Process States

An OS process goes through several states during its lifetime:

NEW → READY → RUNNING → TERMINATED
RUNNING → READY      (time slice expired, context switch)
RUNNING → WAITING    (blocked on I/O or a resource)
WAITING → READY      (I/O completed or resource became available)

  • NEW: Process is being created
  • READY: Ready to run, waiting for CPU time
  • RUNNING: Currently executing on CPU
  • WAITING: Blocked on I/O or waiting for a resource
  • TERMINATED: Finished execution
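
The states can be modeled as a small transition table. This is a sketch of the classic five-state textbook model, in which a waiting process returns to READY before it can run again; the names `State`, `TRANSITIONS`, and `can_transition` are invented for this example and do not correspond to any OS API.

```python
from enum import Enum

class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"

# Allowed moves in the classic five-state model.
TRANSITIONS = {
    State.NEW: {State.READY},
    State.READY: {State.RUNNING},
    State.RUNNING: {State.READY, State.WAITING, State.TERMINATED},
    State.WAITING: {State.READY},
    State.TERMINATED: set(),
}

def can_transition(current, target):
    """Return True if the model allows moving from current to target."""
    return target in TRANSITIONS[current]

print(can_transition(State.RUNNING, State.READY))    # True: time slice expired
print(can_transition(State.WAITING, State.RUNNING))  # False: must pass through READY
```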

Context switching between processes is expensive because the OS must save and restore the CPU state (registers, program counter, stack pointer) and switch to a different address space, which can invalidate cached address translations in the TLB. Switching between threads of the same process is cheaper because the address space stays the same.
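
On Unix-like systems, a program can observe how many context switches it has gone through using the standard library's `resource` module (not available on Windows). The sketch below is a rough illustration: the helper name `context_switches` is made up, and the counts depend on the platform and on system load.

```python
import resource
import time

def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw

voluntary_before, involuntary_before = context_switches()

# Each sleep gives up the CPU, which normally counts as a voluntary switch.
for _ in range(50):
    time.sleep(0.001)

voluntary_after, involuntary_after = context_switches()

print(f"Voluntary switches during loop:   {voluntary_after - voluntary_before}")
print(f"Involuntary switches during loop: {involuntary_after - involuntary_before}")
```

Voluntary switches happen when the program blocks or sleeps; involuntary ones happen when the scheduler preempts it, for example when its time slice expires.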
