Learn Computer: Concurrency & Parallelism Explained — Threads, Mutexes & Deadlock Prevention

Q: **Can you use asyncio with threads?**

Yes. Use loop.run_in_executor() to offload blocking code to a thread pool while keeping the main event loop responsive.

Concurrency & Parallelism Explained — Threads, Mutexes & Deadlock Prevention

DodaTech Updated Jun 20, 2026 10 min read

Concurrency is about dealing with many things at once; parallelism is about doing many things at once. Understanding this difference is critical for writing fast, correct software.

What You’ll Learn

In this tutorial, you’ll learn the difference between concurrency and parallelism, how threads and processes work, synchronization primitives like mutexes and semaphores, deadlock prevention strategies, and how modern languages handle concurrency.

Why It Matters

Modern CPUs have multiple cores. Programs that don’t use concurrency leave performance on the table. But concurrent code is hard to get right — race conditions and deadlocks can crash your application or corrupt data.

Real-World Use

Durga Antivirus Pro uses concurrent scan workers to check multiple files simultaneously. A database server handles thousands of concurrent queries using connection pooling and thread-safe transactions. Your web browser loads images, runs JavaScript, and handles user input concurrently.

    graph TD
  subgraph "Concurrency (logical)"
    A1[Task A starts] --> B1[Task B starts]
    B1 --> A2[Task A resumes]
    A2 --> B2[Task B finishes]
  end
  subgraph "Parallelism (physical)"
    C1[Task A] --> D1[Task A done]
    C2[Task B] --> D2[Task B done]
  end

Concurrency vs Parallelism

Concurrency is about structure — multiple tasks making progress in overlapping time periods. On a single-core CPU, threads share time slices.

Parallelism is about execution — multiple tasks actually running simultaneously on different cores.

import threading
import time

def task(name, duration):
    print(f"[{name}] Starting, will run {duration}s")
    time.sleep(duration)
    print(f"[{name}] Done")

# Sequential execution
print("=== Sequential (no concurrency) ===")
start = time.time()
task("A", 2)
task("B", 2)
print(f"Total: {time.time() - start:.1f}s\n")

# Concurrent execution (single-core illusion)
print("=== Concurrent (threading) ===")
start = time.time()
t1 = threading.Thread(target=task, args=("A", 2))
t2 = threading.Thread(target=task, args=("B", 2))
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Total: {time.time() - start:.1f}s")

Expected output:

=== Sequential (no concurrency) ===
[A] Starting, will run 2s
[A] Done
[B] Starting, will run 2s
[B] Done
Total: 4.0s

=== Concurrent (threading) ===
[A] Starting, will run 2s
[B] Starting, will run 2s
[A] Done
[B] Done
Total: 2.0s

Threads vs Processes

Feature	Thread	Process
Memory	Shared within same process	Isolated
Creation	Fast	Slow
Communication	Direct (shared memory)	IPC (pipes, sockets)
Crash impact	Can crash whole process	Independent
Use case	Fine-grained concurrency	Isolation, separate programs

Mutexes and Race Conditions

A race condition happens when two threads access shared data without synchronisation. A mutex (mutual exclusion) ensures only one thread accesses the critical section at a time.

import threading

counter = 0
lock = threading.Lock()

def increment_without_lock():
    global counter
    for _ in range(100000):
        counter += 1  # Not atomic!

def increment_with_lock():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1

# Without lock — race condition
counter = 0
t1 = threading.Thread(target=increment_without_lock)
t2 = threading.Thread(target=increment_without_lock)
t1.start(); t2.start()
t1.join(); t2.join()
print(f"Without lock: {counter} (expected 200000)")

# With lock — correct
counter = 0
t1 = threading.Thread(target=increment_with_lock)
t2 = threading.Thread(target=increment_with_lock)
t1.start(); t2.start()
t1.join(); t2.join()
print(f"With lock: {counter} (expected 200000)")

Expected output:

Without lock: 184721 (expected 200000)
With lock: 200000 (expected 200000)

The exact number without lock varies each run — that’s the race condition.

Semaphores

A semaphore controls access to a limited resource pool. A counting semaphore allows N threads to access a resource; a binary semaphore works like a mutex.

import threading
import time

pool = threading.Semaphore(3)  # Allow 3 concurrent accesses

def access_database(thread_id):
    with pool:
        print(f"[Thread {thread_id}] Accessing database...")
        time.sleep(1)
        print(f"[Thread {thread_id}] Done")

threads = []
for i in range(10):
    t = threading.Thread(target=access_database, args=(i,))
    threads.append(t)

print("Starting 10 threads with pool limit of 3")
for t in threads:
    t.start()
for t in threads:
    t.join()

Expected output:

Starting 10 threads with pool limit of 3
[Thread 0] Accessing database...
[Thread 1] Accessing database...
[Thread 2] Accessing database...
[Thread 0] Done
[Thread 3] Accessing database...
...

At most 3 threads access the resource simultaneously.

Deadlock Prevention

A deadlock occurs when two threads each hold a lock and wait for the other’s lock. Four conditions must hold: mutual exclusion, hold-and-wait, no preemption, circular wait.

Banker’s Algorithm

The banker’s algorithm prevents deadlock by checking if granting a request leaves the system in a safe state.

def is_safe_state(available, allocated, max_need):
    n_processes = len(allocated)
    n_resources = len(available)

    need = [[max_need[i][j] - allocated[i][j]
             for j in range(n_resources)]
            for i in range(n_processes)]

    finished = [False] * n_processes
    safe_sequence = []
    work = available.copy()

    while len(safe_sequence) < n_processes:
        found = False
        for i in range(n_processes):
            if not finished[i] and all(need[i][j] <= work[j]
                                       for j in range(n_resources)):
                for j in range(n_resources):
                    work[j] += allocated[i][j]
                finished[i] = True
                safe_sequence.append(i)
                found = True
        if not found:
            return False, []

    return True, safe_sequence

# Example: 5 processes, 3 resources
available = [3, 3, 2]
allocated = [
    [0, 1, 0],
    [2, 0, 0],
    [3, 0, 2],
    [2, 1, 1],
    [0, 0, 2],
]
max_need = [
    [7, 5, 3],
    [3, 2, 2],
    [9, 0, 2],
    [2, 2, 2],
    [4, 3, 3],
]

safe, sequence = is_safe_state(available, allocated, max_need)
print(f"Safe state: {safe}")
if safe:
    print(f"Safe sequence: P{', P'.join(map(str, sequence))}")

Expected output:

Safe state: True
Safe sequence: P1, P3, P4, P0, P2

Lock Ordering

A simpler deadlock prevention strategy: always acquire locks in the same order across all threads.

import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_1():
    with lock_a:
        time.sleep(0.1)
        with lock_b:
            print("Thread 1: got both locks")

def thread_2():
    with lock_a:  # Same order as thread_1
        time.sleep(0.1)
        with lock_b:
            print("Thread 2: got both locks")

t1 = threading.Thread(target=thread_1)
t2 = threading.Thread(target=thread_2)
t1.start(); t2.start()
t1.join(); t2.join()
print("No deadlock — lock ordering works!")

If thread_2 acquired lock_b first, they’d deadlock.

The Dining Philosophers Problem

Five philosophers sit at a table with five forks. Each needs two forks to eat. This classic problem demonstrates deadlock and starvation.

import threading
import time
import random

class Philosopher(threading.Thread):
    def __init__(self, index, left_fork, right_fork):
        super().__init__()
        self.index = index
        self.left_fork = left_fork
        self.right_fork = right_fork

    def run(self):
        for _ in range(3):
            self.think()
            self.eat()

    def think(self):
        print(f"Philosopher {self.index} thinking...")
        time.sleep(random.random())

    def eat(self):
        # Acquire forks in order of index to prevent deadlock
        first = self.left_fork if self.index % 2 == 0 else self.right_fork
        second = self.right_fork if self.index % 2 == 0 else self.left_fork

        with first:
            with second:
                print(f"Philosopher {self.index} eating! 🍝")
                time.sleep(random.random())

forks = [threading.Lock() for _ in range(5)]
philosophers = [
    Philosopher(i, forks[i], forks[(i + 1) % 5])
    for i in range(5)
]

for p in philosophers:
    p.start()
for p in philosophers:
    p.join()
print("All philosophers finished!")

The trick: odd-indexed philosophers pick up the right fork first, breaking the circular wait.

Go Goroutines vs OS Threads

Go goroutines are lightweight user-space threads multiplexed onto OS threads by the Go runtime. They start with ~2KB stacks (vs ~1MB for OS threads) and can scale to millions.

package main

import (
    "fmt"
    "sync"
    "time"
)

func worker(id int, wg *sync.WaitGroup) {
    defer wg.Done()
    fmt.Printf("Worker %d starting\n", id)
    time.Sleep(time.Second)
    fmt.Printf("Worker %d done\n", id)
}

func main() {
    var wg sync.WaitGroup

    for i := 1; i <= 5; i++ {
        wg.Add(1)
        go worker(i, &wg)
    }

    wg.Wait()
    fmt.Println("All workers completed")
}

Expected output:

Worker 5 starting
Worker 1 starting
Worker 3 starting
Worker 4 starting
Worker 2 starting
Worker 2 done
Worker 1 done
Worker 5 done
Worker 3 done
Worker 4 done
All workers completed

Async/Await

Async/await enables cooperative concurrency within a single thread. Tasks voluntarily yield control at await points.

import asyncio

async def fetch_data(url, delay):
    print(f"Fetching {url}...")
    await asyncio.sleep(delay)  # Yield control
    data = f"Data from {url}"
    print(f"Got {len(data)} bytes from {url}")
    return data

async def main():
    tasks = [
        fetch_data("https://api.example.com/users", 2),
        fetch_data("https://api.example.com/posts", 1),
        fetch_data("https://api.example.com/comments", 3),
    ]
    results = await asyncio.gather(*tasks)
    for result in results:
        print(f"Result: {result}")

asyncio.run(main())

Expected output:

Fetching https://api.example.com/users...
Fetching https://api.example.com/posts...
Fetching https://api.example.com/comments...
Got 29 bytes from https://api.example.com/posts
Got 29 bytes from https://api.example.com/users
Got 32 bytes from https://api.example.com/comments
Result: Data from https://api.example.com/users
Result: Data from https://api.example.com/posts
Result: Data from https://api.example.com/comments

Lock-Free Data Structures

Lock-free data structures use atomic operations (like CAS — Compare-And-Swap) instead of mutexes. They avoid deadlocks and priority inversion.

import threading
import atomic  # Requires: pip install atomicwrites

# Python's threading provides atomic increment via锁
# but true lock-free structures use CAS
class LockFreeCounter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value

    def get(self):
        return self._value

Real lock-free structures in C++ use std::atomic<int> and compare_exchange_weak. For most application code, well-designed locks are sufficient.

Common Mistakes

1. Not protecting shared state

Accessing a shared list without a lock from multiple threads causes data corruption or crashes.

2. Deadlock from inconsistent lock ordering

Thread A locks X then Y; Thread B locks Y then X → deadlock. Always acquire in a consistent order.

3. Holding locks for too long

Holding a mutex during I/O or network calls blocks other threads. Keep critical sections small.

4. Forgetting to join threads

A program exits before threads finish. Always join threads before the main thread ends.

5. Over-using threads for CPU-bound work

More threads than CPU cores causes context switching overhead. Use os.cpu_count() to limit thread pool size.

6. Using async/await for CPU-heavy code

Async is great for I/O-bound tasks, not CPU-bound work. For CPU-bound, use threads or processes.

Practice Questions

What’s the difference between concurrency and parallelism? Concurrency is about dealing with many tasks at once (structure); parallelism is about executing many tasks at once (hardware).
What four conditions cause deadlock? Mutual exclusion, hold-and-wait, no preemption, circular wait. Break any one to prevent deadlock.
How does the banker’s algorithm prevent deadlock? It checks whether granting a resource request leaves the system in a safe state — one where all processes can eventually complete.
Why are goroutines lighter than OS threads? Goroutines start with ~2KB stacks that grow as needed, while OS threads have ~1MB fixed stacks. Goroutines are multiplexed onto OS threads by the runtime scheduler.
When should you use async/await instead of threads? For I/O-bound work (network, disk, databases) where latency dominates. Use threads or processes for CPU-bound computation.

Challenge

Implement a thread-safe bounded queue (producer-consumer) using a condition variable. The queue should block producers when full and consumers when empty.

Real-World Task

Use Python’s concurrent.futures.ThreadPoolExecutor to download 10 URLs concurrently. Measure the speedup vs sequential downloads. Security angle: malware scanners use this pattern to query multiple threat intelligence feeds in parallel.

FAQ

What is a race condition?

A race condition occurs when multiple threads access shared data concurrently and the result depends on the interleaving of operations. It’s a bug that produces non-deterministic results.

What is a mutex vs a semaphore?

A mutex allows one thread at a time into a critical section. A counting semaphore allows up to N threads. A binary semaphore (count=1) behaves like a mutex, but mutexes have ownership semantics (only the locking thread can unlock).

What is priority inversion?

A high-priority thread waits for a resource held by a low-priority thread, while a medium-priority thread preempts the low-priority thread. Solutions include priority inheritance and priority ceiling protocols.

Can you use asyncio with threads?

Yes. Use loop.run_in_executor() to offload blocking code to a thread pool while keeping the main event loop responsive.

What is the difference between threading and multiprocessing in Python?

Threading shares memory within one process (GIL-limited for CPU work). Multiprocessing spawns separate processes with isolated memory, bypassing the GIL.