Computational Complexity — P vs NP, Reductions & Complexity Classes
Computational complexity classifies problems by how hard they are to solve — not just how long a specific Algorithm takes, but the inherent difficulty of the problem itself.
What You’ll Learn
In this tutorial, you’ll learn time and space complexity notation (Big O, Theta, Omega), the famous P vs NP question, NP-completeness and reductions, amortised analysis, NP-hard problems, approximation algorithms, and the hierarchy of complexity classes.
Why It Matters
Knowing a problem is NP-complete saves you from wasting time trying to find a perfect polynomial-time solution. You instead look for approximations, heuristics, or special-case solutions. This distinction separates tractable from intractable problems in real-world engineering.
Real-World Use
When Google Maps finds the shortest route, it’s solving a problem in P (shortest path). When a logistics company optimises delivery routes, it’s solving the Travelling Salesman Problem — NP-hard. Durga Antivirus Pro uses heuristic pattern matching (an approximation) because exact malware signature matching is computationally infeasible at scale.
graph TD
subgraph "Complexity Class Hierarchy"
P[P - Polynomial]
NP[NP - Nondeterministic Polynomial]
NPC[NP-Complete]
NPH[NP-Hard]
EXP[EXPTIME]
end
P --> NP
NP --> NPC
NPC --> NPH
EXPTIME --> NPH
style P fill:#4f46e5,color:#fff
style NPC fill:#dc2626,color:#fff
Time Complexity: Big O, Theta, Omega
| Notation | Meaning | Example |
|---|---|---|
| O(g(n)) | Upper bound (worst case) | O(n²) means never worse than n² |
| Ω(g(n)) | Lower bound (best case) | Ω(n) means at least n steps |
| Θ(g(n)) | Tight bound | Θ(n log n) means both O and Ω are n log n |
import time
import random
def measure_complexity(n, func, name):
data = list(range(n))
start = time.time()
func(data)
elapsed = time.time() - start
print(f'{name:20s} n={n:6d}: {elapsed:.6f}s')
# O(1) - Constant
def constant(data):
return data[0] if data else None
# O(n) - Linear
def linear(data):
total = 0
for x in data:
total += x
return total
# O(n²) - Quadratic
def quadratic(data):
for i in range(len(data)):
for j in range(len(data)):
_ = data[i] + data[j]
for n in [100, 1000, 10000]:
measure_complexity(n, constant, 'O(1) constant')
measure_complexity(n, linear, 'O(n) linear')
if n <= 1000:
measure_complexity(n, quadratic, 'O(n²) quadratic')
print()Expected output (approximate):
O(1) constant n= 100: 0.000001s
O(n) linear n= 100: 0.000005s
O(n²) quadratic n= 100: 0.000402s
O(1) constant n= 1000: 0.000001s
O(n) linear n= 1000: 0.000048s
O(n²) quadratic n= 1000: 0.041203s
O(1) constant n= 10000: 0.000001s
O(n) linear n= 10000: 0.000481sP vs NP
P (Polynomial time): problems solvable in O(n^k) time on a deterministic machine. Examples: sorting, shortest path, matrix multiplication, linear programming.
NP (Nondeterministic Polynomial time): problems whose solutions can be verified in polynomial time, even if finding them is hard. Examples: Sudoku, SAT, graph colouring, Hamiltonian path.
The P vs NP question: Is P = NP? If yes, every problem whose solution can be quickly verified can also be quickly solved. Most computer scientists believe P ≠ NP.
import itertools
def verify_hamiltonian_path(graph, path):
"""Verify a candidate Hamiltonian path in polynomial time"""
if len(path) != len(graph):
return False
visited = set()
for i in range(len(path) - 1):
if path[i] not in graph or path[i + 1] not in graph[path[i]]:
return False
visited.add(path[i])
visited.add(path[-1])
return len(visited) == len(graph)
def brute_force_hamiltonian(graph):
"""Find Hamiltonian path by trying all permutations (exponential)"""
vertices = list(graph.keys())
for perm in itertools.permutations(vertices):
if verify_hamiltonian_path(graph, perm):
return list(perm)
return None
# Small graph: 5 vertices
graph = {
0: [1, 2],
1: [0, 2, 3],
2: [0, 1, 4],
3: [1, 4],
4: [2, 3],
}
path = brute_force_hamiltonian(graph)
print(f'Hamiltonian path: {path}')
# Verification is fast (O(n)), finding took O(n!)
# Scaling: try with n+1 vertices
graph2 = {i: [(i+1) % 6, (i+2) % 6] for i in range(6)}
path2 = brute_force_hamiltonian(graph2)
print(f'Hamiltonian path (n=6): {path2}')Expected output:
Hamiltonian path: [0, 1, 2, 4, 3]
Hamiltonian path (n=6): [0, 1, 3, 5, 2, 4]NP-Completeness and Reductions
A problem is NP-complete if:
- It is in NP (solutions verifiable in polynomial time)
- Every problem in NP can be reduced to it in polynomial time
Reduction: transform instances of problem A to instances of problem B such that a solution to B gives a solution to A. If A reduces to B and B is in P, then A is in P too.
SAT (Boolean Satisfiability)
SAT was the first problem proven NP-complete (Cook-Levin theorem). Given a boolean formula, is there an assignment that makes it true?
def sat_solver_bruteforce(formula, variables):
"""Try all assignments — exponential, but verification is polynomial"""
for bits in range(1 << len(variables)):
assignment = {}
for i, var in enumerate(variables):
assignment[var] = bool((bits >> i) & 1)
if eval(formula, {}, assignment):
return assignment
return None
# Simple SAT instance: (x AND y) OR (NOT x AND z)
formula = "(x and y) or (not x and z)"
result = sat_solver_bruteforce(formula, ['x', 'y', 'z'])
print(f'SAT result: {result}')
# UNSAT: (x) AND (NOT x)
formula2 = "x and not x"
result2 = sat_solver_bruteforce(formula2, ['x'])
print(f'SAT result: {result2}')3-SAT
3-SAT restricts SAT to clauses with exactly 3 literals. Despite the restriction, 3-SAT is still NP-complete. It’s the canonical problem used in most NP-completeness reductions.
Vertex Cover
Given a graph, find the smallest set of vertices that covers all edges. Reducing Vertex Cover to SAT:
def vertex_cover_to_sat(graph, k):
"""Reduce k-Vertex Cover to SAT (simplified)"""
clauses = []
vertices = list(graph.keys())
# At least one vertex per edge
for u in vertices:
for v in graph.get(u, []):
if u < v: # Avoid duplicates
clauses.append(f'(v{u} or v{v})')
# At most k vertices selected (simplified cardinality constraint)
# Full encoding uses sequential counter or sorting networks
return ' and '.join(f'({c})' for c in clauses)
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
sat_instance = vertex_cover_to_sat(graph, 2)
print(f'SAT encoding of Vertex Cover (k=2):')
print(sat_instance[:150] + '...')Amortised Analysis
Amortised analysis averages the cost of operations over a sequence, even if individual operations are expensive.
Dynamic Array (ArrayList)
Appending to a dynamic array is O(1) amortised: each O(n) resize happens after n O(1) appends.
class DynamicArray:
def __init__(self):
self.capacity = 1
self.size = 0
self.data = [None] * self.capacity
def append(self, value):
if self.size == self.capacity:
# Double the array — O(n)
new_data = [None] * (self.capacity * 2)
for i in range(self.size):
new_data[i] = self.data[i]
self.data = new_data
self.capacity *= 2
print(f' Resize to {self.capacity}')
self.data[self.size] = value
self.size += 1
def total_cost(self, n):
return sum(1 for _ in range(n)
if self.size == self.capacity or True)
arr = DynamicArray()
print('Appending 16 elements (resizes shown):')
for i in range(16):
arr.append(i)
print(f'Final capacity: {arr.capacity}, size: {arr.size}')
print(f'Amortised cost: O(1) — resizes are O(n) but rare')Expected output:
Appending 16 elements (resizes shown):
Resize to 2
Resize to 4
Resize to 8
Resize to 16
Final capacity: 16, size: 16
Amortised cost: O(1) — resizes are O(n) but rareNP-Hard Problems
NP-hard problems are at least as hard as NP-complete problems, but they may not be in NP (solutions may not be verifiable in polynomial time).
| Problem | Type | Description |
|---|---|---|
| Halting problem | Undecidable | Cannot be solved by any algorithm |
| Travelling Salesman (optimisation) | NP-hard | Find shortest tour of cities |
| Knapsack | NP-hard (weak) | Maximise value with weight limit |
| Graph isomorphism | NP-intermediate? | Are two graphs structurally identical? |
| Integer programming | NP-hard | Linear programming with integer constraints |
Approximation Algorithms
For NP-hard problems, we settle for near-optimal solutions with guaranteed bounds.
def approximate_tsp(graph):
"""Nearest neighbour heuristic for TSP (2-approximation)"""
start = list(graph.keys())[0]
visited = {start}
path = [start]
current = start
while len(visited) < len(graph):
nearest = min(
(city for city in graph[current]
if city not in visited),
key=lambda c: graph[current][c]
)
visited.add(nearest)
path.append(nearest)
current = nearest
path.append(start) # Return to start
return path
# City graph: distances between 5 cities
cities = {
'A': {'B': 10, 'C': 15, 'D': 20, 'E': 25},
'B': {'A': 10, 'C': 35, 'D': 25, 'E': 20},
'C': {'A': 15, 'B': 35, 'D': 30, 'E': 10},
'D': {'A': 20, 'B': 25, 'C': 30, 'E': 40},
'E': {'A': 25, 'B': 20, 'C': 10, 'D': 40},
}
tour = approximate_tsp(cities)
print(f'Approximate TSP tour: {" → ".join(tour)}')
total = sum(cities[tour[i]][tour[i+1]]
for i in range(len(tour)-1))
print(f'Total distance: {total}')Complexity Classes Hierarchy
EXPTIME ── problems solvable in exponential time
↑
PSPACE ── problems solvable with polynomial memory
↑
NP ── problems verifiable in polynomial time
↑
P ── problems solvable in polynomial time
↑
NC ── problems solvable in polylogarithmic parallel time
↑
L ── problems solvable with logarithmic spaceCommon Mistakes
1. Equating “can’t find a fast algorithm” with “it’s NP-hard”
Just because you can’t find a polynomial algorithm doesn’t mean no one else can. NP-hardness requires a proof via reduction from a known NP-hard problem.
2. Confusing O and Θ
O is an upper bound; Θ is a tight bound. “Quicksort is O(n²)” is true (n² is an upper bound) but misleading. “Quicksort is Θ(n²) worst-case” is precise.
3. Thinking P problems are always fast
O(n¹⁰⁰) is polynomial but impractical. Not all polynomial algorithms are efficient.
4. Forgetting space complexity
A linear memory algorithm is worse than a constant-memory one, even with the same time complexity. Especially important in embedded systems.
5. Applying worst-case analysis to average-case algorithms
Quicksort is O(n²) worst-case but O(n log n) average. Use randomised analysis for algorithms with probabilistic guarantees.
6. Ignoring problem size
O(2^n) with n=20 is ~1 million operations (fine). O(2^n) with n=100 is ~10³⁰ (impossible). Problem scale matters.
Practice Questions
What is the P vs NP question? Whether every problem whose solution can be verified quickly (NP) can also be solved quickly (P). Most believe P ≠ NP.
What makes a problem NP-complete? It must be in NP (verifiable in polynomial time) and NP-hard (every NP problem reduces to it in polynomial time).
What is a reduction? A polynomial-time transformation of one problem’s instances to another’s, such that the second problem’s solution solves the first.
Why amortised analysis matters? It gives a realistic cost for sequences of operations, accounting for rare expensive operations by spreading their cost across many cheap ones.
What is the time complexity of binary search? Θ(log n) — it halves the search space each step.
Challenge
Prove that the Clique problem (find a fully connected subgraph of size k) is NP-complete by reducing from 3-SAT. Then implement a brute-force clique finder.
Real-World Task
For a real-world problem (find the cheapest way to connect 10 cities with roads), decide: is this in P (minimum spanning tree), or is it NP-hard (TSP)? What algorithm solves it?
FAQ
Mini Project: NP-Complete Problem Solver
Build a tool that:
- Encodes a simple 3-SAT instance from a user-provided formula
- Attempts to solve it with a brute-force SAT solver
- Visualises the runtime growth as the number of variables increases
- Identifies when the problem becomes intractable (>30 variables)
Security angle: Cryptographic systems like RSA rely on the presumed hardness of integer factorisation (not known to be NP-complete, but believed hard). Durga Antivirus Pro uses heuristic analysis to detect malware because exact signature matching is NP-hard at scale.
What’s Next
Before moving on, you should understand:
- Big O, Θ, and Ω notation and their meanings
- The P vs NP problem and why it matters
- How reductions prove NP-completeness
- The difference between NP-complete and NP-hard
- Amortised analysis and approximation algorithms
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro