System Design Interview: Complete Preparation Guide
System design interviews evaluate your ability to architect large-scale distributed systems. Unlike coding interviews, there’s no single correct answer — interviewers assess your process, tradeoffs, and communication as you design a system from scratch.
Learning Path
flowchart LR
A["DSA Patterns<br/>Foundation"] --> B["System Design<br/>Framework"]
B --> C["Practice Problems<br/>URL Shortener, Chat, Twitter"]
C --> D["Mock Interviews<br/>System Design Focus"]
style B fill:#f90,color:#fff,stroke-width:2px
System Design Framework
A reliable 6-step framework that works for any system design problem:
flowchart TD
A["1. Requirements<br/>Functional + Non-functional"] --> B["2. Estimation<br/>Traffic, Storage, Bandwidth"]
B --> C["3. Data Model<br/>Schema + Storage Choice"]
C --> D["4. High-Level Design<br/>Components + Data Flow"]
D --> E["5. Deep Dive<br/>Bottlenecks + Scaling"]
E --> F["6. Tradeoffs<br/>What you chose and why"]
style A fill:#1a73e8,color:#fff
style B fill:#34a853,color:#fff
style C fill:#fbbc04,color:#333
style D fill:#ea4335,color:#fff
style E fill:#ab47bc,color:#fff
style F fill:#1a73e8,color:#fff
Step 1: Requirements Gathering (5 min)
Always start by clarifying what you’re building. Don’t assume anything.
Functional requirements (what the system does):
- Can users create short URLs?
- Do we need analytics (clicks, geolocation)?
- Should URLs have expiration?
Non-functional requirements (quality attributes):
- Read vs write ratio (typical: 100:1 reads to writes)
- Latency requirements (P99 < 200ms)
- Availability target (99.9% or 99.99%?)
- Consistency vs availability tradeoff
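To make the availability question concrete, it helps to translate a target into a downtime budget. The sketch below is only an illustration; the function name `downtime_budget_minutes` and the 365-day year are assumptions, not part of the original guide.

```python
def downtime_budget_minutes(availability_percent, period_days=365):
    """Return the allowed downtime in minutes for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * 24 * 60


for target in (99.9, 99.99):
    per_year = downtime_budget_minutes(target)
    print(f"{target}% -> {per_year:.1f} min/year")
```

Under these assumptions, 99.9% allows roughly 526 minutes of downtime per year and 99.99% roughly 53 minutes, which is why the extra "nine" usually changes the design (redundancy, failover) rather than just the number.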
def gather_requirements():
    """Template for collecting system design requirements."""
    requirements = {
        "functional": [
            "Create shortened URLs",
            "Redirect to original URL",
            "Custom alias support",
            "Analytics (click count, referrer)",
            "URL expiration",
            "REST API for programmatic access",
        ],
        "non_functional": {
            "traffic": "100M URLs created/month",
            "reads": "10K redirects/second peak",
            "latency_p99_ms": 200,
            "availability": "99.99%",
            "consistency": "Eventual for analytics, strong for redirects",
        },
    }
    return requirements


reqs = gather_requirements()
print("Functional:", len(reqs["functional"]), "items")
print(f"Traffic: {reqs['non_functional']['traffic']}")
# The "reads" value already carries its unit, so print it as-is.
print(f"Redirects: {reqs['non_functional']['reads']}")
Expected output:
Functional: 6 items
Traffic: 100M URLs created/month
Redirects: 10K redirects/second peak
Step 2: Estimation (5 min)
Back-of-the-envelope calculations show you can think at scale.
URL Shortener Estimations:
- 100M new URLs/month → ~40 URLs/second writes
- 10K redirects/second reads
- Storage: 100M URLs × 500 bytes ≈ 50 GB/month → 600 GB/year
- Bandwidth: 10K/s × 500 bytes response ≈ 5 MB/s → about 13 TB/month if the peak rate were sustained
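The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a 30-day month and decimal units (1 GB = 10^9 bytes); the `estimate` function and its parameter names are made up for illustration. Because it does not round intermediate values, some figures differ slightly from hand-rounded ones.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def estimate(urls_per_month, peak_reads_per_sec, bytes_per_record):
    """Back-of-the-envelope numbers for a read-heavy service."""
    writes_per_sec = urls_per_month / SECONDS_PER_MONTH
    return {
        "writes_per_sec": writes_per_sec,
        "read_write_ratio": peak_reads_per_sec / writes_per_sec,
        "storage_gb_per_year": urls_per_month * 12 * bytes_per_record / 1e9,
        "bandwidth_mb_per_sec": peak_reads_per_sec * bytes_per_record / 1e6,
    }


result = estimate(100_000_000, 10_000, 500)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

In an interview the exact digits matter less than the order of magnitude: tens of writes per second, thousands of reads per second, and hundreds of gigabytes per year.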
Write throughput: 100,000,000 / (30 × 24 × 3600) ≈ 38 writes/second
Read throughput: 10,000 reads/second (peak)
Read-to-write ratio: 10,000 / 38 ≈ 263:1
Storage per year: 100M × 12 × 500 bytes ≈ 600 GB
Cache-miss traffic to the DB (80% cache hit rate): 10K × 0.2 × 200 bytes ≈ 0.4 MB/s
Step 3: Data Model (10 min)
Design the schema and choose storage technology.
-- URL Shortener Schema
CREATE TABLE urls (
    short_key    VARCHAR(11) PRIMARY KEY,  -- a 63-bit ID needs up to 11 Base62 characters
    original_url TEXT NOT NULL,
    user_id      BIGINT,
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at   TIMESTAMP,
    click_count  BIGINT DEFAULT 0
);

CREATE TABLE click_events (
    id         BIGSERIAL PRIMARY KEY,
    short_key  VARCHAR(11) REFERENCES urls(short_key),
    timestamp  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ip_address INET,
    user_agent TEXT,
    referrer   TEXT
);

CREATE INDEX idx_click_events_short_key ON click_events(short_key);
CREATE INDEX idx_urls_user_id ON urls(user_id);
Storage choice:
- Primary DB: PostgreSQL or CockroachDB for strong consistency on redirects
- Cache: Redis for hot URLs (LRU eviction, 80% cache hit target)
- Analytics: Cassandra or ClickHouse for time-series click data
- CDN: CloudFront/Cloudflare for regional edge caching
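The cache in front of the primary database is commonly used in a cache-aside fashion: check the cache, fall back to the database on a miss, then populate the cache. The sketch below illustrates that flow with an in-memory LRU standing in for Redis and a dictionary standing in for the `urls` table; `LRUCache`, `DATABASE`, and `resolve` are hypothetical names, and a real deployment would use a Redis client instead.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny in-memory LRU cache standing in for Redis (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


# Hypothetical stand-in for the urls table in the primary database.
DATABASE = {"abc123": "https://example.com/a", "xyz789": "https://example.com/b"}
cache = LRUCache(capacity=1)
db_reads = 0


def resolve(short_key):
    """Cache-aside lookup: try the cache, fall back to the DB, then populate."""
    global db_reads
    url = cache.get(short_key)
    if url is not None:
        return url
    db_reads += 1
    url = DATABASE.get(short_key)
    if url is not None:
        cache.put(short_key, url)
    return url


resolve("abc123")  # miss: reads the database
resolve("abc123")  # hit: served from the cache
resolve("xyz789")  # miss: evicts abc123 because capacity is 1
print("database reads:", db_reads)
```

The capacity of 1 is deliberately tiny so the eviction is visible; the hit rate in practice depends on how skewed the traffic is toward hot URLs.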
Step 4: High-Level Design (15 min)
Draw the system architecture and data flow.
flowchart TD
Client["User/Browser"] --> LB["Load Balancer"]
LB --> WS["Web Servers"]
WS --> Cache["Redis Cache"]
WS --> DB[("Primary DB<br/>PostgreSQL")]
WS --> Queue["Message Queue<br/>Kafka"]
Queue --> Analytics["Analytics Workers"]
Analytics --> ADB[("Analytics DB<br/>ClickHouse")]
DB --> Replica[("Read Replicas")]
LB --> CDN["CDN Edge Cache"]
style Client fill:#4285f4,color:#fff
style Cache fill:#ff6d01,color:#fff
style DB fill:#34a853,color:#fff
style Queue fill:#ea4335,color:#fff
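One point of the diagram above is that analytics stays off the redirect path: the web server enqueues a click event and returns, and workers aggregate later. The following sketch shows that split using Python's in-process `queue.Queue` as a stand-in for Kafka. The names `handle_redirect`, `drain_clicks`, and `URLS` are hypothetical.

```python
import queue
from collections import Counter

click_queue = queue.Queue()  # stand-in for a Kafka topic
URLS = {"abc123": "https://example.com/a"}  # stand-in for cache + primary DB


def handle_redirect(short_key, referrer):
    """Request path: look up the URL, enqueue a click event, return at once."""
    target = URLS.get(short_key)
    if target is not None:
        click_queue.put({"short_key": short_key, "referrer": referrer})
    return target


def drain_clicks(counts):
    """Worker path: consume queued events and aggregate click counts."""
    while True:
        try:
            event = click_queue.get_nowait()
        except queue.Empty:
            return counts
        counts[event["short_key"]] += 1


handle_redirect("abc123", "news.example")
handle_redirect("abc123", "mail.example")
handle_redirect("missing", "news.example")  # unknown key: no event recorded
totals = drain_clicks(Counter())
print(dict(totals))
```

Because the counts are applied after the fact, analytics is eventually consistent while the redirect itself stays fast.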
Step 5: Deep Dive (15 min)
Focus on the most interesting or bottleneck component.
Key deep dive topics for URL shortener:
Key generation: encode a unique numeric ID in Base62. The ID can come from a Twitter Snowflake-style generator or any other distributed ID generator:
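import threading
import time


class SnowflakeID:
    """Distributed unique ID generator (Twitter Snowflake-style)."""

    EPOCH_MS = 1609459200000  # custom epoch: 2021-01-01 00:00:00 UTC

    def __init__(self, worker_id, datacenter_id):
        # 5 bits each for datacenter and worker, 12 bits for the sequence.
        if not (0 <= worker_id < 32 and 0 <= datacenter_id < 32):
            raise ValueError("worker_id and datacenter_id must be in 0..31")
        self.worker_id = worker_id
        self.datacenter_id = datacenter_id
        self.sequence = 0
        self.last_timestamp = -1
        self.lock = threading.Lock()

    def _gen_timestamp(self):
        return int(time.time() * 1000)

    def get_id(self):
        with self.lock:
            timestamp = self._gen_timestamp()
            if timestamp < self.last_timestamp:
                raise RuntimeError("clock moved backwards; refusing to generate IDs")
            if timestamp == self.last_timestamp:
                self.sequence = (self.sequence + 1) & 4095
                if self.sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while timestamp <= self.last_timestamp:
                        timestamp = self._gen_timestamp()
            else:
                self.sequence = 0
            self.last_timestamp = timestamp
            new_id = (timestamp - self.EPOCH_MS) << 22
            new_id |= self.datacenter_id << 17
            new_id |= self.worker_id << 12
            new_id |= self.sequence
            return new_id


gen = SnowflakeID(worker_id=1, datacenter_id=1)
print(f"ID: {gen.get_id()}")
print(f"ID: {gen.get_id()}")
Expected output: two distinct 64-bit integers, the second larger than the first. The exact values depend on the current time, so no fixed numbers are shown here.
Base62 encoding (short key from numeric ID):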
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def encode_base62(num):
    """Convert a non-negative integer to a Base62 string."""
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return BASE62[0]
    result = []
    while num > 0:
        result.append(BASE62[num % 62])
        num //= 62
    return "".join(reversed(result))


def decode_base62(s):
    """Convert a Base62 string back to an integer."""
    result = 0
    for char in s:
        # str.index raises ValueError for characters outside the alphabet.
        result = result * 62 + BASE62.index(char)
    return result


# Round-trip test
for n in [0, 1, 61, 62, 123456789]:
    encoded = encode_base62(n)
    decoded = decode_base62(encoded)
    print(f"{n:>10} → {encoded:>7} → {decoded}")
Expected output:
         0 →       0 → 0
         1 →       1 → 1
        61 →       z → 61
        62 →      10 → 62
 123456789 →   8M0kX → 123456789
Step 6: Tradeoffs (5 min)
Every design decision involves tradeoffs. Explicitly discuss them:
| Decision | Alternative | Why You Chose This |
|---|---|---|
| PostgreSQL | Cassandra (NoSQL) | Strong consistency needed for redirects |
| Snowflake ID | UUID or a single sequential counter | Snowflake IDs are time-ordered and fit in 64 bits; a single sequential counter becomes a contention point |
| Redis cache | Memcached (simpler) | Persistence, richer data structures, built-in LRU eviction |
| Monolith | Microservices | ~40 writes/sec doesn’t need microservices; simpler ops, lower latency, one codebase |
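The Snowflake-versus-UUID row can be made concrete by comparing how long the resulting short keys would be. The sketch below counts the Base62 digits needed for the largest positive signed 64-bit value and for the largest 128-bit value; `base62_length` is a hypothetical helper written for this illustration.

```python
def base62_length(max_value):
    """Number of Base62 digits needed to represent max_value."""
    digits = 1
    while max_value >= 62:
        max_value //= 62
        digits += 1
    return digits


snowflake_chars = base62_length(2**63 - 1)  # largest positive signed 64-bit value
uuid_chars = base62_length(2**128 - 1)      # largest 128-bit value
print(f"64-bit ID: up to {snowflake_chars} Base62 characters")
print(f"128-bit UUID: up to {uuid_chars} Base62 characters")
```

A 64-bit ID needs at most 11 characters while a full UUID needs up to 22, which is worth checking against the width of the key column and against how short the URL is supposed to be.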
Practice Problems
Working through these five problems covers most of the themes that recur in system design interviews:
- URL Shortener (e.g., bit.ly) — Key generation, redirection, analytics, caching
- Chat System (e.g., WhatsApp) — WebSockets, message ordering, presence, offline storage
- Social Media Feed (e.g., Twitter) — Fan-out on write vs read, timeline generation, ranking
- Video Streaming (e.g., YouTube) — Upload pipeline, transcoding, CDN, recommendation
- Rate Limiter — Token bucket, sliding window, distributed rate limiting
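As a taste of the last problem in the list, here is a minimal single-process token bucket. It is a sketch under stated assumptions: the `TokenBucket` class, its parameters, and the fake clock are all illustrative, and a distributed limiter would need shared state (for example in a cache) rather than instance attributes.

```python
import time


class TokenBucket:
    """Token bucket: refills `rate` tokens/second, allows bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self, cost=1):
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Refill for the time elapsed, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Deterministic demo with a fake clock (a hypothetical helper, not a library API).
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
results = [bucket.allow(), bucket.allow(), bucket.allow()]  # burst of 3 at t=0
fake_now[0] = 1.0  # one second later, one token has been refilled
results.append(bucket.allow())
print(results)  # [True, True, False, True]
```

Injecting the clock keeps the example testable; in production code the default `time.monotonic` avoids problems with wall-clock adjustments.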
Scalability Concepts You Must Know
| Concept | Why It Matters |
|---|---|
| Horizontal vs Vertical Scaling | Add more servers vs make servers bigger. Know when each is appropriate. |
| Database Indexing | B-tree for range queries, Hash for point lookups, GIN for full-text. |
| Caching Strategies | Cache aside, read through, write through, write behind. LRU vs LFU eviction. |
| Consistency Models | Strong vs eventual vs causal consistency. CAP theorem tradeoffs. |
| Partitioning (Sharding) | Hash-based, range-based, directory-based. Rebalancing challenges. |
| Replication | Leader-follower, leader-leader, quorum-based reads/writes. |
| Load Balancing | Round robin, least connections, consistent hashing for cache affinity. |
| Message Queues | Kafka for high throughput, RabbitMQ for reliable delivery, SQS for simplicity. |
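Consistent hashing, mentioned in the load balancing row, is easier to reason about with a small example. The sketch below builds a hash ring with virtual nodes and measures how many keys change owner when a fourth node is added. The `HashRing` class, node names, and the choice of SHA-256 with 100 virtual nodes are assumptions made for this illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=100):
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 interpreted as an unsigned integer.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"key{i}" for i in range(1000)]
before = HashRing(["cache-a", "cache-b", "cache-c"])
after = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved after adding a node")
```

Only keys that now belong to the new node move, which is the property that makes this scheme attractive compared with `hash(key) % n`, where changing `n` remaps most keys.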
Common System Design Mistakes
- Jumping to solution without clarifying requirements — Always confirm functional and non-functional requirements first. Building the wrong system is worse than building a system poorly.
- Ignoring storage estimation — You can’t design a system without knowing how much data it needs to store. Calculate storage requirements before choosing a database.
- Single-server mindset — Everything runs on one server. Interviewers want to see distributed thinking — sharding, replication, load balancers, CDNs.
- Not discussing tradeoffs — Every decision has tradeoffs. Saying “we’ll use NoSQL” without explaining why SQL doesn’t fit shows shallow thinking.
- Over-engineering — Adding Kubernetes, microservices, and 15 caches for a system serving 100 users. Start simple and scale up as requirements grow.
- Ignoring failure modes — What happens when the database goes down? When cache is cold? When traffic spikes 10x? Discuss resilience patterns.
- No monitoring or observability — Every production system needs metrics, logging, and alerting. Mention tools like Prometheus, Grafana, and structured logging.
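For the failure-mode discussion in the list above, one commonly cited resilience pattern is retrying with exponential backoff. The sketch below is one possible shape for it; `call_with_retries`, `flaky_lookup`, and the default delays are hypothetical, and real systems often add jitter and a cap on the total wait.

```python
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, factor=2,
                      sleep=time.sleep):
    """Retry `operation` on ConnectionError, growing the wait between attempts."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * factor**attempt)


# Demo: a hypothetical dependency that fails twice, then recovers.
calls = {"count": 0}


def flaky_lookup():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("database unavailable")
    return "https://example.com/a"


waits = []  # record the requested delays instead of actually sleeping
result = call_with_retries(flaky_lookup, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the demo instant and makes the backoff schedule easy to inspect.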
Practice Questions
1. What’s the first step in any system design interview? Clarify requirements — both functional (what the system does) and non-functional (scale, latency, availability, consistency).
2. How do you estimate storage needs for a URL shortener? Calculate: URLs created per month × average storage per URL (500 bytes for key, URL, metadata) × retention period. Add analytics storage separately.
3. When would you choose eventual consistency over strong consistency? For read-heavy systems where freshness isn’t critical (analytics, timelines, recommendations). Never for financial transactions or systems where correctness depends on order.
4. What’s the difference between vertical and horizontal scaling? Vertical = bigger machine (limited, single point of failure). Horizontal = more machines (elastic, fault-tolerant, but more complex).
5. Challenge: Design a real-time leaderboard. Design a system that shows the top 100 scores in a multiplayer game with 10M daily active users. Requirements: update scores in real time (sub-second), handle 100K writes/second, and serve 1M reads/second. Discuss your data model, storage choice, caching strategy, and tradeoffs.
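A possible starting point for the leaderboard challenge is to pin down the core operations before thinking about scale. The sketch below keeps each player's best score in memory and returns the top K. The `Leaderboard` class is a hypothetical single-process stand-in, not a solution to the stated throughput targets, which would need a sorted structure in a shared store plus caching of the top-100 result.

```python
import heapq


class Leaderboard:
    """In-memory leaderboard keeping each player's best score (sketch only)."""

    def __init__(self):
        self._best = {}  # player_id -> best score seen so far

    def submit(self, player_id, score):
        if score > self._best.get(player_id, float("-inf")):
            self._best[player_id] = score

    def top(self, k):
        """Return the k highest (player_id, score) pairs, best first."""
        return heapq.nlargest(k, self._best.items(), key=lambda item: item[1])


board = Leaderboard()
for player, score in [("ann", 50), ("bob", 70), ("ann", 90), ("cy", 60), ("bob", 10)]:
    board.submit(player, score)
print(board.top(2))  # [('ann', 90), ('bob', 70)]
```

From here the interview discussion can focus on what breaks first at 100K writes/second: the single dictionary, the full scan in `top`, or the lack of replication.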
Mini Project: System Design Template
from dataclasses import dataclass, field
from typing import List, Dict


@dataclass
class SystemDesign:
    """Template for practicing system design interviews."""
    problem: str
    requirements: Dict = field(default_factory=dict)
    estimation: Dict = field(default_factory=dict)
    data_model: str = ""
    high_level_design: List[str] = field(default_factory=list)
    deep_dive_topics: List[str] = field(default_factory=list)
    tradeoffs: List[tuple] = field(default_factory=list)

    def present(self):
        """Print every section of the design in interview order."""
        print(f"\n{'='*50}")
        print(f"SYSTEM DESIGN: {self.problem}")
        print(f"{'='*50}")
        print(f"\n📋 Requirements:")
        for k, v in self.requirements.items():
            print(f"  {k}: {v}")
        print(f"\n📊 Estimation:")
        for k, v in self.estimation.items():
            print(f"  {k}: {v}")
        print(f"\n🗄️ Data Model:")
        print(f"  {self.data_model}")
        print(f"\n🏗️ High-Level Components:")
        for c in self.high_level_design:
            print(f"  - {c}")
        print(f"\n🔍 Deep Dive:")
        for t in self.deep_dive_topics:
            print(f"  - {t}")
        print(f"\n⚖️ Tradeoffs:")
        for decision, alt, reason in self.tradeoffs:
            print(f"  {decision} over {alt}: {reason}")


design = SystemDesign(problem="URL Shortener")
design.requirements = {
    "functional": ["Create short URLs", "Redirect", "Analytics"],
    "non-functional": {"reads": "10K/s", "writes": "40/s", "latency": "<200ms"},
}
# Bandwidth assumes the 5 MB/s peak rate sustained over a 30-day month.
design.estimation = {"storage": "600 GB/year", "bandwidth": "~13 TB/month"}
design.data_model = "PostgreSQL urls + click_events tables"
design.high_level_design = ["Load Balancer", "Web Servers", "Redis Cache", "PostgreSQL", "Kafka", "ClickHouse"]
design.deep_dive_topics = ["Key generation (Snowflake + Base62)", "Caching strategy", "Database sharding"]
design.tradeoffs = [
    ("PostgreSQL", "Cassandra", "Strong consistency needed"),
    ("Snowflake", "UUID", "Ordered, fits in 64-bit"),
    ("Monolith", "Microservices", "40 writes/sec, simpler ops"),
]
design.present()
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.