System Design Interview Prep — Complete Guide
System design interviews assess your ability to architect large-scale distributed systems — how you handle millions of users, billions of requests, and the tradeoffs between consistency, availability, and performance.
What You’ll Learn
By the end of this tutorial, you’ll have a structured framework for tackling system design questions, understand key concepts like load balancing, caching, database sharding, and CDNs, and be able to design a scalable web service from scratch.
Why System Design Matters
For mid-level and senior engineering roles, system design interviews are often the deciding factor. They test your ability to think at scale, make architectural tradeoffs, and communicate complex ideas clearly. At DodaTech, our products (Doda Browser, DodaZIP, and Durga Antivirus Pro) all handle millions of users, and our architects use these exact patterns daily.
System Design Learning Path
flowchart LR
A[DSA Review] --> B[System Design]
B --> C{You Are Here}
C --> D[Design a URL Shortener]
C --> E[Design a Chat System]
style C fill:#f90,color:#fff
The System Design Framework — SCAF
Use this four-step framework for every system design question:
1. Scope — Define Requirements
Ask clarifying questions to understand what you’re building:
Functional requirements (what the system must do):
- “What are the core features?”
- “Can users create and read content?”
- “Is this read-heavy or write-heavy?”
Non-functional requirements (quality attributes):
- “How many daily active users?”
- “What’s the expected latency? (P99 < 200ms)”
- “Do we need 99.99% availability?”
Example (URL Shortener):
- Functional: Create short URLs, redirect to original, track clicks
- Non-functional: 100M URLs created/month, 10B redirects/month, P99 latency < 100ms
2. Calculate — Estimate Scale
Make rough calculations to determine system requirements:
| Metric | Daily | Monthly | Per Second |
|---|---|---|---|
| Writes (new URLs) | 3.3M | 100M | ~40 |
| Reads (redirects) | 333M | 10B | ~3,900 |
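| Storage (10-year total) | n/a | n/a | ~6TB |
| Bandwidth (redirects) | ~170GB | ~5TB | ~2MB/s |
Assumptions: Each URL record is ~500 bytes, reads are 100x writes, and a month has 30 days. Storage is 100M records × 12 months × 10 years × 500 bytes. Bandwidth counts redirect traffic only (333M reads × 500 bytes per day).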
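The arithmetic behind these estimates can be checked with a few lines of Python. This is a minimal sketch that uses only the stated inputs (100M writes and 10B reads per month, ~500 bytes per record); the 30-day month and the function names are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: a 30-day month keeps the arithmetic simple


def per_second(monthly_count: float) -> float:
    """Convert a monthly volume into an average per-second rate."""
    return monthly_count / (DAYS_PER_MONTH * SECONDS_PER_DAY)


def storage_bytes(monthly_writes: int, record_bytes: int, years: int) -> int:
    """Total bytes stored if every record is kept for the whole period."""
    return monthly_writes * 12 * years * record_bytes


writes_per_month = 100_000_000
reads_per_month = 10_000_000_000
record_bytes = 500

write_qps = per_second(writes_per_month)
read_qps = per_second(reads_per_month)
storage_tb = storage_bytes(writes_per_month, record_bytes, years=10) / 1e12
read_bandwidth_mb_s = read_qps * record_bytes / 1e6

print(f"writes/s  ~ {write_qps:.0f}")
print(f"reads/s   ~ {read_qps:.0f}")
print(f"storage   ~ {storage_tb:.1f} TB over 10 years")
print(f"bandwidth ~ {read_bandwidth_mb_s:.1f} MB/s for redirects")
```

Under these assumptions the script reports about 39 writes/s, 3,858 reads/s, 6.0 TB of storage over ten years, and roughly 1.9 MB/s of redirect bandwidth. In an interview, rounding these to 40, 4,000, 6 TB, and 2 MB/s is usually precise enough.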
3. Architect — Design the System
Draw the high-level architecture:
flowchart TD
Client --> LB[Load Balancer]
LB --> WS1[Web Server 1]
LB --> WS2[Web Server 2]
LB --> WS3[Web Server 3]
WS1 --> Cache[(Redis Cache)]
WS2 --> Cache
WS3 --> Cache
WS1 --> DB[(Primary DB)]
WS2 --> DB
WS3 --> DB
DB --> DB_Replica[(Read Replica)]
DB_Replica --> Analytics[Analytics Pipeline]
Key components to discuss:
| Component | Purpose | Example |
|---|---|---|
| Load balancer | Distribute traffic | Nginx, HAProxy, AWS ELB |
| Web servers | Handle requests | Stateless (auto-scale) |
| Cache | Reduce database load | Redis, Memcached |
| Database | Persistent storage | PostgreSQL, Cassandra |
| CDN | Serve static content globally | CloudFront, Cloudflare |
| Message queue | Async processing | Kafka, RabbitMQ |
| Search | Full-text search | Elasticsearch |
4. Fine-Tune — Deep Dive on Components
Pick 2-3 components to discuss in depth:
Database Design
SQL vs NoSQL decision:
| Factor | SQL (PostgreSQL) | NoSQL (Cassandra, MongoDB) |
|---|---|---|
| Schema | Fixed, relational | Flexible (document, wide-column, key-value) |
| Scalability | Primarily vertical, plus read replicas | Horizontal (sharding) |
| Consistency | Strong (ACID) | Often eventual or tunable |
| Best for | Financial data, complex queries | High write throughput, IoT |
Sharding strategies:
- Hash-based: Consistent hashing on key — even distribution, hard to range query
- Range-based: Shard by key range — easy range query, uneven distribution
- Directory-based: Lookup table — flexible, single point of failure
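To make hash-based sharding concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. The class name `ConsistentHashRing`, the shard names, and the default of 100 virtual nodes per shard are all illustrative assumptions, not a reference implementation.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer (md5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Hash ring with virtual nodes; names and defaults are illustrative."""

    def __init__(self, shards: list[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: list[int] = []      # sorted hash positions on the ring
        self._owner: dict[int, str] = {}  # hash position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard: str) -> None:
        """Place `vnodes` points for this shard on the ring."""
        for i in range(self._vnodes):
            point = _hash(f"{shard}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = shard

    def shard_for(self, key: str) -> str:
        """Return the shard owning the first point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.shard_for(k) for k in keys}

ring.add_shard("shard-d")
moved = sum(1 for k in keys if ring.shard_for(k) != before[k])
print(f"{moved} of {len(keys)} keys moved after adding a shard")
```

The point to make in an interview is that adding a shard only moves the keys that now belong to the new shard, whereas a plain `hash(key) % N` scheme would remap most keys when `N` changes.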
Caching Strategy
Cache frequently accessed data to reduce database load:
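# Cache-aside pattern
import redis

# decode_responses=True makes the client return str instead of bytes
cache = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)
CACHE_TTL_SECONDS = 3600  # Expire cached entries after 1 hour

def database_lookup(short_code: str) -> str | None:
    """Placeholder for the real database query; returns None if not found."""
    raise NotImplementedError

def get_url(short_code: str) -> str | None:
    """Get original URL from cache or DB."""
    key = f"url:{short_code}"
    # 1. Try cache first
    cached = cache.get(key)
    if cached is not None:
        return cached
    # 2. Fall back to database
    original_url = database_lookup(short_code)
    # 3. Populate cache for next time
    if original_url:
        cache.setex(key, CACHE_TTL_SECONDS, original_url)
    return original_url

Cache eviction policies: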
- LRU (Least Recently Used): evict the entry that has gone longest without being accessed
- TTL (Time To Live): expire each entry a fixed time after it is written
- LFU (Least Frequently Used): evict the entry with the fewest accesses
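As an illustration of the LRU policy, the following sketch implements a small in-process LRU cache on top of `collections.OrderedDict`. The class name `LRUCache` and its `get`/`put` methods are hypothetical; production caches such as Redis approximate LRU internally rather than tracking exact order like this.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key: str):
        """Return the cached value, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        """Insert or update a value, evicting the oldest entry if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used


lru = LRUCache(capacity=2)
lru.put("a", "1")
lru.put("b", "2")
lru.get("a")       # "a" is now the most recently used entry
lru.put("c", "3")  # capacity exceeded, so "b" is evicted
```

After these calls, `lru.get("b")` returns `None` while `"a"` and `"c"` are still cached, because reading `"a"` refreshed its position before the eviction.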
Common System Design Questions
Design a URL Shortener (like bit.ly)
Core flow: Create (POST /shorten) → store mapping → Redirect (GET /{code}) → 301 redirect
Key design decisions:
- Short code generation: Base62 encoding of a unique ID (62^7 = 3.5 trillion combinations)
- Database: PostgreSQL for the mapping table (id, short_code, original_url, created_at, click_count)
- Caching: Redis cache for hot URLs (LRU + TTL)
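- Redirection: 301 (permanent) lets browsers cache the redirect and reduces server load, but cached redirects never reach the server and go uncounted; use 302 (temporary) if click tracking must be accurate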
- Analytics: Kafka + Spark for click stream processing
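The Base62 short-code step can be sketched as follows. The alphabet ordering (digits, then lowercase, then uppercase) and the function names are assumptions; any fixed ordering of the 62 characters works as long as encode and decode agree.

```python
import string

# 62 characters: 0-9, a-z, A-Z (the ordering is an arbitrary but fixed choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Decode a Base62 string back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(125))                # "21", because 125 = 2*62 + 1
print(len(encode_base62(62**7 - 1)))     # 7: the largest ID that fits in 7 characters
```

Because the code is derived from a unique ID (for example an auto-increment primary key), no collision check is needed. The tradeoff is that sequential IDs produce guessable codes.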
Design a Chat System (like WhatsApp or Slack)
Core flow: Send message → store → deliver to recipient(s)
Key design decisions:
- WebSocket for real-time bidirectional communication
- Message queue (Kafka) for reliable delivery
- Database: Cassandra for message storage (high write throughput)
- Presence: Redis with TTL for online/offline status
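- Group chats: Fan-out on write for small groups (copy each message to every member's inbox); for very large groups, store the message once and let recipients fetch it on read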
- Offline delivery: Push notifications via FCM/APNs
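The presence decision above relies on keys that expire unless the client keeps refreshing them. The sketch below imitates that behaviour in memory so it runs without a Redis server; `PresenceTracker`, the 30-second TTL, and the injectable clock are illustrative assumptions. With Redis, the heartbeat would typically be a `SET` with an expiry and the check an `EXISTS`.

```python
import time
from typing import Callable


class PresenceTracker:
    """In-memory stand-in for TTL-based presence keys."""

    def __init__(
        self,
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: dict[str, float] = {}  # user_id -> expiry time

    def heartbeat(self, user_id: str) -> None:
        """Called periodically by a connected client to stay online."""
        self._expires_at[user_id] = self._clock() + self._ttl

    def is_online(self, user_id: str) -> bool:
        """A user is online only if their last heartbeat has not expired."""
        expiry = self._expires_at.get(user_id)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires_at[user_id]  # lazily drop the expired entry
            return False
        return True
```

A client that disconnects without saying goodbye simply stops sending heartbeats and is treated as offline once the TTL passes, which avoids the need for explicit logout handling.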
Design a News Feed (like Facebook/Twitter)
Core flow: User posts → followers see it → ranked by relevance
Key design decisions:
- Fan-out on write vs fan-out on read: Twitter uses fan-out on write for regular users (pre-compute each follower's timeline) and fan-out on read for celebrities, whose posts would otherwise be copied into millions of timelines
- Ranking: ML-based relevance score (recency, affinity, engagement)
- Media storage: CDN for images/videos
- Caching: Redis sorted sets for timeline
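A hybrid fan-out strategy can be sketched in memory as follows. In this sketch, posts from ordinary accounts are pushed into follower timelines at write time, while posts from high-follower accounts are pulled and merged at read time. `FeedService`, `CELEBRITY_THRESHOLD`, and the recency-only ordering are hypothetical simplifications; a real feed would add ranking and persistent storage.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; real systems use far larger values


class FeedService:
    """Hybrid fan-out: push for ordinary authors, pull for celebrities."""

    def __init__(self) -> None:
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.timelines = defaultdict(list)  # user -> pre-computed entries
        self.posts = defaultdict(list)      # author -> their own posts
        self._seq = 0                       # logical timestamp for ordering

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def _is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author: str, text: str) -> None:
        self._seq += 1
        entry = (self._seq, author, text)
        self.posts[author].append(entry)
        if not self._is_celebrity(author):
            # Fan-out on write: copy the entry into every follower's timeline
            for follower in self.followers[author]:
                self.timelines[follower].append(entry)

    def timeline(self, user: str, limit: int = 10) -> list:
        # Start from the pre-computed timeline, then merge celebrity posts
        # at read time; the set removes duplicates if an author crossed
        # the threshold after some posts were already pushed.
        entries = set(self.timelines[user])
        for author in self.following[user]:
            if self._is_celebrity(author):
                entries.update(self.posts[author])
        return sorted(entries, reverse=True)[:limit]  # newest first
```

The write path stays cheap for accounts with huge audiences, at the cost of extra merge work on every read for users who follow them.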
Scalability Concepts
Vertical vs Horizontal Scaling
flowchart LR
subgraph "Vertical Scaling"
A[1 server] --> B[Bigger server]
end
subgraph "Horizontal Scaling"
C[1 server] --> D[10 servers]
end
| Aspect | Vertical | Horizontal |
|---|---|---|
| Approach | Bigger machine | More machines |
| Limit | Hardware cap | Theoretically unlimited |
| Cost | Rises faster than capacity at the high end | Roughly linear |
| Complexity | Low | Significant |
| Fault tolerance | Single point of failure | Redundant |
CAP Theorem
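A distributed system can guarantee at most two of these three properties at once. Because network partitions cannot be ruled out, the practical choice is between consistency and availability when a partition occurs: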
- Consistency — all nodes see the same data at the same time
- Availability — every request gets a response (not necessarily the latest data)
- Partition tolerance — system continues despite network failures
flowchart TD
CAP[CAP Theorem] --> CP[CP: Consistency + Partition]
CAP --> AP[AP: Availability + Partition]
CAP --> CA[CA: Consistency + Availability]
CA --> Note[Not realistic in distributed systems]
Real-world choices:
- CP systems: Banking, financial transactions (consistency > availability)
- AP systems: Social media feeds, DNS (availability > consistency)
- CA systems: Single-node databases (not distributed)
Load Balancing Algorithms
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Requests to each server in turn | Equal capacity servers |
| Least Connections | To server with fewest active connections | Uneven request durations |
| IP Hash | Hash client IP to server | Session stickiness |
| Weighted | Based on server capacity | Heterogeneous servers |
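Three of the algorithms in the table can be sketched in a few lines each. The class and function names here are hypothetical, and real load balancers such as Nginx or HAProxy implement these with health checks and connection tracking that this sketch omits. Weighted selection is left out for brevity.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers: list[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers: list[str]) -> None:
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        # Ties go to the server listed first
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when a request finishes so the count stays accurate."""
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to the same server on every request."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["web-1", "web-2", "web-3"]
rr = RoundRobin(servers)
print([rr.pick() for _ in range(4)])  # rotation wraps back to web-1
```

Note that `ip_hash` as written remaps most clients when the server list changes; consistent hashing is the usual answer if that matters.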
Common Design Mistakes
1. Jumping to Details Too Early
Start with the high-level architecture, then dive deep. Going straight to database schema without discussing requirements misses the big picture.
2. Ignoring Tradeoffs
Every design decision has tradeoffs. Acknowledge them: “Using NoSQL gives us write scalability but sacrifices complex queries. For this use case, the tradeoff is acceptable because…”
3. Forgetting About Monitoring and Alerting
Discuss how you’d monitor the system: latency, error rates, traffic patterns. “We’d use Prometheus for metrics and Grafana for dashboards.”
4. Over-Engineering
Don’t suggest Kafka and Kubernetes for a system serving 1,000 users/day. Start simple, then discuss the load at which each scaling measure becomes necessary.
5. Ignoring Security
Mention authentication, rate limiting, HTTPS, input validation. Security awareness is expected at senior levels.
6. Hand-Waving Numbers
Bad: “We need a lot of servers.” Good: “With 4,000 req/s and each server handling 500 req/s, we need about 8 servers. Let’s use 12 for headroom and redundancy.”
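The "good" answer is simple arithmetic, and it helps to say the formula out loud. Here is a minimal sketch; the function name and the 1.5x headroom factor are assumptions chosen to reproduce the numbers in the example.

```python
import math


def servers_needed(
    peak_qps: float, per_server_qps: float, headroom: float = 1.5
) -> tuple[int, int]:
    """Return (minimum servers, servers with headroom) for a given load."""
    minimum = math.ceil(peak_qps / per_server_qps)
    return minimum, math.ceil(minimum * headroom)


minimum, provisioned = servers_needed(peak_qps=4000, per_server_qps=500)
print(minimum, provisioned)  # 8 12
```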
7. Not Asking for Feedback
“Does that architecture address your concerns?” — asking shows collaboration and gives you a chance to address the interviewer’s focus area.
Practice Questions
1. What are the three components of the CAP theorem?
Consistency (same data everywhere), Availability (every request gets a response), Partition tolerance (works despite network failures). You can only guarantee two.
2. When would you choose NoSQL over SQL?
High write throughput, flexible schema, horizontal scaling requirements, or when data is document/graph-shaped. Avoid NoSQL when you need complex joins, ACID transactions, or strict schema.
3. What is database sharding and why is it used?
Sharding splits a database across multiple machines by some key (user ID, geographic region). It’s used when a single database can’t handle the write volume or data size.
4. How does a CDN improve system performance?
A CDN caches static content (images, CSS, JS) at edge locations worldwide. Users download from the nearest edge, which often cuts latency substantially and offloads the origin servers.
5. Challenge: Design a rate limiter that allows 100 requests per minute per user.
Use a sliding window log in Redis. Keep a sorted set of request timestamps per user. On each request, remove timestamps older than 60 seconds and count the rest. If the count is >= 100, reject; otherwise record the current timestamp. Set EXPIRE on the key so data for idle users is cleaned up.
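The sliding window approach described above can be prototyped in memory before moving it to Redis. This is a sketch under stated assumptions: `SlidingWindowRateLimiter` is a hypothetical name, a per-user `deque` stands in for the Redis timestamp list, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per user in any `window_seconds` span."""

    def __init__(
        self,
        limit: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # user_id -> timestamps, oldest first

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that are a full window old or older
        while hits and hits[0] <= now - self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False  # over the limit: reject without recording
        hits.append(now)
        return True
```

Each user costs memory proportional to the limit, which is the main drawback of the log approach. A counter-based approximation trades some accuracy for constant memory per user.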
Mini Project: System Design Template
# System Design Template
## 1. Requirements
- **Functional**: _________________________________
- **Non-functional**: _________________________________
## 2. Scale Estimates
- DAU: _______________
- QPS (reads): _______________
- QPS (writes): _______________
- Storage (1 year): _______________
## 3. Architecture
[Draw boxes and arrows]
- Clients → Load Balancer → Web Servers → Cache → Database
- CDN for static assets
- Message queue for async tasks
## 4. Database
- SQL vs NoSQL: _______________
- Schema (main tables): _______________
- Sharding key: _______________
- Indexing strategy: _______________
## 5. Key Components
- Cache: _______________
- Search: _______________
- Queue: _______________
- Analytics: _______________
## 6. Tradeoffs
- Consistency vs Availability: _______________
- Read vs Write optimization: _______________
- Cost vs Performance: _______________
## 7. Monitoring
- Metrics: _______________
- Alerts: _______________
- Dashboards: _______________
Try It Yourself
Pick one of these systems and spend 45 minutes designing it using the SCAF framework:
- Design YouTube — video upload, transcoding, streaming, recommendations
- Design Uber — ride matching, real-time tracking, pricing, payments
- Design Twitter — tweets, timeline, search, trending topics
Write your design on paper or a whiteboard. Record yourself explaining it. This is the exact format DodaTech uses for senior engineering interviews at Doda Browser and Durga Antivirus Pro.
What’s Next
Congratulations on completing this System Design tutorial! Here’s where to go from here:
- Practice daily — Consistency is more important than long study sessions
- Build a project — Apply what you learned by building something real
- Explore related topics — Check out other tutorials in the same category
- Join the community — Discuss with other learners and share your progress
Remember: every expert was once a beginner. Keep coding!
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro