Backend Interview Guide: API Design, Databases, Scalability, and Microservices
Backend engineering interviews assess your ability to design, build, and operate server-side systems — covering API design, database architecture, scalability, distributed systems concepts, and operational excellence.
What You’ll Learn
- REST and GraphQL API design principles and common interview questions
- Database selection: SQL vs NoSQL, indexing, query optimization
- Scalability strategies: horizontal scaling, caching, CDNs, and load balancing
- Microservices architecture: service decomposition, communication patterns
- Distributed systems: consistency models, CAP theorem, message queues
Why Backend Interview Prep Matters
Backend interviews at companies like Google, Stripe, and Uber focus on system-level thinking — how to design APIs that scale, choose the right database, and build systems that handle millions of requests without breaking. These questions test your understanding of tradeoffs between consistency, availability, and performance.
Durga Antivirus Pro processes millions of threat intelligence requests per day — every backend engineer must understand database sharding, caching strategies, and API rate limiting to keep the service responsive.
Learning Path
flowchart LR
A[Coding Interview Prep] --> B[Backend Interview Guide<br/>You are here]
B --> C[System Design]
C --> D[Distributed Systems]
D --> E[Behavioral Prep]
style B fill:#f90,color:#fff
API Design
RESTful API Design
# Good REST API design
GET /api/users # List users
POST /api/users # Create user
GET /api/users/:id # Get user by ID
PUT /api/users/:id # Replace user
PATCH /api/users/:id # Partial update
DELETE /api/users/:id # Delete user
GET /api/users/:id/orders # User's orders
Pagination
# Cursor-based pagination (preferred for large datasets)
GET /api/users?cursor=abc123&limit=20
# Response
{
"data": [...],
"next_cursor": "def456",
"has_more": true
}
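A minimal sketch of how a server might build the cursor-based response shown above. It assumes an in-memory list standing in for a table sorted by `id`, and an opaque cursor that is simply the base64-encoded last `id` returned. The names `USERS`, `encode_cursor`, `decode_cursor`, and `list_users` are hypothetical and chosen only for illustration.

```python
import base64
from typing import Optional

# Stand-in for a users table that is already sorted by id (assumption)
USERS = [{"id": i, "name": f"user{i}"} for i in range(1, 51)]

def encode_cursor(last_id: int) -> str:
    """Hide the raw id behind an opaque, URL-safe token."""
    return base64.urlsafe_b64encode(str(last_id).encode()).decode()

def decode_cursor(cursor: str) -> int:
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())

def list_users(cursor: Optional[str] = None, limit: int = 20) -> dict:
    last_id = decode_cursor(cursor) if cursor else 0
    # Roughly: SELECT ... WHERE id > :last_id ORDER BY id LIMIT :limit + 1
    rows = [u for u in USERS if u["id"] > last_id][: limit + 1]
    has_more = len(rows) > limit  # the extra row only signals another page
    page = rows[:limit]
    return {
        "data": page,
        "next_cursor": encode_cursor(page[-1]["id"]) if has_more else None,
        "has_more": has_more,
    }
```

Because each page starts strictly after the last `id` seen, rows inserted or deleted between requests do not shift the page boundaries, which is the usual reason cursors are preferred over offsets for large or fast-changing datasets.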
# Offset-based pagination (simpler, but less reliable)
GET /api/users?page=2&limit=20
Rate Limiting
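# Token bucket algorithm
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum burst size
        self.tokens = capacity          # start with a full bucket
        self.refill_rate = refill_rate  # tokens per second
        # Monotonic clock: unaffected by system clock adjustments
        self.last_refill = time.monotonic()

    def allow_request(self):
        # Refill in proportion to the time elapsed since the last call
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(
            self.capacity,
            self.tokens + elapsed * self.refill_rate
        )
        self.last_refill = now
        # Spend one token if available; otherwise reject the request
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
Database Design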
SQL vs NoSQL
| Factor | SQL | NoSQL |
|---|---|---|
| Schema | Fixed, predefined | Flexible, dynamic |
| Relationships | Foreign keys, JOINs | Embedded docs, references |
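| Consistency | Typically ACID transactions | Often eventual (BASE); some stores offer tunable or strong consistency |
| Scaling | Vertical (usually); read replicas and sharding are possible but add complexity | Horizontal (usually built-in) |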
| Query | Complex JOINs, aggregations | Simple key-based, map-reduce |
| Best for | Structured data, transactions | Unstructured data, high write volume |
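To make the "transactions" entry in the table concrete, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` and `balances` helpers are hypothetical; the point is only that both updates are committed together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was rolled back too
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A transfer that would overdraw the source account leaves both balances untouched, even though the credit to the destination ran first. Getting the same guarantee across documents or partitions in a NoSQL store often requires extra design work, which is one of the tradeoffs worth naming in an interview.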
Indexing
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index — column order matters
CREATE INDEX idx_users_status_created
ON users(status, created_at);
-- Query that uses the composite index
SELECT * FROM users
WHERE status = 'active'
AND created_at > '2024-01-01';
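-- Predicate order inside WHERE does not matter; the optimizer reorders it.
-- What matters is whether the leading index column (status) is filtered.
-- Query that CANNOT use the index efficiently: it skips the leading column
SELECT * FROM users
WHERE created_at > '2024-01-01';
N+1 Query Problem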
# BAD: N+1 queries (one for the list, N for individual items)
users = User.objects.all()
for user in users:
    print(user.profile.bio)  # Hits DB once per user

# GOOD: Eager loading
users = User.objects.select_related('profile').all()
for user in users:
    print(user.profile.bio)  # One query with JOIN
Scalability
Horizontal vs Vertical Scaling
flowchart LR
subgraph Vertical
V1[Single Server] --> V2[Bigger Server]
V2 --> V3[Even Bigger Server]
end
subgraph Horizontal
H1[Single Server] --> H2[Server + Server]
H2 --> H3[Server + Server + Server]
end
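Horizontal scaling only helps if requests are spread across the servers. As a rough sketch of the simplest strategy, round-robin, the hypothetical `RoundRobinBalancer` below hands out servers in turn. Production load balancers typically add health checks, weighting, and connection awareness, none of which is modeled here.

```python
import itertools

class RoundRobinBalancer:
    """Cycle through a fixed list of servers, one per request."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end
        return next(self._cycle)
```

Usage: `RoundRobinBalancer(["app-1", "app-2", "app-3"]).next_server()` returns `"app-1"` on the first call, then `"app-2"`, and so on. The server names are placeholders.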
Caching Strategies
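# Cache-aside pattern
# `cache` and `db` are assumed to be pre-configured clients
def get_user(user_id):
    # Try cache first
    user = cache.get(f"user:{user_id}")
    if user is not None:
        return user
    # Cache miss: get from database
    user = db.query(User).filter_by(id=user_id).first()
    # Store in cache (skip missing users so None is never cached)
    if user is not None:
        cache.set(f"user:{user_id}", user, ttl=3600)
    return user

# Write-through cache
def update_user(user_id, data):
    # Write to database
    db.query(User).filter_by(id=user_id).update(data)
    db.commit()
    # Re-read the full record so the cache holds the same shape that
    # get_user stores, not just the changed fields
    user = db.query(User).filter_by(id=user_id).first()
    cache.set(f"user:{user_id}", user, ttl=3600)
CDN and Edge Caching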
User → CDN Edge (cache hit → serve)
↓ (cache miss)
Origin Server → Database
Microservices
Service Decomposition
Before: Monolith
┌────────────────────────┐
│ User Auth Orders │
│ Payments Notifications│
│ Search Recommendations│
└────────────────────────┘
After: Microservices
User Service ─┐
Auth Service ─┼── API Gateway → Client
Order Service ─┤
Payment Service─┘
Communication Patterns
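# Synchronous (HTTP/REST): simple but creates coupling
import uuid
import requests

def create_order(user_id, items):
    # Assumes each item is a dict with "price" and "quantity" keys
    total = sum(item["price"] * item["quantity"] for item in items)
    user = requests.get(f"http://user-service/users/{user_id}", timeout=5)
    user.raise_for_status()  # fail fast if the user lookup fails
    payment = requests.post(
        "http://payment-service/charges",
        json={"user": user_id, "amount": total},
        timeout=5,
    )
    payment.raise_for_status()
    order_id = str(uuid.uuid4())  # placeholder ID; a real service would persist the order
    return {"order_id": order_id, "status": "created"}

# Asynchronous (Message Queue): decoupled, resilient
# `queue` is an assumed message-broker client exposing publish(topic, message)
def create_order_async(user_id, items):
    message = {"user_id": user_id, "items": items}
    queue.publish("orders.create", message)
    return {"status": "processing"}
Distributed Systems Concepts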
CAP Theorem
Consistency (C) — Every read returns the most recent write
Availability (A) — Every request gets a response (not necessarily latest data)
Partition Tolerance (P) — System continues despite network failures
CP systems: Traditional databases with synchronous replication, ZooKeeper, etcd (choose consistency over availability during a partition)
AP systems: DNS, CDN (choose availability over consistency)
CA systems: Single-node databases (not achievable in distributed systems, which must tolerate partitions)
Consistent Hashing
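# Consistent hashing for distributing data across nodes
# Minimizes rebalancing when nodes are added or removed
import bisect
import hashlib

def stable_hash(value):
    # Built-in hash() is salted per process for strings, so use a digest
    # that yields the same ring position on every machine
    return int(hashlib.md5(str(value).encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, replicas=3):
        self.replicas = replicas  # virtual nodes per physical node
        self.ring = {}
        self.sorted_keys = []
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            key = stable_hash(f"{node}:{i}")
            self.ring[key] = node
            bisect.insort(self.sorted_keys, key)  # keep keys sorted

    def get_node(self, key):
        if not self.ring:
            return None
        hash_key = stable_hash(key)
        # First ring position at or after the key's hash, wrapping to the start
        idx = bisect.bisect_left(self.sorted_keys, hash_key)
        if idx == len(self.sorted_keys):
            idx = 0
        return self.ring[self.sorted_keys[idx]]
Common Backend Interview Mistakes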
1. Ignoring Database Design
Focusing only on application logic without considering data modeling.
Fix: Always discuss schema design, indexing strategy, and query patterns.
2. Not Discussing Tradeoffs
Every design decision has tradeoffs. Ignoring them shows inexperience.
Fix: “I’d use a NoSQL database here because we need flexible schemas, but we’ll lose the ability to do complex JOINs.”
3. Over-Engineering
Starting with microservices, Kubernetes, and event sourcing for a simple CRUD app.
Fix: Start simple. Discuss how you’d evolve the system as requirements grow.
4. Ignoring Security
Not mentioning authentication, authorization, input validation, or rate limiting.
Fix: Always discuss security: “I’d add rate limiting here to prevent abuse and validate all inputs.”
5. No Monitoring or Observability
Designing the system but not mentioning how you’d know if it’s working.
Fix: “I’d add structured logging, metrics, and distributed tracing to understand system behavior.”
6. Forgetting Error Handling
Assuming everything always works — network calls succeed, databases respond.
Fix: “I’d implement retry with exponential backoff, circuit breakers, and graceful degradation.”
7. Tight Coupling
Services that depend on each other’s internal data structures or schemas.
Fix: “Services communicate through well-defined APIs. Internal changes don’t affect consumers.”
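The retry advice in mistake 6 can be sketched as follows. This is a minimal illustration, not a production library: the `retry_with_backoff` helper, its parameter names, and the default exception types are assumptions made for the example. It doubles the delay ceiling on each failed attempt and sleeps for a random fraction of it (jitter), so that many clients do not retry in lockstep.

```python
import random
import time

def retry_with_backoff(func, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call func(); on a retryable error, wait and try again.

    The delay ceiling grows as base_delay * 2**attempt, capped at max_delay.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))  # jitter spreads out retries
```

The `sleep` parameter exists only so the waiting can be replaced in tests. Retries are safe only for idempotent operations; a payment call, for example, would usually need an idempotency key before it can be retried blindly.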
Practice Questions
1. What is the difference between REST and GraphQL?
REST has fixed endpoints returning fixed data structures. GraphQL has a single endpoint where clients specify exactly what data they need. GraphQL is better for complex data requirements; REST is simpler and more cacheable.
2. When would you choose NoSQL over SQL?
When you need flexible schemas, horizontal scaling, high write throughput, or are storing unstructured data. SQL is better for complex queries, transactions, and data with clear relationships.
3. What is the N+1 query problem and how do you fix it?
Loading a list of entities and then loading a related entity for each one. Fix with eager loading (JOINs, select_related) or batch loading (DataLoader).
4. What is the CAP theorem?
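States that when a network partition occurs, a distributed system must choose between Consistency (all nodes see the same data) and Availability (every request gets a response). Partition Tolerance (the system works despite network failures) is not optional for a distributed system, so the practical choice is between CP and AP. This is often summarized as "pick two of three."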
5. How would you design a rate limiter?
Use token bucket or sliding window algorithm. Store counters in Redis for distributed rate limiting. Return 429 Too Many Requests when exceeded.
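As a sketch of the sliding window option from question 5, the hypothetical `SlidingWindowLimiter` below keeps a per-client log of request timestamps in memory and returns the status code a handler might send. It is a single-process illustration only; a distributed limiter would keep this state in a shared store such as Redis, as the answer above suggests.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests can control time
        self.hits = defaultdict(deque)  # client_id -> accepted timestamps

    def check(self, client_id):
        """Return 200 if the request is allowed, 429 if over the limit."""
        now = self.clock()
        log = self.hits[client_id]
        # Drop timestamps that have fallen out of the window
        while log and now - log[0] >= self.window:
            log.popleft()
        if len(log) >= self.limit:
            return 429  # Too Many Requests
        log.append(now)
        return 200
```

Compared with a token bucket, the log-based window enforces the limit exactly but stores one timestamp per accepted request, so memory grows with the limit.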
Challenge: Design a URL shortening service. Cover API design, database schema (SQL or NoSQL), caching strategy, rate limiting, and scaling considerations. Discuss tradeoffs at each decision point.
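One possible building block for this challenge, offered as a starting point rather than a full answer: derive the short code by base62-encoding a numeric ID (for example, an auto-incrementing primary key). The function names and the alphabet ordering below are assumptions for illustration.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n):
    """Turn a non-negative integer ID into a short code."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

def decode_base62(code):
    """Recover the integer ID from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Sequential IDs make codes guessable, so a tradeoff to discuss is whether to use random or hashed codes instead, at the cost of handling collisions.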
What’s Next
| Tutorial | What You’ll Learn |
|---|---|
| System Design Interview Prep | Designing large-scale distributed systems |
| Distributed Systems Guide | Advanced distributed systems concepts |
| Behavioral Interview Prep | Behavioral questions and storytelling |
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.