Caching Strategies — Write-Through, Write-Around, Write-Back Explained with Examples
Caching is the technique of storing frequently accessed data in a fast storage layer so subsequent requests can be served without querying a slower primary data source, dramatically reducing latency and load.
Why Caching Matters
Every millisecond counts. An often-cited Amazon finding is that every 100 ms of extra latency costs about 1% in revenue. A typical database query takes 10-20 ms; a Redis cache lookup usually takes under 1 ms. An effective cache can absorb around 90% of read traffic, leaving your databases to serve only the remaining 10%. Netflix serves billions of hours of video, almost entirely from cache. Without caching, many large services could not operate at their current scale.
Plain-Language Explanation
Think of a library. The main stacks (your database) hold every book, but walking through aisles to find one takes time. A librarian’s desk (your cache) keeps the 20 most requested books handy. When you ask for a book, the librarian checks the desk first — if it’s there (cache hit), you get it instantly. If not (cache miss), they walk to the stacks, grab it, and maybe leave a copy on the desk for the next person.
This is exactly how a cache works. You check fast storage first. On a miss, you fetch from the slow source, store a copy in cache, and return the result. The challenge is keeping the cache copy fresh — that’s where invalidation strategies come in.
graph LR
    Client --> LB[Load Balancer]
    LB --> App[Application Server]
    App --> Cache[(Redis Cache<br/>~1ms)]
    Cache -->|Miss| DB[(Database<br/>~10ms)]
    DB -->|Write| Cache
    App --> CDN[CDN<br/>Edge Server]
    CDN -->|Miss| App
    style Cache fill:#f39c12,color:#fff
    style DB fill:#c0392b,color:#fff
    style CDN fill:#8e44ad,color:#fff
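To make the hit/miss trade-off concrete, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes every read pays the cache lookup and only misses also pay the database query, and it plugs in the illustrative ~1 ms and ~10 ms figures from the diagram. The function name `expected_latency_ms` is made up for this example.

```python
def expected_latency_ms(hit_ratio: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when every request checks the cache first."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    # Every request pays the cache lookup; only misses also pay the database query.
    return cache_ms + miss_ratio * db_ms


for ratio in (0.0, 0.5, 0.9, 0.99):
    latency = expected_latency_ms(ratio, cache_ms=1.0, db_ms=10.0)
    print(f"hit ratio {ratio:.0%}: ~{latency:.1f} ms per read")
```

With these assumed numbers, a 90% hit ratio brings the average read to roughly 2 ms, compared with 10 ms when every read goes straight to the database.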
Cache Layers
Browser Cache
The browser stores static assets locally based on Cache-Control headers. No server round-trip needed for returning visitors.
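As a rough illustration of how a server might choose those headers, the sketch below builds `Cache-Control` values for two kinds of responses. The helper `cache_control` and the two example policies are assumptions for this example, not part of any framework; the directives themselves (`public`, `private`, `max-age`, `immutable`) are standard.

```python
def cache_control(max_age: int, public: bool = True, immutable: bool = False) -> str:
    """Build a Cache-Control header value (hypothetical helper for illustration)."""
    if max_age < 0:
        raise ValueError("max_age must be non-negative")
    directives = ["public" if public else "private", f"max-age={max_age}"]
    if immutable:
        directives.append("immutable")
    return ", ".join(directives)


# Fingerprinted assets never change under the same URL, so cache them for a year.
static_headers = {"Cache-Control": cache_control(31_536_000, immutable=True)}
# Per-user pages should stay out of shared caches and be fresh for one minute only.
page_headers = {"Cache-Control": cache_control(60, public=False)}

print(static_headers)
print(page_headers)
```

A long `max-age` only works safely when the asset URL changes whenever the content changes (for example, a hash in the filename); otherwise returning visitors keep seeing the old file.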
CDN Cache
Edge servers around the world cache static content (images, CSS, JS). A user in Tokyo gets assets from a Tokyo edge server instead of your origin in Virginia. See the Content Delivery Network article for details.
Application Cache
In-memory data stores like Redis or Memcached sit beside your application server. Session data, API responses, database query results — anything expensive to recompute.
Database Cache
Databases have their own internal caches, such as a buffer pool or a query cache. MySQL’s InnoDB buffer pool, for example, keeps frequently used index and data pages in memory. (MySQL’s separate query cache was removed in version 8.0.)
Cache Write Strategies
Write-Through
Data is written to the cache and the database synchronously, as part of the same operation. The cache always holds fresh data, but each write is slower because it involves two operations.
def write_through(cache, db, key, value):
    db.save(key, value)    # Save to database first
    cache.set(key, value)  # Then update cache
    return value

def read(cache, db, key):
    result = cache.get(key)
    if result is None:  # Cache miss
        result = db.find(key)
        if result is not None:
            cache.set(key, result)  # Only cache values that exist
    return result

Write-Around
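Data is written only to the database, and the cache is populated on the next read that misses. This is simple and keeps rarely read data from crowding the cache, but a copy that is already cached stays stale until it expires or is explicitly invalidated.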
def write_around(db, key, value):
    db.save(key, value)  # Only write to database

def read(cache, db, key):
    result = cache.get(key)
    if result is None:  # Cache miss
        result = db.find(key)
        if result is not None:
            cache.set(key, result)  # Populate cache on read
    return result

Write-Back (Write-Behind)
Data is written to cache immediately and asynchronously persisted to the database. Very fast writes, but data can be lost if the cache fails before persistence.
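import threading
import time
from collections import deque

def write_back(cache, queue, key, value):
    cache.set(key, value)       # Fast path: write to cache only
    queue.append((key, value))  # Queue for async persistence

def async_persist(queue, db):
    while True:
        if queue:
            key, value = queue.popleft()
            db.save(key, value)
        else:
            time.sleep(0.1)  # Idle briefly when nothing is pending

def start_writer(db):
    # Start background writer; returns the queue to pass to write_back()
    queue = deque()
    persist_thread = threading.Thread(target=async_persist, args=(queue, db), daemon=True)
    persist_thread.start()
    return queue

Redis Caching Example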
import redis
import json
import time

cache = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

# Simulate a slow database query
def get_user_from_db(user_id: int) -> dict:
    time.sleep(0.5)  # Simulate 500ms query
    return {"id": user_id, "name": f"User_{user_id}", "email": f"user{user_id}@example.com"}

def get_user(user_id: int) -> dict:
    cache_key = f"user:{user_id}"
    # Try cache first
    cached = cache.get(cache_key)
    if cached is not None:
        print("CACHE HIT")
        return json.loads(cached)
    print("CACHE MISS: fetching from database")
    user = get_user_from_db(user_id)
    # Store in cache with 60-second TTL
    cache.setex(cache_key, 60, json.dumps(user))
    return user

# First call: cache miss (~500ms)
start = time.time()
user1 = get_user(42)
print(f"User: {user1['name']}, Time: {time.time() - start:.2f}s")

# Second call: cache hit (~1ms)
start = time.time()
user2 = get_user(42)
print(f"User: {user2['name']}, Time: {time.time() - start:.3f}s")

Expected output (timings are approximate):
CACHE MISS: fetching from database
User: User_42, Time: 0.51s
CACHE HIT
User: User_42, Time: 0.001s

TTL (Time To Live)
Every cache entry should have a TTL. Why? Without it, stale data lives forever. With it, you get eventual consistency for free — the entry expires and the next read fetches fresh data.
Choosing a TTL is a tradeoff: a short TTL means more database hits but fresher data; a long TTL means a better cache hit rate but potentially stale data. A reasonable starting point is 60-300 seconds; adjust it based on how frequently the data changes.
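The expiry behavior can be sketched with a tiny in-memory cache. This is an illustration, not how Redis implements TTLs: the `TTLCache` class and its injectable `clock` parameter are assumptions made so that expiry can be shown without actually waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self.clock() + self.ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # lazily drop the expired entry
            return None
        return value


# Demonstrate with a fake clock instead of real waiting.
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("user:42", {"name": "User_42"})

now[0] = 59.0
print(cache.get("user:42"))  # still fresh, returns the stored dict
now[0] = 61.0
print(cache.get("user:42"))  # expired, so the caller must reload from the source
```

Note that the entry can be up to 60 seconds out of date before it expires; the TTL bounds staleness rather than eliminating it.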
Cache Eviction Policies
When the cache is full, something must be removed. Common policies:
LRU (Least Recently Used): Removes entries not accessed the longest. Great for general-purpose caching.
LFU (Least Frequently Used): Removes entries accessed the least often. Better for skewed access patterns.
FIFO (First In, First Out): Removes the oldest entry regardless of usage. Simple but less effective.
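As an illustration of the first policy, here is a minimal LRU cache built on Python's `collections.OrderedDict`. The class name `LRUCache` and its `evictions` counter are choices made for this sketch; production caches such as Redis use their own (often approximate) implementations.

```python
from collections import OrderedDict
from typing import Any, Hashable, List, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.evictions = 0
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1

    def keys(self) -> List[Hashable]:
        return list(self._data)


lru = LRUCache(capacity=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")     # touching "a" makes "b" the least recently used
lru.set("c", 3)  # cache is full, so "b" is evicted
print(lru.keys())  # ['a', 'c']
```

Under FIFO the same sequence would evict "a" instead, because "a" was inserted first regardless of the later read.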
Common Mistakes
No TTL on cache entries: Stale data accumulates. Eventually the cache becomes inconsistent with the source of truth. Always set a TTL.
Cache stampede: When a popular cache key expires and thousands of requests simultaneously hit the database. Use mutex locks or probabilistic early expiration to prevent this.
Caching everything: Not all data benefits from caching. Cache data that’s read frequently and written infrequently. Caching volatile data just adds overhead.
Ignoring serialization overhead: JSON serialization/deserialization costs time. For high-throughput paths, consider binary formats (Protocol Buffers, MessagePack).
No monitoring: You can’t optimize what you don’t measure. Track cache hit rate, miss rate, and eviction count. A falling hit rate means your cache strategy needs review.
Practice Questions
What is a cache stampede and how do you prevent it? A cache stampede happens when many requests miss the cache at the same moment (typically because a hot key just expired) and all hit the database together. Prevent it with a mutex lock around cache population, probabilistic early expiration (refreshing before the TTL expires), or a hot-standby cache.
Compare write-through and write-back caching. Write-through writes to cache and database synchronously — data is always consistent but writes are slower. Write-back writes to cache first and persists asynchronously — faster writes but risk of data loss on cache failure.
How does TTL provide eventual consistency? When the TTL expires, the cache entry is removed. The next read fetches fresh data from the source, automatically updating the cache. No explicit invalidation logic needed.
When should you use Memcached over Redis? Memcached is simpler and has lower per-key memory overhead (no persistence, no rich data structures), which makes it a good fit for plain key-value caching. Redis supports persistence, rich data types (lists, sets, sorted sets), and replication, so use Redis when you need data structures or durability.
Why is cache hit ratio not the only metric that matters? A high hit rate on stale data is worse than a lower hit rate on fresh data. Also, caching the wrong data (rarely accessed) wastes memory without improving performance.
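The first answer mentions probabilistic early expiration without showing it. The sketch below is one way the idea is often formulated: each request rolls a random "head start" and refreshes the entry if that head start reaches past the expiry time. The function name and the parameters `recompute_seconds` (how long a reload takes) and `beta` (how eagerly to refresh) are assumptions for this illustration.

```python
import math
import random
from typing import Callable


def should_refresh_early(
    now: float,
    expires_at: float,
    recompute_seconds: float,
    beta: float = 1.0,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Return True if this request should refresh the entry before it expires."""
    u = max(rng(), 1e-12)  # guard against log(0)
    # -log(u) is non-negative and usually small, occasionally large, so a few
    # requests volunteer to refresh early instead of all missing at once.
    head_start = -recompute_seconds * beta * math.log(u)
    return now + head_start >= expires_at


rng = random.Random(0).random  # seeded only so repeated runs behave the same
for seconds_left in (30, 5, 1, 0):
    refreshes = sum(
        should_refresh_early(100 - seconds_left, 100, recompute_seconds=2.0, rng=rng)
        for _ in range(10_000)
    )
    print(f"{seconds_left:>2}s before expiry: {refreshes / 10_000:.1%} of requests refresh")
```

With this formulation the chance that a given request refreshes is exp(-seconds_left / (beta * recompute_seconds)), so refreshes are rare while the entry is far from expiry and become near-certain in the last moments, which spreads the reload over a few requests instead of a crowd.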
Mini Project
Build a Fibonacci calculator that caches computed values in Redis:
import redis
import time

cache = redis.Redis(decode_responses=True)

def fib(n: int) -> int:
    cache_key = f"fib:{n}"
    cached = cache.get(cache_key)
    if cached is not None:
        return int(cached)
    if n <= 1:
        result = n
    else:
        result = fib(n - 1) + fib(n - 2)
    cache.setex(cache_key, 300, result)  # Cache for 5 minutes
    return result

# Compute Fibonacci numbers with caching
for n in [10, 20, 30, 40]:
    start = time.time()
    result = fib(n)
    elapsed = time.time() - start
    print(f"fib({n}) = {result}, computed in {elapsed:.4f}s")

print("\nSecond run: should be near-instant (cached):")
start = time.time()
result = fib(40)
print(f"fib(40) = {result}, computed in {time.time() - start:.4f}s")

Expected output (timings depend on your machine and Redis latency, shown here as <elapsed>; each first-run call only computes the values not already cached, and the second run is a single cache lookup):
fib(10) = 55, computed in <elapsed>s
fib(20) = 6765, computed in <elapsed>s
fib(30) = 832040, computed in <elapsed>s
fib(40) = 102334155, computed in <elapsed>s

Second run: should be near-instant (cached):
fib(40) = 102334155, computed in <elapsed>s

Cross-References
- System Design Overview
- Content Delivery Network
- Load Balancing
- Database Sharding
- Consistency Models