HTTP Caching: Strategies for Faster Web Applications

DodaTech Updated Jun 20, 2026 9 min read

HTTP caching is the practice of storing copies of responses temporarily so that future requests can be served faster — reducing server load, bandwidth, and latency by serving cached content instead of regenerating it.

What You’ll Learn

By the end of this tutorial, you’ll understand browser caching, CDN caching with Cloudflare and Fastly, cache tiers, strategies to optimize cache hit ratio, stale-while-revalidate, cache warming, purge strategies, and how to monitor cache performance. Prerequisites: HTTP Protocol basics and HTTP Caching Headers.

Why It Matters

In favorable cases, effective caching can reduce server load by 90% or more and improve page load times by 5–10x, while also cutting bandwidth costs. The exact gains depend on how cacheable your traffic is, but caching is often the highest-impact performance optimization available.

Real-World Use

Wikipedia serves billions of page views per month, yet most of those requests never reach its application servers: a caching layer in front of them answers the bulk of read traffic directly.

Caching Architecture


flowchart LR
  A[User Browser] --> B[CDN Edge]
  B --> C[CDN Regional]
  C --> D[Origin Server]
  D --> E[Application Cache]
  E --> F[Database]
  
  A -- "Cache Hit (90%)" --> B
  B -- "Cache Miss (10%)" --> C
  C -- "Cache Miss (2%)" --> D
  
  B -- "public, max-age=3600" --> A
  D -- "Surrogate-Control" --> C
  C -- "s-maxage" --> B

Prerequisites: HTTP Protocol fundamentals, HTTP Caching Headers, DNS and CDN basics.

Cache Tiers

Modern caching happens at multiple levels:

# Simulating a multi-tier cache
import time
import random

class MultiTierCache:
    def __init__(self):
        self.browser_cache = {}    # Tier 1: Browser
        self.cdn_cache = {}        # Tier 2: CDN edge
        self.origin_cache = {}     # Tier 3: Application

    def get(self, url, ttl_browser=300, ttl_cdn=600, ttl_origin=60):
        now = time.time()
        cache_key = url

        # Tier 1: Browser cache
        if cache_key in self.browser_cache:
            entry = self.browser_cache[cache_key]
            if now - entry['time'] < ttl_browser:
                self._log_hit("Browser", url)
                return entry['data']

        # Tier 2: CDN cache
        if cache_key in self.cdn_cache:
            entry = self.cdn_cache[cache_key]
            if now - entry['time'] < ttl_cdn:
                self.browser_cache[cache_key] = entry  # Populate browser cache
                self._log_hit("CDN", url)
                return entry['data']

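        # Tier 3: Application cache at the origin
        if cache_key in self.origin_cache:
            entry = self.origin_cache[cache_key]
            if now - entry['time'] < ttl_origin:
                self.cdn_cache[cache_key] = entry
                self.browser_cache[cache_key] = entry
                self._log_hit("Origin", url)
                return entry['data']

        # Miss at every tier: regenerate the content and populate all tiers
        data = self._fetch_from_origin(url)
        entry = {'data': data, 'time': now}
        self.origin_cache[cache_key] = entry
        self.cdn_cache[cache_key] = entry
        self.browser_cache[cache_key] = entry
        self._log_miss(url)
        return data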

    def _log_hit(self, tier, url):
        pass  # In production, log to monitoring

    def _log_miss(self, url):
        pass

    def _fetch_from_origin(self, url):
        time.sleep(0.1)  # Simulate origin processing
        return f"Content for {url} at {time.time()}"

cache = MultiTierCache()

# First request — cache miss at all tiers
url = "/api/popular-data"
print("Request 1 (cold):")
data = cache.get(url)
print(f"  Got data (origin hit)\n")

# Second request — browser cache hit
print("Request 2 (immediate):")
data = cache.get(url)
print(f"  Got data from cache\n")

Expected output:

Request 1 (cold):
  Got data (origin hit)

Request 2 (immediate):
  Got data from cache

Cache Hit Ratio Optimization

import random

class CacheMonitor:
    def __init__(self):
        self.hits = 0
        self.misses = 0

    def log_hit(self):
        self.hits += 1

    def log_miss(self):
        self.misses += 1

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total > 0 else 0

# Simulate traffic
monitor = CacheMonitor()
for i in range(1000):
    if random.random() < 0.85:  # 85% cache hit rate
        monitor.log_hit()
    else:
        monitor.log_miss()

print(f"Cache Hit Ratio: {monitor.hit_ratio():.1%}")
print(f"Hits: {monitor.hits}, Misses: {monitor.misses}")
print(f"Optimization target: >95%")

Example output (the traffic is random, so exact counts vary from run to run):

Cache Hit Ratio: 85.3%
Hits: 853, Misses: 147
Optimization target: >95%
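
Why the last few points of hit ratio matter: origin traffic scales with the miss ratio, 1 - hit ratio, so moving from 85% to 95% cuts origin requests to a third. A small sketch of that arithmetic, assuming a hypothetical 1,000 requests per second arriving at the cache:

```python
def origin_rps(total_rps, hit_ratio):
    """Requests per second that miss the cache and reach the origin."""
    return total_rps * (1 - hit_ratio)

TOTAL_RPS = 1000  # Hypothetical traffic level, chosen only for illustration

for ratio in (0.85, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: {origin_rps(TOTAL_RPS, ratio):.0f} origin req/s")
```

With these numbers the origin sees roughly 150, 50, and 10 requests per second respectively.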

Strategies to improve hit ratio:

| Strategy | Impact | Implementation |
| --- | --- | --- |
| Longer TTL | + hits | Increase max-age for stable content |
| Cache warming | + hits | Pre-fill cache during low traffic |
| Stale-while-revalidate | + perceived hits | Serve stale, refresh in background |
| Consistent URLs | + hits | Remove query params from static assets |
| Content hashing | + hits | Versioned filenames never go stale |
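
Two of these strategies, consistent URLs and content hashing, are easy to sketch with the standard library. The example below is a minimal illustration, not a production implementation: `IGNORED_PARAMS`, `normalize_cache_key`, and `hashed_filename` are hypothetical names, and the list of parameters to ignore is an assumption you would adapt to your own site.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of parameters that do not change the response
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}

def normalize_cache_key(url):
    """Drop ignored parameters and sort the rest so equivalent URLs share one cache entry."""
    parts = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k not in IGNORED_PARAMS]
    query = urlencode(sorted(params))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

def hashed_filename(name, content):
    """Embed a short content hash in the filename: main.css -> main.<hash>.css."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"

print(normalize_cache_key("https://example.com/logo.png?utm_source=mail&b=2&a=1"))
# https://example.com/logo.png?a=1&b=2
print(hashed_filename("main.css", b"body { margin: 0 }"))
```

Because the hashed filename changes whenever the content changes, the old URL can be cached with a very long TTL without ever serving outdated bytes.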

Stale-While-Revalidate

This Cache-Control extension serves stale content immediately while fetching fresh content in the background:

import threading
import time

class StaleWhileRevalidate:
    def __init__(self, max_age=60, stale_window=300):
        self.cache = {}
        self.max_age = max_age
        self.stale_window = stale_window
        self.lock = threading.Lock()
        self.refreshing = set()  # Keys with a background refresh in flight

    def _refresh(self, key, fetch_func):
        # Fetch, then timestamp the entry at the moment the fresh data arrives
        try:
            data = fetch_func()
            with self.lock:
                self.cache[key] = {'data': data, 'time': time.time()}
        finally:
            with self.lock:
                self.refreshing.discard(key)
        return data

    def get(self, key, fetch_func):
        with self.lock:
            entry = self.cache.get(key)

        if entry is None:
            # Cache miss: the caller has to wait for the origin
            print("  [MISS] Fetching fresh data...")
            return self._refresh(key, fetch_func)

        age = time.time() - entry['time']

        if age < self.max_age:
            # Fresh: serve from cache
            print(f"  [HIT] Serving fresh (age={age:.1f}s)")
            return entry['data']

        elif age < self.max_age + self.stale_window:
            # Stale but within window: serve stale now, refresh in a background thread
            print(f"  [STALE] Serving stale (age={age:.1f}s), refreshing...")
            with self.lock:
                start_refresh = key not in self.refreshing
                self.refreshing.add(key)
            if start_refresh:
                threading.Thread(target=self._refresh, args=(key, fetch_func),
                                 daemon=True).start()
            return entry['data']  # Returned without waiting for the refresh

        else:
            # Too stale: the caller has to wait for a fresh copy
            print(f"  [EXPIRED] Stale expired (age={age:.1f}s), fetching fresh...")
            return self._refresh(key, fetch_func)

def fetch_slow_data():
    time.sleep(1)  # Simulate slow origin
    return f"Data at {time.time()}"

cache = StaleWhileRevalidate(max_age=5, stale_window=10)

# First request: miss (takes about 1s, entry is stored at t=1s)
print("t=0s:")
cache.get("/data", fetch_slow_data)

time.sleep(6)  # Entry is now 6s old: past max_age, but within the stale window
print("\nt=7s (stale but valid):")
cache.get("/data", fetch_slow_data)

time.sleep(17)  # Background refresh finished at about t=8s; entry is now 16s old
print("\nt=24s (expired):")
cache.get("/data", fetch_slow_data)

Expected output (ages are approximate):

t=0s:
  [MISS] Fetching fresh data...

t=7s (stale but valid):
  [STALE] Serving stale (age=6.0s), refreshing...

t=24s (expired):
  [EXPIRED] Stale expired (age=16.0s), fetching fresh...
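
On the wire, the policy simulated above (5 seconds fresh, then a 10-second stale window) is expressed as `Cache-Control: max-age=5, stale-while-revalidate=10`. The sketch below parses such a header and classifies a response by its age. It is a simplified parser under stated assumptions: `parse_cache_control` and `classify` are hypothetical helper names, and quoted values containing commas are not handled.

```python
def parse_cache_control(header):
    """Parse a Cache-Control value into a dict of directive -> value (True for flags)."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives

def classify(age, header):
    """Return 'fresh', 'stale-while-revalidate', or 'expired' for a response of the given age."""
    d = parse_cache_control(header)
    max_age = int(d.get("max-age", 0))
    swr = int(d.get("stale-while-revalidate", 0))
    if age < max_age:
        return "fresh"
    if age < max_age + swr:
        return "stale-while-revalidate"
    return "expired"

HEADER = "public, max-age=5, stale-while-revalidate=10"
for age in (2, 6, 16):
    print(f"age={age}s -> {classify(age, HEADER)}")
# age=2s -> fresh
# age=6s -> stale-while-revalidate
# age=16s -> expired
```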

Cache Warming

Cache warming pre-populates the cache with frequently accessed content before users request it:

import time

class CacheWarmer:
    def __init__(self, cache):
        self.cache = cache
        self.popular_urls = []

    def analyze_logs(self, log_file):
        # Placeholder: a real implementation would parse log_file and rank
        # URLs by request count; the list is hard-coded for this demo
        self.popular_urls = [
            "/api/popular-page-1",
            "/api/popular-page-2",
            "/static/images/logo.png",
            "/static/css/main.css",
        ]
        print(f"Found {len(self.popular_urls)} popular URLs")

    def warm_up(self):
        print("Warming cache...")
        for url in self.popular_urls:
            # Simulate fetching each URL, then store the result in the cache
            print(f"  Warming: {url}")
            time.sleep(0.05)  # Simulate network
            self.cache[url] = {'data': f"Content for {url}", 'time': time.time()}
        print("Cache warmed!")

warmer = CacheWarmer({})  # A plain dict stands in for the real cache
warmer.analyze_logs("access.log")
warmer.warm_up()

Expected output:

Found 4 popular URLs
Warming cache...
  Warming: /api/popular-page-1
  Warming: /api/popular-page-2
  Warming: /static/images/logo.png
  Warming: /static/css/main.css
Cache warmed!
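
The log-analysis step that feeds cache warming can be sketched with `collections.Counter`. This is an illustration only: it assumes access-log lines in the common `"METHOD path protocol" status size` layout, and `popular_urls` and `SAMPLE_LOG` are hypothetical names with made-up data.

```python
from collections import Counter

def popular_urls(log_lines, top_n=3):
    """Return the top_n most requested paths among successful GET requests."""
    counts = Counter()
    for line in log_lines:
        try:
            parts = line.split('"')
            request, rest = parts[1], parts[2]   # 'GET /path HTTP/1.1', ' 200 5120'
            method, path, _protocol = request.split()
            status = rest.split()[0]
        except (IndexError, ValueError):
            continue  # Skip lines that do not match the assumed format
        if method == "GET" and status == "200":
            counts[path] += 1
    return [path for path, _ in counts.most_common(top_n)]

# Made-up sample data for illustration
SAMPLE_LOG = [
    '203.0.113.5 - - [20/Jun/2026:10:00:00 +0000] "GET /static/css/main.css HTTP/1.1" 200 5120',
    '203.0.113.6 - - [20/Jun/2026:10:00:01 +0000] "GET /static/css/main.css HTTP/1.1" 200 5120',
    '203.0.113.7 - - [20/Jun/2026:10:00:02 +0000] "GET /static/css/main.css HTTP/1.1" 200 5120',
    '203.0.113.5 - - [20/Jun/2026:10:00:03 +0000] "GET /api/popular-page-1 HTTP/1.1" 200 812',
    '203.0.113.8 - - [20/Jun/2026:10:00:04 +0000] "GET /api/popular-page-1 HTTP/1.1" 200 812',
    '203.0.113.9 - - [20/Jun/2026:10:00:05 +0000] "POST /api/login HTTP/1.1" 200 64',
    '203.0.113.9 - - [20/Jun/2026:10:00:06 +0000] "GET /missing HTTP/1.1" 404 0',
    'malformed line',
]

print(popular_urls(SAMPLE_LOG, top_n=2))
# ['/static/css/main.css', '/api/popular-page-1']
```

Only successful GET requests are counted, since warming a URL that returns an error or requires a request body would not produce a useful cache entry.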

Cache Purge Strategies

class CachePurger:
    def __init__(self):
        self.purge_methods = {
            "Single URL": "Purge /api/users/42 → removes just that resource",
            "Wildcard": "Purge /api/users/* → removes all user resources",
            "Cache Tag": "Purge tag:users → removes all tagged resources",
            "Full Purge": "Purge everything → use sparingly, extreme load",
            "Surrogate Key": "HTTP header-based purging with Surrogate-Key",
        }

    def purge(self, method, target):
        description = self.purge_methods.get(method, "Unknown method")
        print(f"Purging: {description}")
        print(f"  Target: {target}")
        print(f"  Status: {'Purge accepted' if method in self.purge_methods else 'Failed'}")

purger = CachePurger()
purger.purge("Cache Tag", "tag:users-v2")
purger.purge("Single URL", "/api/outdated-data")

Expected output:

Purging: Purge tag:users → removes all tagged resources
  Target: tag:users-v2
  Status: Purge accepted
Purging: Purge /api/users/42 → removes just that resource
  Target: /api/outdated-data
  Status: Purge accepted
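
To show what tag-based purging does internally, here is a minimal in-memory sketch. It does not call any real CDN API; `TaggedCache` and its methods are hypothetical names, and a real CDN would receive the tags through a response header such as Surrogate-Key rather than a method argument.

```python
from collections import defaultdict

class TaggedCache:
    def __init__(self):
        self.entries = {}                     # url -> cached body
        self.urls_by_tag = defaultdict(set)   # tag -> urls carrying that tag

    def store(self, url, body, tags=()):
        self.entries[url] = body
        for tag in tags:
            self.urls_by_tag[tag].add(url)

    def purge_url(self, url):
        """Remove one entry; returns the number of entries removed (0 or 1)."""
        removed = 1 if url in self.entries else 0
        self.entries.pop(url, None)
        for urls in self.urls_by_tag.values():
            urls.discard(url)
        return removed

    def purge_tag(self, tag):
        """Remove every entry stored with the given tag; returns how many were removed."""
        urls = self.urls_by_tag.pop(tag, set())
        return sum(self.purge_url(url) for url in list(urls))

cache = TaggedCache()
cache.store("/api/users/1", "alice", tags=["users"])
cache.store("/api/users/2", "bob", tags=["users"])
cache.store("/api/products/9", "widget", tags=["products"])

print(cache.purge_tag("users"))   # 2
print(sorted(cache.entries))      # ['/api/products/9']
```

The tag index is what makes this cheaper than a wildcard purge: the cache looks up the affected URLs directly instead of scanning every key.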

CDN Caching Configuration

# Conceptual CDN configuration (keys are illustrative, not actual provider API fields)
cdn_config = {
    "cloudflare": {
        "caching_level": "standard",
        "browser_cache_ttl": 14400,  # 4 hours
        "edge_cache_ttl": 86400,     # 1 day
        "always_online": True,       # Serve stale if origin down
        "cache_by_device_type": True,
        "cache_by_language": True,
    },
    "fastly": {
        "default_ttl": 3600,
        "stale_while_revalidate": 86400,
        "stale_if_error": 259200,
        "surrogate_keys": True,
        "instant_purge": True,
    }
}

for provider, config in cdn_config.items():
    print(f"{provider.upper()}:")
    for key, value in config.items():
        print(f"  {key.replace('_', ' ').title():30} {value}")
    print()

Expected output:

CLOUDFLARE:
  Caching Level                    standard
  Browser Cache Ttl                14400
  Edge Cache Ttl                   86400
  Always Online                    True
  Cache By Device Type             True
  Cache By Language                True

FASTLY:
  Default Ttl                      3600
  Stale While Revalidate           86400
  Stale If Error                   259200
  Surrogate Keys                   True
  Instant Purge                    True
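
Separate browser and edge TTLs like those above can also be requested from the origin with standard Cache-Control directives: `max-age` applies to browsers, while `s-maxage` overrides it for shared caches such as CDNs. The sketch below builds such a header; `build_cache_control` is a hypothetical helper, the 300-second browser TTL in the second call is an assumed value, and whether a given CDN honors origin headers depends on how it is configured.

```python
def build_cache_control(browser_ttl, edge_ttl,
                        stale_while_revalidate=None, stale_if_error=None):
    """Build a Cache-Control value with separate browser and shared-cache lifetimes."""
    parts = ["public", f"max-age={browser_ttl}", f"s-maxage={edge_ttl}"]
    if stale_while_revalidate is not None:
        parts.append(f"stale-while-revalidate={stale_while_revalidate}")
    if stale_if_error is not None:
        parts.append(f"stale-if-error={stale_if_error}")
    return ", ".join(parts)

# 4 hours in the browser, 1 day at the edge
print(build_cache_control(14400, 86400))
# public, max-age=14400, s-maxage=86400

# 1 hour at the edge, with stale serving during refreshes and origin errors
print(build_cache_control(300, 3600, stale_while_revalidate=86400, stale_if_error=259200))
# public, max-age=300, s-maxage=3600, stale-while-revalidate=86400, stale-if-error=259200
```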

Monitoring Cache Performance

class CachePerformanceDashboard:
    def __init__(self):
        self.metrics = {
            "cache_hit_ratio": 0.92,
            "origin_requests_per_second": 150,
            "bandwidth_savings_percent": 78,
            "avg_response_time_cached_ms": 15,
            "avg_response_time_origin_ms": 350,
            "purge_requests_per_hour": 5,
        }

    def display(self):
        print("=" * 50)
        print("CACHE PERFORMANCE DASHBOARD")
        print("=" * 50)
        for metric, value in self.metrics.items():
            name = metric.replace('_', ' ').title()
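            if 'ratio' in metric:
                # Ratios are stored as fractions (0.92 -> 92%)
                print(f"{name:35} {value:.0%}")
            elif 'percent' in metric:
                # Percent metrics are already scaled (78 -> 78%)
                print(f"{name:35} {value}%")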
            elif 'ms' in metric:
                print(f"{name:35} {value}ms")
            else:
                print(f"{name:35} {value}")
        print("=" * 50)
        improvement = (self.metrics["avg_response_time_origin_ms"] /
                      self.metrics["avg_response_time_cached_ms"])
        print(f"Cached responses are {improvement:.0f}x faster!")

dashboard = CachePerformanceDashboard()
dashboard.display()

Expected output:

==================================================
CACHE PERFORMANCE DASHBOARD
==================================================
Cache Hit Ratio                       92%
Origin Requests Per Second            150
Bandwidth Savings Percent             78%
Avg Response Time Cached Ms           15ms
Avg Response Time Origin Ms           350ms
Purge Requests Per Hour               5
==================================================
Cached responses are 23x faster!
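
The 23x figure compares a cache hit with an origin fetch. What users experience on average also depends on the hit ratio: average latency = hit ratio × cached latency + (1 - hit ratio) × origin latency. A small sketch using the dashboard's numbers (`effective_latency_ms` is a hypothetical helper name):

```python
def effective_latency_ms(hit_ratio, cached_ms, origin_ms):
    """Average response time across hits and misses."""
    return hit_ratio * cached_ms + (1 - hit_ratio) * origin_ms

for ratio in (0.92, 0.95, 0.99):
    avg = effective_latency_ms(ratio, cached_ms=15, origin_ms=350)
    print(f"hit ratio {ratio:.0%}: {avg:.1f} ms average")
```

At a 92% hit ratio the average is about 41.8 ms (0.92 × 15 + 0.08 × 350), roughly 8x better than sending every request to the origin, and each additional point of hit ratio moves the average closer to the 15 ms cached figure.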

Common Caching Errors

1. Not Setting Any Cache Headers

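When a response carries no freshness information, caches fall back to their own defaults: some store nothing, while others apply heuristic freshness (commonly derived from Last-Modified), so behavior is inconsistent and hard to predict. Always set Cache-Control explicitly.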
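
To see why missing headers lead to surprising results, consider the heuristic many caches use when a response has Last-Modified but no explicit lifetime: treat it as fresh for a fraction of the time since it was last modified. The HTTP caching specification mentions 10% as a typical fraction; the sketch below assumes that value, and `heuristic_freshness` is a hypothetical helper name.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

def heuristic_freshness(date_header, last_modified_header, fraction=0.1):
    """Estimate a freshness lifetime as a fraction of the time since last modification."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    return max(timedelta(0), (date - last_modified) * fraction)

# A file last modified 10 days before it was served
lifetime = heuristic_freshness(
    "Sat, 20 Jun 2026 12:00:00 GMT",
    "Wed, 10 Jun 2026 12:00:00 GMT",
)
print(lifetime)
```

A file untouched for 10 days may therefore be reused for about a day without revalidation, even though the server never asked for that.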

2. Over-Caching Dynamic Content

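Long cache lifetimes on user-specific or frequently changing data cause users to see stale content, or even content meant for someone else if a shared cache stores it. Use private or no-store for user-specific responses, and no-cache or a short max-age for data that changes often.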

3. Cache Poisoning

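An attacker sends a request whose unkeyed inputs (a header or query parameter that changes the response but is not part of the cache key) cause a malicious response to be stored and then served to other users. Make sure every input that affects the response is part of the cache key (for example via Vary), ignore inputs you do not need, and validate input before caching.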
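
The role of the cache key can be sketched as follows. This is a simplified model, not how any particular cache builds its keys; `cache_key` is a hypothetical function, and X-Forwarded-Host is used only as an example of a header that some applications reflect into responses.

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the method, URL, and the request headers named in Vary."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    varied = [
        (name.strip().lower(), headers.get(name.strip().lower(), ""))
        for name in vary.split(",") if name.strip()
    ]
    return (method.upper(), url, tuple(sorted(varied)))

normal = {"Host": "example.com"}
attack = {"Host": "example.com", "X-Forwarded-Host": "evil.example"}

# Header not in the key: the attacker's response would be served to everyone
print(cache_key("GET", "/home", normal, "") == cache_key("GET", "/home", attack, ""))
# True

# Header in the key: the two requests get separate cache entries
print(cache_key("GET", "/home", normal, "X-Forwarded-Host")
      == cache_key("GET", "/home", attack, "X-Forwarded-Host"))
# False
```

Keying on the header contains the damage, but the more robust fix is usually to stop the header from influencing the response at all.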

4. Purge Storms

Purging everything at once overwhelms the origin server, because every evicted object is re-requested at the same time. Prefer targeted purges (single URL or tag), and when a large purge is unavoidable, roll it out gradually.
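
One way to purge gradually is to work through the URL list in batches with a pause between them, giving the origin time to refill each batch before the next is evicted. A minimal sketch, assuming a hypothetical `purge_fn` callback that purges a single URL (for example, a wrapper around your CDN's purge API); the batch size and pause are placeholders to tune:

```python
import time

def purge_in_batches(urls, purge_fn, batch_size=100, pause_seconds=1.0, sleep=time.sleep):
    """Purge urls in fixed-size batches, pausing between batches. Returns the batch count."""
    batches = 0
    for start in range(0, len(urls), batch_size):
        if start:
            sleep(pause_seconds)  # Let the origin absorb refills from the previous batch
        for url in urls[start:start + batch_size]:
            purge_fn(url)
        batches += 1
    return batches

# Demo with a list standing in for the real purge call, and no pause
purged = []
count = purge_in_batches([f"/page/{i}" for i in range(250)], purged.append,
                         batch_size=100, pause_seconds=0)
print(count, len(purged))  # 3 250
```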

5. Ignoring Cache on Mobile

Mobile users on slow connections benefit most from caching. Ensure your mobile API responses have appropriate Cache-Control headers.

Practice Questions

1. What are the three main tiers of HTTP caching?
Browser cache (user’s device), CDN cache (edge servers), and application cache (origin server). Each has different TTLs and purposes.

2. How does stale-while-revalidate improve performance?
It serves cached (slightly stale) content immediately while fetching a fresh version in the background, so the user doesn’t wait for the refresh.

3. What’s cache warming and why do it?
Pre-filling the cache with popular content before users request it. Prevents cache misses during traffic spikes and improves initial user experience.

4. How do you monitor cache hit ratio?
Track hits vs misses at each cache tier. Use CDN analytics, server logs, or application metrics. Target >95% hit ratio for static assets.

5. Challenge: Optimize a slow web application
Find a web application with poor caching. Implement multi-tier caching, configure CDN rules, add stale-while-revalidate, set up cache warming, and measure the improvement.

FAQ

What's a good cache hit ratio?
For static assets (images, CSS, JS): >95%. For API responses: >80% is good, >90% is excellent. For dynamic user-specific data: caching may not apply.
How do I invalidate CDN cache?
Use your CDN’s purge API: single URL purge, wildcard purge, or tag-based purge. Major CDNs such as Cloudflare, Fastly, and Akamai offer purges that take effect within seconds.
Should I cache API responses?
Yes, but strategically. Cache GET endpoints that return the same data to many users. Use short TTLs (30–300s) or ETags with stale-while-revalidate.
Does HTTPS affect caching?
Not for browsers, or for CDNs that terminate TLS: both cache HTTPS responses as efficiently as HTTP. Intermediaries that cannot decrypt the traffic cannot cache it. Cache-Control and validators such as ETag, not the URL scheme, determine caching behavior.
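
The ETag approach mentioned for API responses works through conditional requests: the client sends the ETag it holds in If-None-Match, and the server answers 304 Not Modified with no body if the content is unchanged. A minimal server-side sketch follows; `make_etag` and `respond` are hypothetical names, the Cache-Control value is an assumed example, and the comparison is simplified (a real If-None-Match header may list several ETags or `*`).

```python
import hashlib

def make_etag(body):
    """Derive a strong ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, request_headers):
    """Return (status, headers, body), answering 304 when the client's validator still matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60, stale-while-revalidate=300"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""   # Client copy is still valid; send no body
    return 200, headers, body

body = b'{"items": [1, 2, 3]}'

status, headers, _ = respond(body, {})
print(status)                      # 200

status, _, payload = respond(body, {"If-None-Match": headers["ETag"]})
print(status, len(payload))        # 304 0
```

A 304 still costs a round trip, but it saves the bandwidth of resending the body, which is why ETags pair well with short TTLs.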

Mini Project: Cache Performance Analyzer

Build a Python script that fetches a URL, analyzes all caching headers (Cache-Control, ETag, Expires, Age, CF-Cache-Status), determines the cache tier, and calculates the potential performance improvement. Security angle: Durga Antivirus Pro’s update distribution uses a multi-tier caching strategy — the update manifest is cached at CDN edges with long TTLs, while signature updates use shorter TTLs to ensure timely delivery of new threat definitions.

What’s Next

Congratulations on completing this HTTP Caching tutorial! Here’s where to go from here:

  • Practice daily — Audit caching on websites you use
  • Build a project — Implement a multi-tier caching strategy for your own app
  • Explore related topics — Check out DNS and CDN and Web Security fundamentals

Remember: every expert was once a beginner. Keep coding!

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro