CDN Deep Dive — Edge Servers, Cache Invalidation, and Geo-Routing Explained

DodaTech Updated Jun 20, 2026 6 min read

A Content Delivery Network (CDN) is a globally distributed network of proxy servers that cache and deliver content from locations close to end users, often cutting latency by 50-80% and offloading traffic from the origin. In this deep dive, you’ll move beyond basic CDN setup to understand edge architecture, cache invalidation strategies, geo-routing mechanics, and how to choose between CloudFront, Cloudflare, and Akamai.

Why CDN Architecture Matters

Netflix has been estimated to account for roughly 15% of global downstream internet traffic, almost all of it delivered through its Open Connect CDN. A poorly configured CDN can mean 2-second load times for a Tokyo user hitting a Virginia origin. At DodaTech, CDN delivery patterns optimize asset loading in Doda Browser and ensure fast antivirus definition updates for Durga Antivirus Pro. Understanding CDN internals lets you design systems that feel local no matter where users are.

Edge Server Architecture

A CDN edge server is a reverse proxy running a high-performance HTTP cache (typically NGINX, Varnish, or a custom implementation). When a user requests content, the edge server checks its local cache. On a hit, it serves instantly. On a miss, it fetches from the origin or a parent tier.

    graph TD
    User[User Browser] -->|DNS lookup| CDN_DNS[CDN DNS]
    CDN_DNS -->|Returns nearest edge IP| Edge[Edge Server]
    Edge -->|Cache hit| User
    Edge -->|Cache miss| Parent[Parent Tier]
    Parent -->|Miss| Origin[Origin Server]
    Parent --> Edge
    Origin --> Parent
    Edge -->|Cache miss| Origin
    style Edge fill:#8e44ad,color:#fff
    style Parent fill:#3498db,color:#fff
    style Origin fill:#c0392b,color:#fff
  

Edge servers use a tiered caching hierarchy: L1 edges near users → L2 regional parents → origin. This prevents a stampede of L1 misses from hitting the origin simultaneously.
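
To make the tiered hierarchy concrete, here is a minimal in-memory sketch. The `CacheTier` class, the `Origin` counter, and the asset path are illustrative assumptions, not any CDN's real API. It only shows that misses from many L1 edges for the same object can collapse into a single origin fetch when the edges share an L2 parent.

```python
class Origin:
    """Hypothetical origin that counts how many fetches reach it."""

    def __init__(self):
        self.fetches = 0

    def get(self, key: str) -> str:
        self.fetches += 1
        return f"content-of-{key}"


class CacheTier:
    """One cache tier; on a miss it asks its upstream (a parent tier or the origin)."""

    def __init__(self, name: str, upstream):
        self.name = name
        self.upstream = upstream
        self.store = {}

    def get(self, key: str) -> str:
        if key in self.store:              # cache hit: serve locally
            return self.store[key]
        value = self.upstream.get(key)     # cache miss: go one tier up
        self.store[key] = value
        return value


origin = Origin()
parent = CacheTier("L2-regional", upstream=origin)
edges = [CacheTier(f"L1-edge-{i}", upstream=parent) for i in range(1000)]

# Every edge misses once for the same object (sequentially, for simplicity).
for edge in edges:
    edge.get("/video/intro.mp4")

origin_fetches_with_parent = origin.fetches
```

The requests here arrive one after another. Truly simultaneous misses would additionally need request coalescing at the parent, which this sketch does not model.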

Origin Pull vs Push

Origin pull is the default mode. The CDN requests content from your origin on the first cache miss. Subsequent requests for the same resource are served from the edge. This is simple to set up but means the first user to request a resource experiences origin latency.

Origin push proactively uploads content to edge servers before any user requests it. This eliminates the cold-start penalty and gives you control over what gets cached and when.
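
The difference between the two modes can be sketched with a toy simulation. The latency constants and the `Edge` class below are assumptions chosen for illustration; real numbers depend on geography and provider.

```python
ORIGIN_LATENCY_MS = 200   # assumed round trip from edge to origin
EDGE_LATENCY_MS = 20      # assumed round trip from user to edge


class Edge:
    """Hypothetical edge server backed by a dict that stands in for the origin."""

    def __init__(self, origin: dict):
        self.origin = origin
        self.cache = {}

    def push(self, path: str) -> None:
        """Origin push: copy the object to the edge ahead of any request."""
        self.cache[path] = self.origin[path]

    def request(self, path: str) -> int:
        """Serve a user request and return the simulated latency in ms."""
        if path in self.cache:
            return EDGE_LATENCY_MS
        # Origin pull: fetch on the first miss, then keep a copy
        self.cache[path] = self.origin[path]
        return EDGE_LATENCY_MS + ORIGIN_LATENCY_MS


origin = {"/app.js": b"console.log('hi')"}

pull_edge = Edge(origin)
pull_latencies = [pull_edge.request("/app.js") for _ in range(3)]

push_edge = Edge(origin)
push_edge.push("/app.js")
push_latencies = [push_edge.request("/app.js") for _ in range(3)]
```

With pull, only the first request pays the origin round trip; with push, no request does.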

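# Origin pull configuration (AWS CloudFront + S3)
import time

import boto3

client = boto3.client('cloudfront')

response = client.create_distribution(
    DistributionConfig={
        # CallerReference and Comment are required by the API
        'CallerReference': str(time.time()),
        'Comment': 'Origin pull from S3',
        'Enabled': True,
        'Origins': {
            'Quantity': 1,
            'Items': [{
                'Id': 's3-origin',
                'DomainName': 'my-bucket.s3.amazonaws.com',
                'S3OriginConfig': {'OriginAccessIdentity': ''},
            }]
        },
        'DefaultCacheBehavior': {
            'TargetOriginId': 's3-origin',
            'ViewerProtocolPolicy': 'redirect-to-https',
            # The legacy TTL fields below require ForwardedValues and MinTTL
            'ForwardedValues': {
                'QueryString': False,
                'Cookies': {'Forward': 'none'},
            },
            'MinTTL': 0,
            'DefaultTTL': 86400,   # 1 day
            'MaxTTL': 604800,      # 7 days
        },
        'PriceClass': 'PriceClass_100',
    }
)
print(f"CDN domain: {response['Distribution']['DomainName']}")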

Cache Invalidation Strategies

Invalidating cached content is one of the hardest CDN problems. Four main approaches:

Versioned filenames — the gold standard. styles.a1b2c3.css never changes content. When you update, deploy styles.d4e5f6.css. The old URL naturally expires from cache. No invalidation needed.
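
A build step usually derives the version from the file's content. The following is a minimal sketch of that idea; the `versioned_name` helper, the SHA-256 choice, and the 8-character digest length are assumptions, not a specific bundler's behavior.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return e.g. 'styles.<hash>.css' where <hash> is derived from the content."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("static/styles.css", b"body { color: black; }")
new = versioned_name("static/styles.css", b"body { color: navy; }")
same = versioned_name("static/styles.css", b"body { color: black; }")
```

Because the name changes only when the content changes, `old` and `same` are identical while `new` is a fresh URL that no cache has seen yet.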

TTL-based expiration — set Cache-Control: max-age=3600. Content automatically expires after 1 hour. The next request fetches fresh content. Simple but you can’t force immediate updates.
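
As a rough illustration of how a cache might decide whether a stored copy is still fresh, here is a small sketch. The helper names are hypothetical, and real caches also consider s-maxage, Expires, and the Age header, which this sketch ignores.

```python
import re


def max_age_seconds(cache_control: str):
    """Extract max-age from a Cache-Control header value, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """True while the cached copy is younger than max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age
```

For example, `is_fresh("public, max-age=3600", 1800)` holds, while the same header with an age of 3600 seconds does not, so the next request goes upstream.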

API-based purge: call the CDN provider’s API to invalidate specific paths. CloudFront supports CreateInvalidation. It works, but it is subject to rate limits and takes time to propagate to every edge (typically seconds to minutes, occasionally longer, depending on the provider).

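Revalidation on every request: the origin sends Cache-Control: no-cache, which tells the CDN to check with the origin before reusing a cached copy. The check is a conditional request (If-Modified-Since with Last-Modified, or If-None-Match with an ETag). If nothing changed, the origin answers 304 Not Modified with no body, which saves bandwidth without serving stale content.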
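
The revalidation exchange can be sketched from the origin's side. This is a simplified model under stated assumptions: `origin_response` and `make_etag` are hypothetical helpers, headers are a plain dict, and only the If-None-Match validator is handled.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the response body (one common choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_response(body: bytes, request_headers: dict):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""        # validator matches: no body is sent
    return 200, headers, body           # first fetch or content changed


v1 = b"<h1>Price: $10</h1>"
v2 = b"<h1>Price: $12</h1>"

status_first, headers_first, _ = origin_response(v1, {})
cached_etag = headers_first["ETag"]
status_same, _, body_same = origin_response(v1, {"If-None-Match": cached_etag})
status_changed, _, body_changed = origin_response(v2, {"If-None-Match": cached_etag})
```

The edge pays a round trip on every request, but transfers the body only when the content has actually changed.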

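# API-based cache purge for CloudFront
import time

import boto3

client = boto3.client('cloudfront')

def purge_cache(distribution_id: str, paths: list):
    # Each path must start with "/"; wildcards such as "/images/*" are allowed
    client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            'Paths': {'Quantity': len(paths), 'Items': paths},
            # CallerReference must be unique per invalidation request
            'CallerReference': str(time.time()),
        }
    )
    print(f"Purge initiated for {len(paths)} paths")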

CDN for Dynamic Content

CDNs traditionally serve static assets, but modern providers also accelerate dynamic content. Cloudflare Workers and AWS Lambda@Edge let you run code at edge locations. Dynamic content acceleration optimizes the TCP connection from edge to origin — using keep-alive, TCP optimization, and route optimization over the CDN’s private backbone.

# NGINX cache configuration for API responses
# Assumes a "my_cache" zone is defined with proxy_cache_path in the http block
location /api/ {
    proxy_pass http://origin-server;
    proxy_cache my_cache;
    proxy_cache_valid 200 60s;     # Cache 200 responses for 60s
    proxy_cache_key "$scheme$request_method$host$request_uri";
    add_header X-Cache-Status $upstream_cache_status;
}
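
To show what a short-lived cache like this does, here is a small Python model of it. The `MicroCache` class is an illustrative assumption, not NGINX's implementation; the clock is passed in explicitly so the behavior is deterministic, and the status strings mirror values that `$upstream_cache_status` can report.

```python
class MicroCache:
    """Tiny TTL cache; `now` is injected so the example is deterministic."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.entries = {}   # key -> (stored_at, body)

    @staticmethod
    def key(scheme: str, method: str, host: str, uri: str) -> str:
        # Same components, in the same order, as the proxy_cache_key above
        return f"{scheme}{method}{host}{uri}"

    def fetch(self, key: str, now: float, load):
        """Return (status, body) where status is 'HIT', 'MISS', or 'EXPIRED'."""
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return "HIT", entry[1]
        status = "MISS" if entry is None else "EXPIRED"
        body = load()                       # call the origin
        self.entries[key] = (now, body)
        return status, body


calls = []


def load():
    """Stand-in for the origin; records each call."""
    calls.append(1)
    return "payload"


cache = MicroCache(ttl_seconds=60)
k = MicroCache.key("https", "GET", "api.example.com", "/api/items?page=1")
statuses = [cache.fetch(k, now=t, load=load)[0] for t in (0, 10, 59, 61)]
```

Even a 60-second TTL means the origin sees two requests here instead of four; under heavy traffic the saving is proportionally much larger.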

Geo-Routing and Anycast

CDNs use two routing techniques:

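Anycast: the same IP address is advertised from multiple edge locations worldwide. BGP routes each user to the location that is closest in routing terms, which is usually but not always the geographically nearest one, and steers traffic elsewhere when a location stops advertising. Simple, fast, and fault-tolerant.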

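DNS-based routing: the CDN’s authoritative DNS server returns different IP addresses based on the requester’s apparent location (GeoDNS), usually inferred from the recursive resolver’s IP address or the EDNS Client Subnet option. More flexible, but failover is only as fast as DNS TTLs allow, and users of distant resolvers can be mapped to the wrong region.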
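
The core decision a GeoDNS server makes can be sketched as a nearest-neighbor lookup. The edge names and coordinates below are hypothetical, and production systems typically weigh measured latency, load, and health rather than raw distance.

```python
import math

# Hypothetical edge locations: name -> (latitude, longitude)
EDGES = {
    "tokyo": (35.68, 139.69),
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
}


def haversine_km(a, b) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_edge(client_location) -> str:
    """Pick the edge closest to the location inferred for the requester."""
    return min(EDGES, key=lambda name: haversine_km(client_location, EDGES[name]))
```

A requester located near Osaka (34.69, 135.50) would be answered with the Tokyo edge, and one near Paris (48.85, 2.35) with Frankfurt.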

CDN Provider Comparison

Feature         | CloudFront          | Cloudflare             | Akamai
----------------|---------------------|------------------------|---------------------
Edge locations  | 450+                | 310+                   | 4,100+
Pricing         | Pay-as-you-go       | Free tier available    | Enterprise contract
Dynamic content | Lambda@Edge         | Workers                | EdgeWorkers
DDoS protection | AWS Shield          | Built-in (up to Tbps)  | Proactive throttling
Custom SSL      | Free (via ACM)      | Free (Universal SSL)   | Custom certs
Origin types    | Any (S3, ALB, HTTP) | Any (HTTP, Argo)       | Any

DDoS Protection at the Edge

CDNs absorb DDoS attacks by distributing traffic across thousands of edge servers. The origin stays hidden behind the edge, provided its IP address is not leaked and it accepts traffic only from the CDN. Cloudflare has mitigated 2+ Tbps attacks. Key mechanisms:

  • Rate limiting per IP at the edge before traffic reaches origin
  • WAF rules to filter malicious patterns (SQL injection, XSS)
  • Challenge pages (CAPTCHA, JavaScript challenge) for suspicious requests
  • Connection limiting to prevent resource exhaustion
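
# Cloudflare WAF custom rule via the Rulesets API
import os

import requests

# Credentials are assumed to come from environment variables
CF_API_TOKEN = os.environ["CF_API_TOKEN"]
ZONE_ID = os.environ["ZONE_ID"]

# Custom WAF rules live in a ruleset for the http_request_firewall_custom phase.
# This creates that ruleset; if the zone already has one, update it instead.
ruleset_payload = {
    "name": "Custom WAF rules",
    "kind": "zone",
    "phase": "http_request_firewall_custom",
    "rules": [{
        "description": "Block SQL injection",
        "expression": '(http.request.uri contains "union select")',
        "action": "block",
    }],
}
headers = {"Authorization": f"Bearer {CF_API_TOKEN}", "Content-Type": "application/json"}
resp = requests.post(
    f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/rulesets",
    json=ruleset_payload,
    headers=headers,
    timeout=10,
)
resp.raise_for_status()  # surface API errors instead of reporting success
print(f"WAF rule deployed: {resp.status_code}")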
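
The per-IP rate limiting listed among the key mechanisms is often implemented as a token bucket. The sketch below is one possible in-memory version; the class name, the rate and burst values, and the injected clock are assumptions for illustration, and a real edge would share this state across workers.

```python
from collections import defaultdict


class TokenBucketLimiter:
    """Per-IP token bucket: `rate` tokens per second, up to `burst` tokens."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        # ip -> (tokens_available, last_seen_timestamp)
        self.buckets = defaultdict(lambda: (float(burst), 0.0))

    def allow(self, ip: str, now: float) -> bool:
        tokens, last = self.buckets[ip]
        # Refill based on elapsed time, capped at the burst size
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False


limiter = TokenBucketLimiter(rate=1.0, burst=3)
burst_results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(5)]
after_wait = limiter.allow("203.0.113.7", now=2.0)
other_ip = limiter.allow("198.51.100.9", now=0.0)
```

The first three requests in a burst pass and the next two are rejected; after two seconds the bucket has refilled enough to admit traffic again, and other IPs are unaffected throughout.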

Common Errors

  1. Caching dynamic user-specific content: Personalized dashboards, shopping carts, and account settings must not be served from a shared CDN cache. Use Cache-Control: private, or include the session cookie in the cache key so that each user gets a separate cache entry.

  2. No origin redundancy: If your CDN points to a single origin server and it goes down, all edge requests fail. Use multiple origins with failover or a load balancer behind the CDN.

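  3. TTL too short for static assets: Setting max-age=60 on versioned CSS/JS forces browsers and edges to revalidate or re-download the files every minute, even though their content never changes. Set immutable assets to max-age=31536000 (1 year), optionally with the immutable directive.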

  4. Ignoring cache hit ratio: A 30% cache hit ratio means 70% of requests hit the origin. Monitor this metric in CDN analytics and tune your caching rules.

  5. Cold start stampede: When a popular video is uploaded and not pre-cached, the first wave of users all trigger cache misses simultaneously. Use origin push for anticipated traffic.

  6. Not using origin shield: A parent cache tier consolidates L1 edge misses before they reach origin. Without it, 1000 edges all missing simultaneously send 1000 requests to origin.

  7. Misconfigured cross-origin requests: CDN-hosted fonts or scripts blocked by CORS. Set Access-Control-Allow-Origin: * on CDN responses for public assets.
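
Several of these errors come down to sending the right header for the right kind of resource. The following sketch shows one way to encode that policy along with the hit-ratio metric from item 4. The function names, the fingerprint pattern, and the 300-second default are assumptions, not provider defaults.

```python
import re

# Matches names such as styles.a1b2c3.css: a hex fingerprint before the extension
FINGERPRINTED = re.compile(r"\.[0-9a-f]{6,}\.(?:css|js|woff2|png|jpg|svg)$")


def cache_control_for(path: str, user_specific: bool = False) -> str:
    """Pick a Cache-Control value following the guidance in the list above."""
    if user_specific:
        return "private"                                # item 1: never share across users
    if FINGERPRINTED.search(path):
        return "public, max-age=31536000, immutable"    # item 3: cache for a year
    return "public, max-age=300"                        # assumed short default otherwise


def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of requests served from cache (item 4)."""
    total = hits + misses
    return hits / total if total else 0.0
```

For instance, `hit_ratio(30, 70)` gives 0.3, the poor ratio described in item 4, which would be a signal to review which paths fall through to the short default.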

Practice Questions

1. How does anycast routing work in CDNs?
The same IP address is announced from multiple edge locations. BGP routes each user to the nearest advertising router. If one edge fails, BGP automatically reroutes traffic.
2. What is the difference between origin pull and origin push?
Pull fetches content on demand at first request. Push proactively uploads content before requests. Pull is simpler; push eliminates cold-start latency.
3. How do you handle cache invalidation for an urgent content update?
Use versioned filenames for static content (update URL = new cache entry). For urgent changes, use the CDN’s purge API. Combine with short TTLs for non-immutable content.
4. Challenge: Design a multi-CDN strategy.
Route 70% of traffic to CloudFront and 30% to Cloudflare. Use DNS-based failover. If CloudFront’s error rate exceeds 1%, shift all traffic to Cloudflare. Monitor per-CDN latency metrics.
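
The routing rule in the challenge answer can be sketched as a weighted choice with a health check. The weights and the 1% threshold come from the answer above; the `choose_cdn` function and the error-rate inputs are hypothetical, and in practice this logic would sit in a DNS or traffic-management layer.

```python
import random

WEIGHTS = {"cloudfront": 0.7, "cloudflare": 0.3}
ERROR_THRESHOLD = 0.01   # shift traffic away above a 1% error rate


def choose_cdn(error_rates: dict, rng: random.Random) -> str:
    """Weighted choice among healthy CDNs; fall back to all if none is healthy."""
    healthy = {name: w for name, w in WEIGHTS.items()
               if error_rates.get(name, 0.0) <= ERROR_THRESHOLD}
    pool = healthy or WEIGHTS
    names = list(pool)
    return rng.choices(names, weights=[pool[n] for n in names], k=1)[0]


rng = random.Random(42)
normal = [choose_cdn({"cloudfront": 0.002, "cloudflare": 0.001}, rng)
          for _ in range(10_000)]
degraded = [choose_cdn({"cloudfront": 0.05, "cloudflare": 0.001}, rng)
            for _ in range(100)]
cloudfront_share = normal.count("cloudfront") / len(normal)
```

Under normal conditions about 70% of choices land on CloudFront; once its error rate crosses the threshold, every choice goes to Cloudflare.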

Mini Project

Build a CDN performance comparison tool:

import urllib.request
import time
import statistics

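def measure_latency(url: str, samples: int = 5) -> dict:
    times = []
    for _ in range(samples):
        start = time.perf_counter()  # monotonic clock suited to timing
        # Read the full body so the download is measured, then close the connection
        with urllib.request.urlopen(url, timeout=5) as response:
            response.read()
        times.append(time.perf_counter() - start)
    return {
        "min": min(times),
        "max": max(times),
        "avg": statistics.mean(times),
        "median": statistics.median(times)
    }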

# Compare direct origin vs CDN
direct = measure_latency("https://origin.example.com/asset.jpg")
cdn = measure_latency("https://cdn.example.com/asset.jpg")

print(f"Direct origin: avg={direct['avg']:.3f}s, median={direct['median']:.3f}s")
print(f"CDN:          avg={cdn['avg']:.3f}s, median={cdn['median']:.3f}s")
print(f"Speedup:      {direct['avg'] / cdn['avg']:.1f}x")

Expected output (varies by location):

Direct origin: avg=0.342s, median=0.338s
CDN:          avg=0.045s, median=0.042s
Speedup:      7.6x
