Design YouTube/Netflix: Video Streaming Architecture
Designing a video streaming platform like YouTube or Netflix means building a system that ingests raw video, transcodes it into dozens of formats, stores petabytes of data, and delivers it with sub-second startup to billions of devices worldwide.
What You’ll Learn
You’ll master video transcoding pipelines, adaptive bitrate streaming (HLS/DASH), CDN edge delivery, recommendation system architecture, watch history tracking, and video search at scale.
Why This Problem Matters
YouTube has reported more than 500 hours of video uploaded every minute. Netflix has been measured at roughly 15% of global downstream internet traffic. Video streaming is the most bandwidth-intensive application on the internet, and its architecture is a masterclass in data pipelines, encoding tradeoffs, and global content delivery. At DodaTech, streaming patterns inform video playback optimization in Doda Browser.
Video Upload Pipeline
flowchart LR
Upload[Creator Upload] --> LB[Load Balancer]
LB --> Ingest[Ingestion Service]
Ingest --> Validate[Validation: Format, Size, Virus Scan]
Validate --> Queue[Job Queue - Kafka]
Queue --> T1[Transcode 4K/H.265]
Queue --> T2[Transcode 1080p/H.264]
Queue --> T3[Transcode 720p/VP9]
Queue --> T4[Generate Thumbnails]
Queue --> T5[Generate Captions]
T1 --> Storage[(Object Store - S3)]
T2 --> Storage
T3 --> Storage
T4 --> Storage
T5 --> Storage
Storage --> CDN[CDN Edge]
User[Viewer] --> CDN
Storage --> Meta[(Metadata DB)]
User --> Search[Search Service]
Search --> Index[(Elasticsearch)]
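The fan-out from the job queue in the diagram above can be sketched as follows. This is a minimal illustration, not a production design: an in-memory `queue.Queue` stands in for Kafka, and the `RENDITIONS` ladder, the `build_jobs` and `enqueue` helpers, the message fields, and the `s3://` path are all hypothetical names chosen for the example.

```python
import json
import queue

# Hypothetical rendition ladder mirroring the transcode workers in the diagram
RENDITIONS = [
    {"name": "2160p", "codec": "h265"},
    {"name": "1080p", "codec": "h264"},
    {"name": "720p", "codec": "vp9"},
]
SIDE_TASKS = ["thumbnails", "captions"]


def build_jobs(video_id: str, source_uri: str) -> list:
    """Create one independent job message per rendition and per side task."""
    jobs = []
    for r in RENDITIONS:
        jobs.append({
            "video_id": video_id,
            "source": source_uri,
            "task": "transcode",
            "rendition": r["name"],
            "codec": r["codec"],
        })
    for task in SIDE_TASKS:
        jobs.append({"video_id": video_id, "source": source_uri, "task": task})
    return jobs


def enqueue(jobs: list, q: queue.Queue) -> int:
    """Serialize each job as JSON and put it on the queue (stand-in for a Kafka produce)."""
    for job in jobs:
        q.put(json.dumps(job))
    return q.qsize()


job_queue = queue.Queue()
count = enqueue(build_jobs("vid-001", "s3://uploads/vid-001/source.mp4"), job_queue)
print(f"Enqueued {count} jobs")
```

Because each rendition is its own message, a slow 4K encode does not block the 720p output, and a failed job can be retried without redoing the others.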
Transcoding Pipeline
Raw video from creators is too large and in incompatible formats for streaming. The transcoding pipeline converts it:
| Step | Input | Output | Worker |
|---|---|---|---|
| Demux + decode | MP4 container (or other upload format) | Raw video frames + PCM audio | FFmpeg |
| Encode video | Raw frames | H.264 1080p, H.265 4K, VP9 720p | GPU/CPU farm |
| Encode audio | Raw PCM | AAC stereo, Dolby 5.1, Opus | FFmpeg |
| Package | Encoded streams | HLS/DASH segments + manifests | Packager |
| Thumbnail | Keyframes | Multiple sizes (320×180 to 1920×1080) | Image worker |
| Captions | Audio track | SRT/VTT files | ASR/ML worker |
# Simplified transcoding job
import subprocess

def transcode_video(input_path: str, output_dir: str, resolutions: list):
    """Encode one H.264/AAC MP4 per requested resolution using FFmpeg."""
    jobs = []
    for resolution in resolutions:
        height = resolution["height"]
        bitrate = resolution["bitrate"]
        output = f"{output_dir}/{height}p.mp4"
        cmd = [
            "ffmpeg", "-i", input_path,
            "-vf", f"scale=-2:{height}",  # -2 keeps the aspect ratio with an even width
            "-c:v", "libx264",
            "-b:v", bitrate,
            "-c:a", "aac",
            "-y", output
        ]
        subprocess.run(cmd, check=True)  # Raises CalledProcessError if FFmpeg fails
        jobs.append({"resolution": f"{height}p", "output": output, "bitrate": bitrate})
    return jobs

# Simulated run: prints the bitrate ladder without invoking FFmpeg
resolutions = [
    {"height": 360, "bitrate": "500k"},
    {"height": 720, "bitrate": "2500k"},
    {"height": 1080, "bitrate": "5000k"},
]
print("Transcoding jobs generated:")
for job in resolutions:
    print(f" {job['height']}p at {job['bitrate']}")

Output:
Transcoding jobs generated:
 360p at 500k
 720p at 2500k
 1080p at 5000k

Adaptive Bitrate Streaming (HLS/DASH)
Adaptive streaming lets the player switch quality mid-playback based on network conditions. The server prepares multiple quality variants and a manifest file.
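The player-side switching decision can be sketched with a simple throughput rule. This is an assumption-laden illustration: real players also weigh buffer level and switch history, and the `choose_variant` function and its `safety` margin of 0.8 are hypothetical, not taken from any specific player.

```python
def choose_variant(variants: list, throughput_bps: float, safety: float = 0.8) -> dict:
    """Pick the highest-bandwidth variant that fits within a fraction of measured throughput."""
    budget = throughput_bps * safety
    ordered = sorted(variants, key=lambda v: v["bandwidth"])
    chosen = ordered[0]  # Fall back to the lowest quality if nothing fits
    for v in ordered:
        if v["bandwidth"] <= budget:
            chosen = v
    return chosen


variants = [
    {"name": "360p", "bandwidth": 500_000},
    {"name": "720p", "bandwidth": 2_500_000},
    {"name": "1080p", "bandwidth": 5_000_000},
]

for measured in (400_000, 3_500_000, 8_000_000):
    print(measured, "->", choose_variant(variants, measured)["name"])
```

The safety margin leaves headroom so that a small dip in throughput does not immediately stall playback.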
HLS (HTTP Live Streaming)
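Apple’s HLS splits video into short segments, traditionally 5-10 second MPEG-TS (.ts) files; Apple’s current authoring guidance targets about 6 seconds, and fragmented MP4 segments are also supported. A master playlist (.m3u8) lists all variants, and each variant has its own media playlist that lists its segments: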
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360
360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
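1080p.m3u8

DASH (Dynamic Adaptive Streaming over HTTP)
DASH describes a presentation in an XML manifest called the MPD (Media Presentation Description) and typically delivers fragmented MP4 segments. It’s codec-agnostic and supports more advanced features like ad insertion and multi-angle.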
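# Generate HLS playlists: one master playlist plus one media playlist per variant.
# HLS does not allow both to be combined in a single file, so each is returned separately.
def generate_hls_manifest(variants: list, duration_seconds: int, segment_seconds: int = 10) -> dict:
    master = ["#EXTM3U"]
    playlists = {}
    # Round up so a trailing partial segment is not dropped
    segment_count = -(-duration_seconds // segment_seconds)
    for v in variants:
        master.append(f"#EXT-X-STREAM-INF:BANDWIDTH={v['bandwidth']},RESOLUTION={v['resolution']}")
        master.append(f"{v['name']}.m3u8")
        # Per-variant media playlist
        lines = [
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            f"#EXT-X-TARGETDURATION:{segment_seconds}",
            "#EXT-X-MEDIA-SEQUENCE:0",
            "#EXT-X-PLAYLIST-TYPE:VOD",
        ]
        for i in range(segment_count):
            # The last segment may be shorter than the target duration
            length = min(segment_seconds, duration_seconds - i * segment_seconds)
            lines.append(f"#EXTINF:{length:.1f},")
            lines.append(f"{v['name']}_segment_{i:04d}.ts")
        lines.append("#EXT-X-ENDLIST")
        playlists[f"{v['name']}.m3u8"] = "\n".join(lines)
    playlists["master.m3u8"] = "\n".join(master)
    return playlists

playlists = generate_hls_manifest([
    {"name": "360p", "bandwidth": 500000, "resolution": "640x360"},
    {"name": "720p", "bandwidth": 2500000, "resolution": "1280x720"},
], 60)
print(playlists["master.m3u8"])
print(playlists["360p.m3u8"][:300] + "...")

CDN for Video Delivery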
On-demand video is highly cache-friendly: segments are immutable once encoded, they are large, and popular videos are requested over and over by viewers in the same region:
| CDN Strategy | Benefit | Implementation |
|---|---|---|
| Edge caching | Serve segments from nearest PoP | Open Connect (Netflix), CloudFront |
| Origin shield | Reduce load on origin | Parent cache tier before origin |
| Pre-positioning | Push popular content to edges | ML predicts trending videos |
| P2P delivery | Peers share segments | WebRTC data channels |
Netflix’s Open Connect CDN places dedicated cache appliances inside ISP networks and at internet exchange points, reportedly serving 95%+ of traffic from cache.
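The interaction between edge caching and an origin shield from the table above can be sketched with two cache tiers in front of an origin. This is a toy model under stated assumptions: the `LRUCache`, `Origin`, and `fetch_segment` names are hypothetical, eviction is plain LRU, and real CDNs add TTLs, request coalescing, and consistent hashing.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache keyed by segment path."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # Mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # Evict the least recently used entry


class Origin:
    """Stand-in for the object store; counts how often it is hit."""

    def __init__(self):
        self.requests = 0

    def fetch(self, key):
        self.requests += 1
        return f"bytes-of-{key}"


def fetch_segment(key, edge: LRUCache, shield: LRUCache, origin: Origin):
    """Look up the edge, then the shield, then the origin; fill caches on the way back."""
    data = edge.get(key)
    if data is not None:
        return data, "edge"
    data = shield.get(key)
    if data is not None:
        edge.put(key, data)
        return data, "shield"
    data = origin.fetch(key)
    shield.put(key, data)
    edge.put(key, data)
    return data, "origin"


origin = Origin()
shield = LRUCache(capacity=100)
edges = [LRUCache(capacity=10) for _ in range(3)]  # Three PoPs share one shield

sources = [fetch_segment("vid-001/720p/0000.ts", e, shield, origin)[1] for e in edges]
print(sources, "origin requests:", origin.requests)
```

Three cold edges request the same segment, but only the first request reaches the origin; the shield absorbs the rest.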
Recommendation System
flowchart TB
User[User Watches Video] --> Event[Watch Event]
Event --> Stream[Event Stream - Kafka]
Stream --> Batch[Batch Processor - Spark]
Stream --> Online[Online Predictor - ML]
Batch --> CF[Collaborative Filtering]
Batch --> CB[Content-Based Filtering]
Batch --> Trending[Trending Detector]
CF --> Embeddings[User Embeddings]
CB --> Embeddings
Online --> Rank[Ranking Model]
Trending --> Rank
Embeddings --> Rank
Rank --> Results[Recommended Videos]
Three recommendation approaches work together:
- Collaborative filtering: Users who watched X also watched Y
- Content-based: Similar video categories, tags, descriptions
- Trending: Global and regional popular videos
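The "users who watched X also watched Y" idea can be sketched with plain co-watch counts. This is a deliberately simplified stand-in for collaborative filtering (production systems typically learn embeddings instead), and the `build_cowatch` and `recommend` helpers and the sample histories are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cowatch(histories: dict) -> dict:
    """Count how often each ordered pair of videos appears in the same user's history."""
    cowatch = defaultdict(Counter)
    for videos in histories.values():
        unique = list(dict.fromkeys(videos))  # De-duplicate while keeping order
        for a, b in permutations(unique, 2):
            cowatch[a][b] += 1
    return cowatch


def recommend(user: str, histories: dict, cowatch: dict, k: int = 3) -> list:
    """Score unseen videos by their co-watch counts with the user's watched videos."""
    seen = set(histories[user])
    scores = Counter()
    for v in histories[user]:
        for other, count in cowatch.get(v, {}).items():
            if other not in seen:
                scores[other] += count
    return [video for video, _ in scores.most_common(k)]


histories = {
    "alice": ["v1", "v2", "v3"],
    "bob": ["v1", "v2", "v4"],
    "carol": ["v2", "v4"],
    "dave": ["v1"],
}
cowatch = build_cowatch(histories)
print(recommend("dave", histories, cowatch))
```

Here `v2` ranks first for `dave` because two other users who watched `v1` also watched `v2`. In the pipeline above, scores like these would be one input to the ranking model alongside content-based and trending signals.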
Search Architecture
Video search uses Elasticsearch with specialized field mappings. Field boosts belong in the query (for example `title^3` and `description^1.5`) rather than in the mapping, because mapping-level `boost` was deprecated in Elasticsearch 5.0 and removed in 8.0:
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "description": { "type": "text" },
      "tags": { "type": "keyword" },
      "category": { "type": "keyword" },
      "view_count": { "type": "long" },
      "upload_date": { "type": "date" }
    }
  }
}
Search relevance combines text matching with popularity signals (views, watch time, recency).
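One way to combine text matching with a popularity signal at query time is a `function_score` query. The sketch below only builds the request body as a Python dictionary; the `build_search_query` helper is hypothetical, the field names follow the mapping above, and the boost values and `log1p` damping are illustrative choices rather than tuned settings.

```python
import json


def build_search_query(text: str, size: int = 10) -> dict:
    """Build a function_score query: text relevance plus a damped view-count signal."""
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": text,
                        # Caret syntax applies per-field boosts at query time
                        "fields": ["title^3", "description^1.5"],
                    }
                },
                "field_value_factor": {
                    "field": "view_count",
                    "modifier": "log1p",  # Dampen very large view counts
                    "missing": 0,
                },
                # Add the popularity value to the text score, so documents
                # with zero views keep their text relevance
                "boost_mode": "sum",
            }
        },
    }


body = build_search_query("system design interview")
print(json.dumps(body, indent=2))
```

Watch time and recency could be folded in the same way with additional scoring functions; they are omitted here to keep the example short.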
Common Errors
Fixed segment duration: Using 10s segments for all content. Live streams need shorter segments (2-4s) for lower latency. VOD can use longer segments (6-10s) for better compression.
No multi-codec strategy: Encoding only H.264 misses the compression gains of H.265 and AV1 (often cited as 40-50% smaller files at comparable quality). Serve the best codec the client supports and fall back to H.264 for everything else.
Hot video overload: When a video goes viral, origin servers get millions of simultaneous segment requests. Pre-position popular content on CDN edges and implement origin shielding.
Ignoring source aspect ratio: A 4:3 video forced into a 16:9 rendition gets stretched or badly cropped. Always validate the aspect ratio and pad (pillarbox or letterbox) instead of stretching.
Storing master files with no backup: Raw studio masters are irreplaceable. Replicate across at least two geographic regions with versioning enabled.
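The advice above to pad instead of stretching can be sketched as an FFmpeg filter string. The `scale` option `force_original_aspect_ratio=decrease` and the `pad` filter are standard FFmpeg filters; the `pad_filter` and `fitted_size` helpers are hypothetical, and `fitted_size` only approximates the size FFmpeg would choose, for illustration.

```python
def pad_filter(width: int, height: int) -> str:
    """Return a -vf string that fits the source inside width x height and pads the rest."""
    return (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,"
        "setsar=1"
    )


def fitted_size(src_w: int, src_h: int, dst_w: int, dst_h: int) -> tuple:
    """Approximate the scaled size that preserves aspect ratio, rounded down to even numbers."""
    scale = min(dst_w / src_w, dst_h / src_h)
    w = int(src_w * scale) // 2 * 2
    h = int(src_h * scale) // 2 * 2
    return w, h


# A 4:3 source placed in a 16:9 frame
w, h = fitted_size(640, 480, 1920, 1080)
bar = (1920 - w) // 2
print(f"Scaled picture: {w}x{h}, side bars: {bar}px each")

cmd = ["ffmpeg", "-i", "input.mp4", "-vf", pad_filter(1920, 1080), "-c:v", "libx264", "-y", "out.mp4"]
print(" ".join(cmd))
```

The picture keeps its 4:3 shape at 1440x1080 and the remaining width is filled with bars, so nothing is stretched or cropped.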
Mini Project
Build a video transcoding orchestrator:
import random, time

class VideoTranscoder:
    def __init__(self):
        self.jobs = []
        self.workers = 4  # Nominal pool size; this simulation runs jobs sequentially

    def submit_job(self, video_id: str, resolutions: list):
        """Queue a transcoding job for one video."""
        job = {
            "video_id": video_id,
            "resolutions": resolutions,
            "status": "queued",
            "created_at": time.time(),
        }
        self.jobs.append(job)
        return job

    def process(self):
        """Run every queued job in order and return all completed jobs."""
        for job in self.jobs:
            if job["status"] == "queued":
                job["status"] = "processing"
                for res in job["resolutions"]:
                    duration = random.uniform(5, 30)  # Pretend encode time in seconds
                    time.sleep(0.1)  # Simulate work
                    print(f"Transcoded {job['video_id']} to {res['height']}p ({duration:.0f}s)")
                job["status"] = "completed"
                job["completed_at"] = time.time()
        return [j for j in self.jobs if j["status"] == "completed"]

tx = VideoTranscoder()
tx.submit_job("vid-001", [{"height": 360}, {"height": 720}, {"height": 1080}])
tx.submit_job("vid-002", [{"height": 720}, {"height": 2160}])
completed = tx.process()
print(f"Completed {len(completed)} video jobs")