Learn Web: NGINX Performance Tuning — Worker Processes, Buffering, Caching, SSL, and Kernel Optimization

DodaTech Updated Jun 20, 2026 8 min read

NGINX is known for high performance, but default configurations rarely deliver optimal throughput. This guide covers every tuning dimension: worker process affinity, sendfile and direct I/O, proxy buffering and caching, SSL session caching, OCSP stapling, gzip micro-tuning, connection pooling, and Linux kernel parameters for maximum web server performance.

What You’ll Learn

You’ll tune NGINX worker processes and connections, configure sendfile for zero-copy file serving and aio for large files, optimize proxy buffering and caching layers, accelerate SSL with session caches and OCSP stapling, fine-tune gzip compression levels, and adjust Linux kernel parameters to handle 10,000+ concurrent connections. Durga Antivirus Pro uses these tuning techniques to serve signature updates to millions of clients worldwide.

NGINX Performance Tuning

    flowchart LR
  A[NGINX Basics] --> B[Worker Tuning]
  B --> C[File Serving]
  C --> D[Proxy & Buffering]
  D --> E[SSL Optimization]
  E --> F[Kernel Tuning]
  F --> G[NGINX Tuning<br/>You are here]
  style G fill:#f90,color:#fff

Worker Processes and Connections

# /etc/nginx/nginx.conf
user www-data;
worker_processes auto;          # one per CPU core
worker_cpu_affinity auto;       # pin workers to cores
worker_rlimit_nofile 65535;     # max open files per worker

events {
    worker_connections 4096;     # max connections per worker
    multi_accept on;             # accept all new connections at once
    use epoll;                   # Linux 2.6+ efficient polling
}

Calculate Max Connections

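# Max concurrent connections = worker_processes × worker_connections
# Example: 8 workers × 4096 connections = 32,768 concurrent connections
# worker_connections also counts upstream connections, so as a reverse proxy
# each client uses two: 32,768 / 2 = 16,384 concurrent clients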

# Check current limits
ulimit -n
cat /proc/sys/fs/file-max
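
The arithmetic above can be sketched as a small shell helper. The function name max_clients and its static/proxy argument are illustrative, not part of NGINX; the proxy mode assumes every client request holds exactly one upstream connection.

```shell
# Estimate client capacity from the worker settings.
# usage: max_clients <worker_processes> <worker_connections> [static|proxy]
max_clients() {
    workers=$1
    connections=$2
    mode=${3:-static}
    total=$((workers * connections))
    if [ "$mode" = "proxy" ]; then
        # Assumption: each proxied client also holds one upstream connection.
        total=$((total / 2))
    fi
    echo "$total"
}

max_clients 8 4096 static   # prints 32768
max_clients 8 4096 proxy    # prints 16384
```

Treat the result as an upper bound: the file descriptor limit (ulimit -n, worker_rlimit_nofile) can cap the real figure earlier.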

Static File Serving

# /etc/nginx/sites-available/static-tuned
server {
    listen 80;
    server_name static.example.com;
    root /var/www/static;
    index index.html;

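    # Zero-copy file serving
    sendfile on;
    sendfile_max_chunk 512k;     # cap data per sendfile() call so one fast client cannot monopolize a worker
    tcp_nopush on;               # with sendfile: send response headers and file start in full packets
    tcp_nodelay on;              # disable Nagle's algorithm on keep-alive connections

    # Asynchronous I/O for large files
    aio on;                      # on Linux, aio only applies to files read with directio
    directio 4m;                 # direct I/O (bypasses page cache and sendfile) for files of 4 MB or more
    output_buffers 8 32k;        # 8 buffers of 32 KB each for reads that do not use sendfile
    postpone_output 1460;        # hold output until 1460 bytes (about one TCP segment) are ready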

    # Cache open file descriptors
    open_file_cache max=10000 inactive=60s;
    open_file_cache_valid 120s;
    open_file_cache_min_uses 2;
    open_file_cache_errors off;

    # Browser cache headers
    location ~* \.(css|js|png|jpg|jpeg|gif|ico|svg|webp)$ {
        expires 30d;
        add_header Cache-Control "public, immutable";
        access_log off;           # skip logging for static assets
        log_not_found off;
    }

    location ~* \.(woff2?|eot|ttf|otf)$ {
        expires 365d;
        add_header Cache-Control "public, immutable";
        access_log off;
    }
}

Proxy Buffering and Caching

# /etc/nginx/nginx.conf
# Proxy cache zone (its own directory, so it does not contain the FastCGI cache below)
proxy_cache_path /var/cache/nginx/proxy levels=1:2
    keys_zone=api_cache:100m
    inactive=60m
    max_size=10g
    use_temp_path=off;

# FastCGI cache for PHP
fastcgi_cache_path /var/cache/nginx/fastcgi levels=1:2
    keys_zone=php_cache:50m
    inactive=30m
    max_size=2g;

server {
    listen 80;
    server_name api.example.com;

    # Proxy buffering
    proxy_buffering on;
    proxy_buffer_size 4k;
    proxy_buffers 8 8k;
    proxy_busy_buffers_size 16k;
    proxy_max_temp_file_size 0;   # never spill responses to temp files; overflow is relayed to the client synchronously

    # Proxy timeouts
    proxy_connect_timeout 10s;
    proxy_send_timeout 30s;
    proxy_read_timeout 30s;

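    # Caching
    location /api/ {
        proxy_cache api_cache;
        proxy_cache_key "$scheme$request_method$host$request_uri";
        proxy_cache_valid 200 5m;
        proxy_cache_valid 404 1m;
        proxy_cache_valid any 1m;
        proxy_cache_use_stale error timeout updating;
        proxy_cache_background_update on;
        proxy_cache_lock on;
        proxy_cache_lock_age 5s;

        # Skip the cached copy when the client sends a Cache-Control header
        # (any client can trigger this, so restrict it in production)
        proxy_cache_bypass $http_cache_control;

        # Expose the cache status (MISS, HIT, BYPASS, ...) for testing
        add_header X-Proxy-Cache $upstream_cache_status;

        proxy_pass http://backend:3000;
    }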

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_cache php_cache;
        fastcgi_cache_key "$scheme$request_method$host$request_uri";
        fastcgi_cache_valid 200 5m;
        fastcgi_pass unix:/var/run/php/php8.3-fpm.sock;
    }
}

Cache Testing

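# Requires "add_header X-Proxy-Cache $upstream_cache_status;" in the cached location

# First request: cache miss
curl -I http://api.example.com/api/users
# X-Proxy-Cache: MISS

# Second request: cache hit
curl -I http://api.example.com/api/users
# X-Proxy-Cache: HIT

# Bypass cache (requires "proxy_cache_bypass $http_cache_control;")
curl -I http://api.example.com/api/users -H "Cache-Control: no-cache"
# X-Proxy-Cache: BYPASS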
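
To script the check instead of reading headers by eye, a minimal sketch follows. It assumes the response carries an X-Proxy-Cache header filled from $upstream_cache_status; the function name cache_status is illustrative.

```shell
# Print the value of the X-Proxy-Cache response header read from stdin.
# usage: curl -sI http://api.example.com/api/users | cache_status
cache_status() {
    # Header names are case-insensitive; strip the trailing CR that HTTP lines carry.
    awk 'tolower($1) == "x-proxy-cache:" { sub(/\r$/, "", $2); print $2 }'
}
```

Running the curl pipeline twice in a row should print MISS and then HIT if caching works; empty output means the header is not being set.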

SSL Optimization

server {
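    listen 443 ssl http2;               # on NGINX 1.25.1+, use "listen 443 ssl;" plus "http2 on;"
    server_name example.com;

    ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # SSL session caching
    ssl_session_cache shared:SSL:50m;   # 1 MB holds about 4,000 sessions, so 50 MB ≈ 200,000
    ssl_session_timeout 4h;
    ssl_session_tickets off;            # resume via the shared cache only; no ticket keys to rotate

    # OCSP stapling (only effective if the certificate names an OCSP responder;
    # Let's Encrypt ended its OCSP service in 2025)
    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;  # needed by ssl_stapling_verify
    resolver 8.8.8.8 1.1.1.1 valid=300s;
    resolver_timeout 5s;

    # Modern crypto
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    # DH params: used only by DHE cipher suites. The ECDHE suites above already
    # provide forward secrecy, so this matters only if you add DHE ciphers.
    ssl_dhparam /etc/nginx/dhparam.pem;  # openssl dhparam -out /etc/nginx/dhparam.pem 4096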

    # HSTS
    add_header Strict-Transport-Security "max-age=63072000" always;
}

Generate DH parameters (only needed if you enable DHE cipher suites):

sudo openssl dhparam -out /etc/nginx/dhparam.pem 4096
# This takes several minutes on slower machines
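
Sizing ssl_session_cache is simple arithmetic. The sketch below relies on the figure from the NGINX documentation that one megabyte of shared cache holds about 4000 sessions; the function names are illustrative.

```shell
SESSIONS_PER_MB=4000   # approximate figure from the ssl_session_cache documentation

# Sessions that fit in a shared cache of the given size in MB.
ssl_cache_sessions() {
    echo $(( $1 * SESSIONS_PER_MB ))
}

# Cache size in MB (rounded up) needed for the given number of sessions.
ssl_cache_mb_for() {
    echo $(( ($1 + SESSIONS_PER_MB - 1) / SESSIONS_PER_MB ))
}

ssl_cache_sessions 50      # prints 200000
ssl_cache_mb_for 350000    # prints 88
```

A reasonable target is the number of distinct clients expected within one ssl_session_timeout window.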

GZip Compression Tuning

# /etc/nginx/nginx.conf
gzip on;
gzip_comp_level 3;              # sweet spot: 3 (not 9)
gzip_min_length 256;            # skip tiny responses
gzip_proxied any;               # compress proxied responses
gzip_vary on;                   # add Vary: Accept-Encoding
gzip_disable "msie6";

gzip_types
    text/plain
    text/css
    text/javascript
    application/javascript
    application/json
    application/xml
    application/x-font-ttf
    image/svg+xml
    text/xml
    application/xhtml+xml;

Compression Benchmark

# Measure gzip effectiveness
curl -s -o /dev/null -w "Size: %{size_download} bytes\n" http://localhost/style.css
curl -s -H "Accept-Encoding: gzip" -o /dev/null \
    -w "Gzipped: %{size_download} bytes\n" http://localhost/style.css

# Expected: ~70-80% reduction for CSS/JS
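
To turn the two byte counts into a percentage, a small helper like the following can be used. The name gzip_savings is illustrative, and the result is rounded down to a whole percent.

```shell
# Percentage saved by compression, given both sizes in bytes.
# usage: gzip_savings <uncompressed_bytes> <compressed_bytes>
gzip_savings() {
    raw=$1
    packed=$2
    if [ "$raw" -le 0 ]; then
        # Avoid dividing by zero when the download failed or was empty.
        echo "0"
        return 1
    fi
    echo $(( (raw - packed) * 100 / raw ))
}

# Example with live measurements (assumes http://localhost/style.css exists):
# raw=$(curl -s -o /dev/null -w '%{size_download}' http://localhost/style.css)
# packed=$(curl -s -H 'Accept-Encoding: gzip' -o /dev/null -w '%{size_download}' http://localhost/style.css)
# gzip_savings "$raw" "$packed"
```

A result of 0 for a file smaller than gzip_min_length is expected, since NGINX leaves such responses uncompressed.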

Linux Kernel Tuning

# /etc/sysctl.d/99-nginx.conf
# --- Network ---

# Max backlog of pending connections (NGINX must also request it: listen 80 backlog=65535;)
net.core.somaxconn = 65535

# Max number of open files system-wide
fs.file-max = 2097152

# Default and maximum socket buffer sizes for all socket types (bytes)
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# TCP auto-tuning buffers (min, default, max in bytes)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# Enable TCP Fast Open for client and server (NGINX also needs: listen 443 ssl fastopen=256;)
net.ipv4.tcp_fastopen = 3

# Reuse TIME_WAIT sockets for new outgoing connections (e.g., to upstreams)
net.ipv4.tcp_tw_reuse = 1

# Detect dead idle connections sooner (the default keepalive time is 7200s)
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3

# Widen the ephemeral port range for outgoing connections
net.ipv4.ip_local_port_range = 1024 65535

# Use BBR congestion control with the fq queueing discipline (Linux 4.9+)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

Apply kernel settings:

sudo sysctl -p /etc/sysctl.d/99-nginx.conf

# Verify BBR is active
sysctl net.ipv4.tcp_congestion_control
# net.ipv4.tcp_congestion_control = bbr

Connection Pooling

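upstream backend {
    # Keep idle connections to backends
    keepalive 64;                # max idle connections cached per worker process
    keepalive_requests 1000;     # requests served per connection before it is closed
    keepalive_timeout 60s;       # close idle upstream connections after 60s

    server 10.0.1.1:3000 max_fails=3 fail_timeout=30s;
    server 10.0.1.2:3000 max_fails=3 fail_timeout=30s;
}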

server {
    location / {
        proxy_http_version 1.1;      # required for keepalive
        proxy_set_header Connection "";
        proxy_pass http://backend;
    }
}

Monitoring and Profiling

# NGINX status module
location /nginx_status {
    stub_status on;
    allow 127.0.0.1;
    deny all;
}

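# Query it from the server itself
curl http://127.0.0.1/nginx_status

# Sample output: active connections, then accepted and handled connections and total requests
# Active connections: 25
# server accepts handled requests
#  123456 123456 789012
# Reading: 0 Writing: 5 Waiting: 20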
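
The counters are easier to track when reduced to a few derived numbers. This sketch assumes the standard stub_status text layout shown above; the function name and output format are illustrative.

```shell
# Summarize stub_status output read from stdin.
# usage: curl -s http://127.0.0.1/nginx_status | stub_status_summary
stub_status_summary() {
    awk '
        /^Active connections:/            { active = $3 }
        /^ *[0-9]+ +[0-9]+ +[0-9]+/       { accepts = $1; handled = $2; requests = $3 }
        END {
            # dropped > 0 means connections were accepted but not handled,
            # usually because a resource limit such as worker_connections was hit.
            printf "active=%d dropped=%d req_per_conn=%.2f\n",
                active, accepts - handled, (handled ? requests / handled : 0)
        }'
}
```

A req_per_conn value well above 1 indicates that keep-alive connections are being reused.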

# Real-time monitoring with ngxtop
pip install ngxtop
ngxtop -l /var/log/nginx/access.log

# Top requests, status codes, response times
# Requests summary:
# /api/users      200   150 req/min   avg 45ms
# /api/products   200   120 req/min   avg 62ms

Common Errors

1. worker_connections Exceeds ulimit

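worker_rlimit_nofile should be at least worker_connections, and about twice that when proxying, because each proxied request holds a client socket and an upstream socket. ulimit -n shows only the limit of your current shell, not the limit of the NGINX workers; set worker_rlimit_nofile 65535 in the main config to raise the latter.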

2. SSL Session Cache Overflow

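If sessions exceed the cache size, old sessions are evicted early and returning clients fall back to full handshakes. curl -I will not show this; test resumption with openssl s_client -connect example.com:443 -reconnect and check whether the reconnects report Reused or New. If resumption fails under load, increase the cache, e.g. ssl_session_cache shared:SSL:100m.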

3. Cache Key Collisions

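Different content variants can end up under the same cache key. NGINX honors the upstream Vary header (since 1.7.7), so have the backend send Vary: Accept-Encoding. If you put the encoding in proxy_cache_key instead, normalize it first (for example, with a map that reduces $http_accept_encoding to gzip or an empty string), because the raw header differs between browsers and would fragment the cache.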

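4. sendfile Misapplied to Proxied Content

sendfile only affects files that NGINX reads from disk and sends to the client. Responses proxied from a backend (including PHP-FPM over a Unix domain socket) do not pass through it, so switching to TCP sockets or setting sendfile off for that reason gains nothing. Do turn sendfile off when the document root sits on a filesystem that handles it poorly, such as VirtualBox shared folders, where it is known to serve stale or corrupted files.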

5. Direct I/O Too Low

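Setting directio too low (e.g., 4k) pushes almost every file through direct I/O, which bypasses both the page cache and sendfile and slows down small, frequently requested files. Tune it to your typical file sizes; 4m is a reasonable starting point for general use.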

6. Kernel Connection Table Exhaustion

Monitor with ss -s and cat /proc/net/sockstat. If the kernel log shows "TCP: time wait bucket table overflow", increase net.ipv4.tcp_max_tw_buckets.
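
A minimal sketch for watching TIME_WAIT pressure follows. It assumes the usual /proc/net/sockstat layout, where the TCP line contains a "tw <count>" pair; both function names are illustrative.

```shell
# Print the number of TIME_WAIT sockets from sockstat-formatted input.
# usage: time_wait_count < /proc/net/sockstat
time_wait_count() {
    awk '$1 == "TCP:" { for (i = 2; i < NF; i += 2) if ($i == "tw") print $(i + 1) }'
}

# Percentage of the bucket limit in use (rounded down).
# usage: tw_usage_pct <tw_count> <tcp_max_tw_buckets>
tw_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Example on a live Linux host:
# tw_usage_pct "$(time_wait_count < /proc/net/sockstat)" "$(sysctl -n net.ipv4.tcp_max_tw_buckets)"
```

Sustained values near 100 suggest raising the limit or reducing connection churn, for example with upstream keepalive.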

7. BBR Congestion Control Not Effective

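BBR works best with the fq qdisc: net.core.default_qdisc = fq. Kernels before 4.13 need fq for pacing; newer kernels fall back to less effective built-in TCP pacing without it. default_qdisc only applies to qdiscs created after the change, so reboot or run tc qdisc replace dev eth0 root fq, then verify with tc -s qdisc show dev eth0.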

Practice Questions

1. What is the purpose of sendfile in NGINX? sendfile enables zero-copy transfer between file descriptors: the kernel moves data from the page cache straight to the socket without copying it into NGINX’s userspace buffers, reducing CPU usage for static file serving.

2. How do you calculate the SSL session cache size? The NGINX documentation states that 1 MB of shared cache holds about 4,000 sessions (roughly 260 bytes each). ssl_session_cache shared:SSL:50m therefore stores ~200,000 sessions. Calculate: cache size in MB × 4,000 = max sessions.

3. What is OCSP stapling and why does it improve performance? OCSP stapling pre-fetches the certificate revocation status and attaches it to the TLS handshake. Without it, the browser must connect to the CA’s OCSP responder, adding 100-500ms latency.

4. Why is gzip_comp_level 3 recommended over 9? Level 3 provides roughly 85% of the compression ratio of level 9 at roughly 30% of the CPU cost; exact figures depend on the content, so benchmark your own assets. Level 9’s marginal compression gain is not worth the CPU overhead for most workloads.

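5. Challenge: Design a caching strategy for an API that has public endpoints (cache 5 min), user-specific endpoints (no cache), and authenticated endpoints (cache 1 min, varied by auth token). Answer: Set proxy_cache_valid separately in each location block: 200 5m for public endpoints and 200 1m for authenticated ones. For authenticated endpoints, add the token to proxy_cache_key (for example $cookie_auth_token, or $http_authorization if the token arrives in the Authorization header) so users never share entries. For user-specific paths, leave proxy_cache unset, or set both proxy_no_cache 1 and proxy_cache_bypass 1 so responses are neither stored nor served from the cache.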

Mini Project: NGINX Performance Benchmark

#!/bin/bash
# benchmark.sh — Run performance benchmarks

NGINX_HOST=${1:-localhost}
CONCURRENCY=${2:-100}
REQUESTS=${3:-10000}

echo "=== NGINX Performance Benchmark ==="
echo "Target: $NGINX_HOST"
echo ""

# 1. Static file benchmark
echo "--- Static File Benchmark ---"
ab -n $REQUESTS -c $CONCURRENCY http://$NGINX_HOST/index.html | \
    grep -E "(Requests per second|Time per request|Transfer rate)"

# 2. Proxy benchmark
echo "--- Proxy Benchmark ---"
ab -n $REQUESTS -c $CONCURRENCY http://$NGINX_HOST/api/health | \
    grep -E "(Requests per second|Time per request)"

# 3. SSL benchmark
echo "--- SSL Benchmark ---"
ab -n 1000 -c 10 https://$NGINX_HOST/ | \
    grep -E "(Requests per second|Time per request)"

# 4. Compression benchmark
echo "--- Compression Benchmark ---"
curl -s -o /dev/null -w "Uncompressed: %{size_download} bytes\n" http://$NGINX_HOST/style.css
curl -s -H "Accept-Encoding: gzip" -o /dev/null \
    -w "Compressed: %{size_download} bytes\n" http://$NGINX_HOST/style.css

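# 5. Connection limits
echo "--- Max Connections ---"
ulimit -n
CONFIG=$(nginx -T 2>/dev/null)
echo "$CONFIG" | grep -E "^[[:space:]]*(worker_processes|worker_connections)"
# Take the first uncommented directive; fall back to the NGINX defaults (1 and 512)
WORKERS=$(echo "$CONFIG" | awk '$1 == "worker_processes" { gsub(";", "", $2); print $2; exit }')
CONNECTIONS=$(echo "$CONFIG" | awk '$1 == "worker_connections" { gsub(";", "", $2); print $2; exit }')
# "auto" means one worker per CPU core
if [ "$WORKERS" = "auto" ]; then WORKERS=$(nproc); fi
echo "Max concurrent: $(( ${WORKERS:-1} * ${CONNECTIONS:-512} ))"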

FAQ

Should I use NGINX or Apache for high-traffic sites?

NGINX is generally better for high concurrency (10,000+ connections) due to its event-driven architecture. Apache’s prefork MPM dedicates one process to each connection and consumes more memory under load; its event MPM narrows the gap.

What is the optimal worker_processes value?

auto is the best default: it starts one worker per CPU core. Adding worker_cpu_affinity auto pins each worker to its own core, which avoids migration between cores and keeps CPU caches warm.

How do I debug slow upstream responses?

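Add the $upstream_response_time variable to your log format: log_format timed '$remote_addr - $upstream_response_time $upstream_addr';, then reference that format by name in an access_log directive. Times consistently over 1s usually point to backend issues; compare with $request_time to separate backend time from time spent sending the response to the client.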
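
Once that format is being written, slow requests can be pulled out with a short filter. This sketch assumes log lines in exactly the timed format above (client address, a dash, the upstream time, the upstream address); the function name slow_upstreams is illustrative.

```shell
# List upstream address and response time for requests slower than a threshold.
# usage: slow_upstreams [threshold_seconds] < access.log
slow_upstreams() {
    threshold=${1:-1}
    # Field 3 is $upstream_response_time; skip "-" (no upstream was contacted).
    awk -v limit="$threshold" \
        '$3 ~ /^[0-9.]+$/ && $3 + 0 > limit + 0 { print $4, $3 }'
}
```

Requests retried across several upstreams log a comma-separated list of times and are skipped by this simple filter.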

Does HTTP/2 improve performance?

HTTP/2 multiplexes multiple requests over a single connection, reducing connection overhead. Enable it together with SSL, since browsers only support HTTP/2 over TLS. Test with curl --http2 -I https://example.com/.

How often should I rotate the proxy cache?

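You do not rotate it manually. The inactive=60m parameter in proxy_cache_path removes entries that have not been requested for 60 minutes, whether or not they are still fresh; freshness is controlled by proxy_cache_valid. proxy_cache_bypass takes variables rather than a literal header: proxy_cache_bypass $http_cache_control; makes a request with Cache-Control: no-cache skip the cached copy and refresh it from the backend. Explicit purging needs the proxy_cache_purge directive (NGINX Plus) or a third-party purge module.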

What is the best way to handle SSL termination at scale?

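Terminate SSL at NGINX (not the backend) to reduce backend CPU usage. The shared session cache is shared only between the workers of one NGINX instance; to resume sessions across several instances, enable session tickets with the same ssl_session_ticket_key file on every instance. Consider hardware SSL accelerators for extreme loads.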

What’s Next

Apache Configuration Deep Dive

Web Server Security Hardening

NGINX Basics

Prerequisite

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.
