Learn Process Management in Linux — ps, top, htop, kill, nice, cgroups, systemd Units


DodaTech Updated Jun 20, 2026 11 min read

Linux process management is the skill of monitoring, controlling, and optimizing running programs. From viewing process lists with ps to setting resource limits with cgroups, this guide covers everything a system administrator needs to manage processes in production.

What You’ll Learn

You’ll learn to inspect processes with ps, top, and htop, control them with kill signals (SIGTERM, SIGKILL, SIGHUP), adjust priority with nice and renice, isolate resources with cgroups v2, and manage service lifecycles with systemd units. You’ll also see how DodaZIP uses cgroups to limit compression worker CPU usage and how Durga Antivirus Pro uses process monitoring for threat detection.

Why Process Management Matters

A runaway process consuming 100% CPU, a memory leak eating all RAM, or a zombie process filling the process table can bring down a server. Process management tools let you identify and resolve these issues before they cause outages. In multi-tenant systems, cgroups ensure noisy neighbors don’t affect other services.

Learning Path

flowchart LR
  A[File Permissions] --> B[Process Management<br/>You are here]
  B --> C[Systemd Service Management]
  C --> D[Monitoring & Logging]
  D --> E[Security Hardening]
  style B fill:#f90,color:#fff

Viewing Processes with ps

The ps command prints a snapshot of running processes. It accepts two main option styles: Unix options, written with a leading dash (ps -ef), and BSD options, written without one (ps aux).

# BSD style (most common)
ps aux                  # All processes with user, CPU, memory
ps aux --sort=-%cpu     # Sorted by CPU descending
ps aux --sort=-%mem     # Sorted by memory descending

# Unix style
ps -ef                  # Full format listing
ps -eLf                 # Show threads (LWP)
ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head

# Tree view
ps auxf                 # Forest view — parent-child relationships
ps -ejH                 # Alternative tree view

# Filter by process name
ps aux | grep '[n]ginx' # Brackets stop grep from matching its own process
pgrep -la nginx         # Find PIDs by name
pstree -p               # Full process tree

Expected output for ps aux --sort=-%cpu | head -5:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      1234 45.2  2.1 123456 78901 ?        Ssl  10:00  12:34 /usr/bin/node app.js
alice     5678 12.3  0.5  45678 12345 ?        S    10:05   3:21 python scan.py
root       789  5.6  1.2  23456 34567 ?        Ss   10:00   1:45 /usr/sbin/nginx
bob       9012  2.1  8.9 345678 234567 ?       Sl   09:55  15:30 /usr/bin/java -jar app.jar

Understanding Process States

State	Meaning
`R`	Running or runnable (on run queue)
`S`	Interruptible sleep (waiting for event)
`D`	Uninterruptible sleep (usually I/O)
`Z`	Zombie — terminated, waiting for parent
`T`	Stopped (by job control signal)
`I`	Idle kernel thread
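
To see how these states are distributed on a live system, the first letter of the STAT column can be tallied. The following is a minimal sketch, assuming a procps-style ps that accepts -eo stat=; summarize_states is a hypothetical helper name, not a standard tool.

```shell
#!/bin/bash
# summarize_states: read STAT values (one per line) on stdin and
# print a count per primary state letter. Hypothetical helper.
summarize_states() {
    awk 'NF { count[substr($1, 1, 1)]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Tally every process on this machine. Extra characters in STAT
# (s, l, +, <, N) are modifiers, so only the first letter is used.
ps -eo stat= | summarize_states
```

A steadily growing D or Z count is usually the figure worth investigating; S and I dominate on a healthy system.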

Interactive Process Monitoring

top — Built-in System Monitor

# Basic usage
top

# Batch mode (for scripts)
top -b -n 1

# Monitor specific PID
top -p 1234 -p 5678

# Sort by memory on startup
top -o %MEM

Inside top, press:

1 — Toggle per-CPU view
M — Sort by memory
P — Sort by CPU
k — Kill a process (enter PID + signal)
r — Renice a process
u — Show processes for a user
H — Toggle thread view
c — Show full command path
q — Quit

Expected top output:

top - 10:05:23 up 15 days,  3:45,  2 users,  load average: 0.45, 0.78, 0.92
Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.5 us,  3.2 sy,  0.0 ni, 84.0 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
MiB Mem :  15984.1 total,   2345.6 free,   6789.0 used,   6849.5 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   7123.4 avail Mem
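
Batch mode makes the summary lines easy to consume from scripts. As a rough sketch, the hypothetical zombie_count function below pulls the zombie figure out of a Tasks line in the format shown above; other top versions or locales may lay the line out differently.

```shell
#!/bin/bash
# zombie_count: read top's batch output on stdin and print the number
# that precedes the word "zombie" on the Tasks line.
zombie_count() {
    awk '/^Tasks:/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^zombie/) { print $(i - 1); exit }
    }'
}

# Sample line copied from the expected output above.
echo 'Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie' | zombie_count
```

On a live system the same function can be fed with top -b -n 1 | zombie_count.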

htop — Enhanced Process Viewer

Htop provides color-coded output, mouse support, and easier navigation:

# Install (Debian/Ubuntu)
sudo apt install htop

# Basic usage
htop

# Start in tree view
htop -t

# Show only the given PIDs (to filter by name, press F4 inside htop)
htop -p 1234

Htop key shortcuts:

F3 — Search process
F4 — Filter
F5 — Tree view
F6 — Sort by column
F9 — Kill process
F10 — Quit

Comparing top and htop

Feature	top	htop
Color	Minimal	Full color
Mouse support	No	Yes
Tree view	Press V	Press F5
Vertical scroll	Yes in procps-ng (arrow keys, PgUp/PgDn)	Yes
Process filtering	Press u/O	Press F4
Kill process	Press k	Press F9

Process Control with kill

Signals are the primary mechanism for controlling processes:

# Common signals
kill -15 1234            # SIGTERM — graceful shutdown (default)
kill -9 1234             # SIGKILL — force kill (cannot be caught)
kill -1 1234             # SIGHUP: terminates by default; many daemons reload config instead
kill -2 1234             # SIGINT — interrupt (like Ctrl+C)
kill -3 1234             # SIGQUIT — quit with core dump

# Kill by name
pkill -f "node app.js"   # Kill processes matching pattern
killall nginx            # Kill all nginx processes

# Check if process exists
kill -0 1234             # Exit status 0 if the PID exists and you may signal it, non-zero otherwise

Signal Reference

Signal	Number	Action	Use Case
SIGHUP	1	Terminate (many daemons reload instead)	Config reload without restart
SIGINT	2	Interrupt	Ctrl+C
SIGQUIT	3	Quit + core dump	Debugging crashes
SIGKILL	9	Force kill (cannot be caught)	Stuck processes
SIGTERM	15	Graceful stop	Default shutdown
SIGCONT	18	Resume	Unfreeze stopped process
SIGSTOP	19	Pause (cannot be caught)	Freeze process
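
Whether SIGTERM results in a graceful stop depends on the receiving program installing a handler. The sketch below shows the idea in bash: a toy worker function (a made-up name) traps SIGTERM, removes its marker file, and exits. The marker file and the one-second loop are illustrative assumptions, not part of any real service.

```shell
#!/bin/bash
# worker: toy long-running job that cleans up on SIGTERM.
worker() {
    local marker="$1"
    touch "$marker"
    # Runs when SIGTERM arrives: remove the marker, then exit cleanly.
    trap 'rm -f "$marker"; exit 0' TERM
    while true; do
        # Sleep in the background and wait on it so the trap fires
        # right away instead of after the sleep finishes.
        sleep 1 &
        wait $!
    done
}

marker=$(mktemp)
worker "$marker" &
pid=$!
sleep 1                 # give the worker time to install its trap
kill -TERM "$pid"       # same as: kill -15 "$pid"
wait "$pid"
if [ -e "$marker" ]; then
    echo "marker left behind"
else
    echo "marker cleaned up"
fi
```

Sending SIGKILL instead would leave the marker file behind, because the trap never gets a chance to run.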

Process Priority with nice and renice

Linux schedules processes based on priority. The nice value ranges from -20 (highest priority) to +19 (lowest priority). Default is 0.

# Start a process with lower priority
nice -n 10 ./build.sh
nice ./build.sh                 # Same effect: the default adjustment is 10

# Start with higher priority (requires root)
sudo nice -n -20 ./critical-task

# Change priority of a running process
renice -n 5 -p 1234
renice -n -5 -p 5678            # Root only

# Renice all processes for a user
renice -n 10 -u alice

# View nice values
ps -eo pid,ni,comm --sort=ni

Expected output for ps -eo pid,ni,comm --sort=ni | head:

  PID  NI COMMAND
 1234 -20 critical-task
  789   0 nginx
 5678   5 build.sh
 9012  10 backup-script

When to Use nice

Batch jobs: Set nice +10 to avoid impacting interactive users
Compilation: nice -n 19 make -j$(nproc) lets a build soak up idle CPU without slowing other work
Backups: nice -n 19 rsync ... for the lowest priority
Real-time services: Leave at 0 (nice doesn’t guarantee real-time)
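
A small wrapper keeps the niceness for batch work in one place. This is only a sketch: run_batch and the NICE_ADJ variable are hypothetical names chosen for illustration. It relies on the fact that nice, run without arguments, prints the niceness it inherited.

```shell
#!/bin/bash
# run_batch: run a command at lower CPU priority.
# NICE_ADJ (hypothetical variable) overrides the default adjustment of 10.
run_batch() {
    local adjustment="${NICE_ADJ:-10}"
    nice -n "$adjustment" "$@"
}

# `nice` with no arguments prints the current niceness, which makes
# the effect of the wrapper visible.
echo "shell niceness: $(nice)"
echo "batch niceness: $(run_batch nice)"
```

Typical use would be run_batch ./build.sh or NICE_ADJ=19 run_batch rsync ..., with the real command in place of nice.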

Cgroups — Control Groups v2

Cgroups (control groups) limit and isolate resource usage (CPU, memory, I/O) for groups of processes. Cgroups v2 is the default in modern distributions.

# Check cgroups version
stat -fc %T /sys/fs/cgroup/
# cgroup2fs → v2

# List controllers
cat /sys/fs/cgroup/cgroup.controllers
# cpu io memory pids

Creating and Managing Cgroups

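# Enable the controllers for child cgroups (harmless if already enabled)
echo "+cpu +memory" | sudo tee /sys/fs/cgroup/cgroup.subtree_control

# Create a child cgroup
sudo mkdir /sys/fs/cgroup/myapp

# Set memory limit (500 MB)
echo 500000000 | sudo tee /sys/fs/cgroup/myapp/memory.max

# Set CPU limit (50% of one core)
echo "50000 100000" | sudo tee /sys/fs/cgroup/myapp/cpu.max
# Format: quota period (50000 µs out of every 100000 µs)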

# Add a process to the cgroup
echo 1234 | sudo tee /sys/fs/cgroup/myapp/cgroup.procs

# Monitor usage
cat /sys/fs/cgroup/myapp/memory.current
cat /sys/fs/cgroup/myapp/cpu.stat
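
The cpu.max value is simple arithmetic: the quota is the desired percentage of the period. A minimal sketch, assuming the kernel's default period of 100000 µs; cpu_max_for_percent is a hypothetical helper name.

```shell
#!/bin/bash
# cpu_max_for_percent: convert a CPU percentage (100 = one full core)
# into the "quota period" pair that cpu.max expects.
cpu_max_for_percent() {
    local percent="$1"
    local period="${2:-100000}"     # microseconds
    echo "$(( percent * period / 100 )) $period"
}

cpu_max_for_percent 50      # half of one core
cpu_max_for_percent 200     # two full cores
```

The result can be piped straight into the control file, for example cpu_max_for_percent 50 | sudo tee /sys/fs/cgroup/myapp/cpu.max.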

Cgroup Resource Limits via systemd

# Via systemd-run (one-shot command)
sudo systemd-run --unit=limited-build --scope -p MemoryMax=500M -p CPUQuota=50% ./build.sh

# Via service unit
sudo mkdir -p /etc/systemd/system/myapp.service.d/
sudo tee /etc/systemd/system/myapp.service.d/limits.conf <<EOF
[Service]
MemoryMax=1G
CPUQuota=200%
IOWeight=10
TasksMax=100
EOF

sudo systemctl daemon-reload
sudo systemctl restart myapp.service   # Restart so the service runs under the new limits
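
systemd reads the K, M, G, and T suffixes as powers of 1024, so MemoryMax=500M is not the same number as the 500000000 bytes written to memory.max by hand earlier. The sketch below expands a suffixed size into bytes for comparison; size_to_bytes is a hypothetical helper, not a systemd tool.

```shell
#!/bin/bash
# size_to_bytes: expand a systemd-style size (K, M, G, T suffixes are
# powers of 1024) into bytes.
size_to_bytes() {
    local value="$1" number suffix
    number="${value%[KMGT]}"        # strip one trailing suffix letter
    suffix="${value#"$number"}"     # whatever was stripped
    case "$suffix" in
        K) echo $(( number * 1024 )) ;;
        M) echo $(( number * 1024 ** 2 )) ;;
        G) echo $(( number * 1024 ** 3 )) ;;
        T) echo $(( number * 1024 ** 4 )) ;;
        *) echo "$number" ;;
    esac
}

size_to_bytes 1G      # the MemoryMax value from the drop-in
size_to_bytes 500M    # the systemd-run example
```

To check what a running unit actually received, systemctl show myapp.service -p MemoryMax reports the limit in bytes, which should match the expanded value (myapp.service being the example unit name used here).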

Zombie and Orphan Processes

Zombie Processes

A zombie is a terminated process whose parent hasn’t called wait() to read its exit status. Zombies show as state Z in ps.

# Find zombie processes
ps aux | awk '$8 ~ /^Z/'                      # STAT is column 8 of ps aux
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'   # Also shows the parent PID

# Count zombies
top -bn1 | grep zombie

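Zombies can’t be killed because they’re already dead. Only the parent, or init once the parent has exited, can remove them by reading their exit status. You can:

Send SIGCHLD to the parent, which helps only if the parent has a handler that calls wait()
Or terminate the parent, so the zombies are re-parented to init (PID 1), which reaps them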
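
Since the fix always involves the parent, it helps to list each zombie next to the PID that has not reaped it. A minimal sketch, assuming procps ps; zombie_parents is a hypothetical helper name.

```shell
#!/bin/bash
# zombie_parents: read "PID PPID STAT COMMAND" rows on stdin and print
# each zombie together with the parent that has not reaped it.
zombie_parents() {
    awk '$3 ~ /^Z/ { printf "zombie %s (%s) parent %s\n", $1, $4, $2 }'
}

# The state is matched on its first letter so variants such as "Z+"
# are included. No output means no zombies.
ps -eo pid,ppid,stat,comm --no-headers | zombie_parents
```

Many zombies sharing one parent PID usually point at a single program that forgets to call wait().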

Orphan Processes

When a parent dies, its children become orphans and are adopted by init (PID 1), or by the nearest ancestor registered as a subreaper. On modern Linux, systemd fills the init role and reaps orphans when they exit.

Common Process Management Mistakes

1. Using kill -9 as First Resort

SIGKILL gives a process no chance to clean up: buffered writes are lost, temp and lock files remain, and a database can be left mid-transaction or corrupted. Always send SIGTERM first, wait a few seconds, then escalate to SIGKILL.
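
The escalation can be wrapped in a function so it is applied consistently. The following is one possible sketch, not a standard utility: stop_gracefully and is_running are hypothetical names, and the 10-second default timeout is an arbitrary choice. It assumes /proc is mounted so that zombies can be told apart from live processes.

```shell
#!/bin/bash
# is_running: true if the PID exists and is not a zombie.
is_running() {
    local pid="$1"
    kill -0 "$pid" 2>/dev/null || return 1
    # A zombie still answers kill -0, so check the state in /proc too.
    ! grep -q '^State:[[:space:]]*Z' "/proc/$pid/status" 2>/dev/null
}

# stop_gracefully PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT
# seconds, then fall back to SIGKILL. Prints which path was taken.
stop_gracefully() {
    local pid="$1" timeout="${2:-10}" waited=0
    kill -TERM "$pid" 2>/dev/null || { echo "cannot signal $pid"; return 0; }
    while is_running "$pid" && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$(( waited + 1 ))
    done
    if is_running "$pid"; then
        kill -KILL "$pid" 2>/dev/null
        echo "killed"
    else
        echo "terminated"
    fi
}

# Demo against a throwaway process.
sleep 300 &
stop_gracefully "$!" 5
```

A process that ignores or blocks SIGTERM takes the second path after the timeout, which is the only case where SIGKILL is justified.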

2. Ignoring Zombie Processes

A few zombies are normal. Thousands of zombies fill the process table and prevent new processes from starting. Monitor zombie count in production.

3. Running CPU-Intensive Tasks at Default Priority

Unthrottled build jobs, backup scripts, and batch processes can starve production services. Use nice and cgroups to reserve resources for critical services.

4. Not Setting Memory Limits

A single process with a memory leak can cause the OOM killer to terminate important services. Use cgroups memory.max to limit memory per process group.

5. Confusing CPU Load with CPU Usage

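High load average doesn’t always mean high CPU usage. Load counts runnable processes plus those in uninterruptible sleep (state D), which usually means waiting on disk or network filesystem I/O. Check load together with the idle (id) and I/O wait (wa) percentages.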
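
Load is easier to judge relative to the number of CPUs. The sketch below divides the 1-minute load average by the CPU count; load_per_cpu is a hypothetical helper name, and the reading of the result is a rule of thumb rather than a hard limit.

```shell
#!/bin/bash
# load_per_cpu LOAD CPUS: print the load average divided by the number
# of CPUs, to two decimals.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# The first field of /proc/loadavg is the 1-minute load average.
read -r load1 _ < /proc/loadavg
echo "1-min load per CPU: $(load_per_cpu "$load1" "$(nproc)")"
```

A value above 1.00 combined with a high id percentage in top suggests processes are stuck waiting on I/O rather than competing for CPU.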

6. Killing All Processes of a Name with killall

killall node kills ALL node processes, including production services and unrelated user processes. Use pkill -f with a more specific pattern.
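
Before sending any signal by pattern, it is worth listing what the pattern matches. A minimal sketch, assuming procps pgrep; preview_targets is a hypothetical helper name and 'node app\.js' is just the example pattern from earlier in this guide.

```shell
#!/bin/bash
# preview_targets PATTERN: list the processes that `pkill -f PATTERN`
# would signal, without sending anything.
preview_targets() {
    pgrep -af -- "$1"
}

# Review the list first, then run pkill with the same pattern.
preview_targets 'node app\.js' || echo "no matches"
# pkill -f 'node app\.js'
```

pgrep and pkill share the same matching rules, so the preview shows exactly the set that pkill would act on at that moment.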

7. Not Monitoring OOM Killer Events

Check dmesg | grep -i "killed process" (or journalctl -k | grep -i "killed process") regularly. An OOM kill is a symptom of a leak or a missing limit, not a solution.

Practice Questions

1. What’s the difference between SIGTERM and SIGKILL? SIGTERM (15) asks a process to terminate gracefully — it can clean up resources. SIGKILL (9) forces termination immediately and cannot be caught or ignored.

2. How do you find the top 5 memory-consuming processes? ps aux --sort=-%mem | head -6 (header plus five rows), or top -b -n 1 -o %MEM | head -12 (seven summary and header lines plus five rows).

3. What’s a zombie process and how do you fix it? A zombie is a terminated process whose parent hasn’t read its exit status. Fix it by getting the parent to call wait(), or by terminating or restarting the parent so that init adopts and reaps the zombie.

4. How do cgroups help in multi-tenant systems? Cgroups limit resource usage per process group, preventing one tenant from consuming all CPU, memory, or I/O and affecting others.

5. Challenge: A production server has a process that uses 8GB of memory but should only use 2GB. Your application is restarted by systemd on failure. Without modifying the application code, implement a solution that prevents memory overuse. Answer: Create a systemd drop-in with MemoryMax=2G. Systemd applies the cgroup limit, and if the process exceeds 2GB, the OOM killer terminates it. Systemd’s Restart=on-failure then automatically restarts the service.

Mini Project: Process Monitor Script

Create a script that watches critical processes and alerts on anomalies:

#!/bin/bash
# proc_monitor.sh — Monitor critical processes
# Usage: ./proc_monitor.sh <process_name>

PROC_NAME="${1:-nginx}"
CPU_THRESHOLD=80
MEM_THRESHOLD=80
ALERT_LOG="/var/log/proc_monitor.log"   # Needs write access: run as root or change the path

echo "=== Process Monitor ==="
echo "Target: $PROC_NAME"
echo "CPU threshold: ${CPU_THRESHOLD}%"
echo "Memory threshold: ${MEM_THRESHOLD}%"
echo ""

# Get all PIDs for the process
PIDS=$(pgrep -d',' -x "$PROC_NAME")

if [ -z "$PIDS" ]; then
    echo "ALERT: Process '$PROC_NAME' is NOT RUNNING!" | tee -a "$ALERT_LOG"
    exit 2
fi

# Check CPU usage per instance
echo "--- CPU Check ---"
ps -p "$PIDS" -o pid,%cpu,comm --no-headers | while read -r pid cpu comm; do
    if (( $(echo "$cpu > $CPU_THRESHOLD" | bc -l) )); then
        echo "ALERT: PID $pid ($comm) CPU: ${cpu}%" | tee -a "$ALERT_LOG"
    else
        echo "OK: PID $pid CPU: ${cpu}%"
    fi
done

# Check memory usage
echo ""
echo "--- Memory Check ---"
ps -p "$PIDS" -o pid,%mem,rss,comm --no-headers | while read -r pid mem rss comm; do
    if (( $(echo "$mem > $MEM_THRESHOLD" | bc -l) )); then
        echo "ALERT: PID $pid ($comm) MEM: ${mem}%" | tee -a "$ALERT_LOG"
    else
        rss_mb=$((rss / 1024))
        echo "OK: PID $pid MEM: ${mem}% (${rss_mb}MB)"
    fi
done

# Check for zombie processes
echo ""
echo "--- Zombie Check ---"
zombies=$(ps aux | awk '$8 ~ /^Z/')   # Match Z, Z+, Zs and other zombie variants
if [ -n "$zombies" ]; then
    echo "WARNING: Zombie processes detected:" | tee -a "$ALERT_LOG"
    echo "$zombies" | tee -a "$ALERT_LOG"
else
    echo "No zombie processes"
fi

Expected output:

=== Process Monitor ===
Target: nginx
CPU threshold: 80%
Memory threshold: 80%

--- CPU Check ---
OK: PID 789 CPU: 2.3%
OK: PID 790 CPU: 1.8%

--- Memory Check ---
OK: PID 789 MEM: 1.2% (45MB)
OK: PID 790 MEM: 0.8% (32MB)

--- Zombie Check ---
No zombie processes

Durga Antivirus Pro uses a similar monitoring loop to detect suspicious process behavior — unexpected CPU spikes, memory anomalies, or unauthorized child processes.

FAQ

What’s the difference between CPU usage and load average?

CPU usage is the percentage of time the CPU spends doing work. Load average is the average number of processes that are running, runnable, or in uninterruptible sleep, reported over the last 1, 5, and 15 minutes. High load with low CPU usage usually indicates I/O wait.

How do I see threads within a process?

Use ps -eLf to show threads, or top -H for per-thread view. Each thread appears as a separate entry with a unique TID.
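
For a quick count without ps, each thread also appears as a directory under /proc/PID/task. A minimal sketch, assuming /proc is mounted; thread_count is a hypothetical helper name.

```shell
#!/bin/bash
# thread_count PID: number of threads, counted as entries under
# /proc/PID/task (one directory per TID).
thread_count() {
    ls "/proc/$1/task" | wc -l
}

echo "this shell ($$) has $(thread_count $$) thread(s)"
# The same threads, with their TIDs in the LWP column:
#   ps -L -p "$$" -o pid,lwp,comm
```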

What happens when a process uses too much memory?

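When the system, or a cgroup that has reached its memory.max, runs out of memory, the kernel’s OOM killer picks a process and kills it. The choice follows each process’s oom_score, which is driven mainly by how much memory the process uses and can be shifted per process through oom_score_adj (-1000 to 1000).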
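
Both values are exposed per process under /proc. The sketch below prints them for the current shell; oom_info is a hypothetical helper name, and it assumes a Linux /proc with the oom_score and oom_score_adj files.

```shell
#!/bin/bash
# oom_info PID: print the kernel's current OOM score and the
# adjustable offset for one process.
oom_info() {
    local pid="$1"
    echo "pid=$pid score=$(cat "/proc/$pid/oom_score") adj=$(cat "/proc/$pid/oom_score_adj")"
}

oom_info "$$"
```

Writing a lower value to oom_score_adj (root is needed to decrease it) makes a process a less likely victim, and -1000 exempts it entirely. Use that sparingly, since it shifts the kill onto something else.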

Can I limit network bandwidth per process?

Yes — use tc (traffic control) with cgroup classifiers, or use trickle for simple bandwidth shaping.

Why does my process show high SHR (shared memory)?

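SHR counts resident pages that can be shared with other processes: shared libraries, shared memory segments, and memory-mapped files. RSS minus SHR gives a rough estimate of a process’s private memory. For a fairer figure that splits shared pages among the processes using them, read the Pss line in /proc/PID/smaps_rollup.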

How do I trace system calls a process makes?

Use strace -p PID to trace system calls in real time, or strace -f -o trace.log command to trace a command with children.

What’s Next

Backup Strategies

Monitoring & Logging

Systemd Service Management

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.
