Backup Strategies — rsync, tar, dd, Automated Backup Scripts
Backup strategies are the safety net of production systems. This guide covers the essential Linux backup tools — rsync, tar, dd, dump/restore — and shows you how to build automated, reliable backup pipelines that protect against data loss, corruption, and disaster.
What You’ll Learn
You’ll master incremental backups with rsync, file-level archival with tar, disk cloning with dd and dump, and automated backup scripts with rotation, encryption, and remote storage. You’ll also see how DodaZIP and Durga Antivirus Pro use multi-tier backup strategies across their infrastructure.
Why Backup Strategies Matter
Data loss happens — human error (rm -rf in the wrong directory), software bugs, hardware failures, ransomware attacks. Without backups, recovery is impossible. A proper backup strategy means you can restore a single deleted file in 5 minutes or rebuild an entire server in hours. At DodaTech, all production systems follow the 3-2-1 rule: 3 copies, 2 different media, 1 offsite copy.
Learning Path
flowchart LR
A[Process Management] --> B[Backup Strategies<br/>You are here]
B --> C[Security Hardening]
C --> D[Shell Scripting]
D --> E[Monitoring & Logging]
style B fill:#f90,color:#fff
The 3-2-1 Backup Rule
The gold standard for data protection:
- 3 copies of your data (1 primary + 2 backups)
- 2 different media types (e.g., local disk + cloud storage)
- 1 copy offsite (different physical location)
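Three copies only help if they actually hold the same data. As a minimal sketch, assuming GNU coreutils, the helper below compares the SHA-256 of a primary file against any number of copies. The name copies_match and the paths in the usage comment are hypothetical, chosen for illustration only.

```shell
#!/bin/bash
# Sketch: confirm that every backup copy is byte-identical to the primary.
# copies_match is a hypothetical helper name, not part of any tool.
copies_match() {
    local primary=$1; shift          # remaining arguments are the copies
    local want got copy
    want=$(sha256sum < "$primary") || return 2
    for copy in "$@"; do
        got=$(sha256sum < "$copy") || return 2
        if [ "$got" != "$want" ]; then
            echo "MISMATCH: $copy" >&2
            return 1
        fi
    done
    echo "all $# copies match"
}

# Usage (hypothetical paths):
# copies_match /srv/data.tar.gz /backups/data.tar.gz /mnt/offsite/data.tar.gz
```

Reading each file through stdin keeps file names out of the hash output, so the comparison depends only on content.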
rsync — Incremental File Sync
Rsync is the workhorse of Linux backups. It transfers only the changed parts of files (delta encoding), supports compression, and runs over SSH for encrypted remote synchronization.
# Basic local sync
rsync -av /source/dir/ /backup/dir/
# Remote sync (push)
rsync -avz /local/dir/ user@remote:/backup/dir/
# Remote sync (pull)
rsync -avz user@remote:/source/dir/ /local/dir/
# With progress and partial transfer
rsync -avzP /source/dir/ user@remote:/backup/dir/
# Archive mode (preserves everything)
rsync -aAXv /source/ /backup/ # Includes ACLs, xattrs
Key rsync Flags
| Flag | Purpose |
|---|---|
| -a | Archive: recursive, preserves permissions, timestamps, owner, group |
| -v | Verbose |
| -z | Compress during transfer |
| -P | Progress + partial (resume) |
| --delete | Remove files in destination that don’t exist in source |
| --exclude | Exclude patterns |
| --link-dest | Hardlink-based incremental snapshots |
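The --link-dest row depends on how hardlinks behave, and you can see that without rsync. The sketch below uses GNU cp -al as a stand-in for rsync (an assumption made so the example runs anywhere with coreutils). The snap1 and snap2 directories are throwaway demo paths.

```shell
#!/bin/bash
# Sketch: emulate a --link-dest style snapshot with GNU cp -al (not rsync itself).
# All paths are temporary, hypothetical demo paths.
demo=$(mktemp -d)
mkdir -p "$demo/snap1"
echo "unchanged" > "$demo/snap1/a.txt"
echo "version 1" > "$demo/snap1/b.txt"

# Second snapshot: hardlink every file from snap1 (no file data is copied).
cp -al "$demo/snap1" "$demo/snap2"

# b.txt changed in the source: replace the link with a brand-new file,
# so the older snapshot keeps its original content.
rm "$demo/snap2/b.txt"
echo "version 2" > "$demo/snap2/b.txt"

inode() { stat -c %i "$1"; }   # GNU stat: print the inode number

same_a=no
if [ "$(inode "$demo/snap1/a.txt")" = "$(inode "$demo/snap2/a.txt")" ]; then same_a=yes; fi
same_b=no
if [ "$(inode "$demo/snap1/b.txt")" = "$(inode "$demo/snap2/b.txt")" ]; then same_b=yes; fi

echo "a.txt shares an inode across snapshots: $same_a"
echo "b.txt shares an inode across snapshots: $same_b"
```

The unchanged file occupies space once even though it appears in both trees. Editing a hardlinked file in place would change it in every snapshot, so the changed file is replaced rather than modified here.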
Hardlink-Based Incremental Backups
This creates daily snapshots where unchanged files are hardlinked (zero extra space):
#!/bin/bash
# daily_snapshot.sh — Hardlink-based daily snapshots
BACKUP_DIR="/backups/server"
DATE=$(date +%Y%m%d)
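# Most recent earlier snapshot (empty on the first run)
LATEST=$(ls -1 "$BACKUP_DIR" 2>/dev/null | grep -v "^$DATE\$" | tail -1)
mkdir -p "$BACKUP_DIR/$DATE"
# Pass --link-dest only when an earlier snapshot exists
rsync -aAXv ${LATEST:+--link-dest="$BACKUP_DIR/$LATEST"} \
/source/dir/ "$BACKUP_DIR/$DATE/"
# rsync -a copies the source directory's mtime; reset it so rotation sees the backup time
touch "$BACKUP_DIR/$DATE"
# Cleanup: keep 30 days (never match $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
Expected directory structure: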
/backups/server/
├── 20260601/
├── 20260602/ → hardlinks to unchanged files from 20260601
├── 20260603/ → hardlinks to unchanged files from 20260602
└── ...
tar — File Archival and Compression
Tar creates single-file archives from directories, optionally with compression.
# Create compressed archive
tar -czf backup.tar.gz /path/to/data # gzip
tar -cjf backup.tar.bz2 /path/to/data # bzip2 (better compression)
tar -cJf backup.tar.xz /path/to/data # xz (best compression)
# Extract archive
tar -xzf backup.tar.gz
tar -xzf backup.tar.gz -C /restore/path # Extract to specific directory
# List contents
tar -tzf backup.tar.gz
tar -tzf backup.tar.gz | grep "config" # Find files matching pattern
# Exclude patterns
tar -czf backup.tar.gz \
--exclude="*.log" \
--exclude="node_modules" \
--exclude="cache" \
/var/www/myapp
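# Incremental tar (-g is short for --listed-incremental). Keep the snapshot
# file somewhere persistent: /tmp is typically cleared on reboot.
tar -czg /var/backups/data.snar \
-f full-backup.tar.gz /var/data # First (full) backup
tar -czg /var/backups/data.snar \
-f incremental-1.tar.gz /var/data # Incremental backup
Compression Comparison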
| Format | Command | Speed | Approx. size (1 GB of data) | Use Case |
|---|---|---|---|---|
| None | tar -cf | Fastest | 1.0 GB | Fast archive |
| gzip | tar -czf | Fast | ~350 MB | Daily backups |
| bzip2 | tar -cjf | Medium | ~280 MB | Weekly/monthly |
| xz | tar -cJf | Slow | ~220 MB | Long-term archive |
Sizes are illustrative; real ratios depend heavily on how compressible the data is.
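To get figures for your own data instead of relying on typical ratios, you can measure them. The following is a rough sketch: compare_compressors is a hypothetical helper that streams one directory through each compressor it finds installed and prints the resulting size in bytes.

```shell
#!/bin/bash
# Sketch: measure archive size per compressor for one directory.
# compare_compressors is a hypothetical helper name.
compare_compressors() {
    local dir=$1 tool bytes
    for tool in gzip bzip2 xz; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not installed, skipped"
            continue
        fi
        # Stream the archive through the compressor and count bytes;
        # nothing is written to disk.
        bytes=$(tar -cf - -C "$dir" . | "$tool" -c | wc -c)
        echo "$tool: $bytes bytes"
    done
}

# Usage (hypothetical path):
# compare_compressors /var/www/myapp
```

Wrapping each pipeline in time would add the speed column as well.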
dd — Disk Cloning and Imaging
dd performs byte-for-byte copies of disks, partitions, and files. Use it for full disk backups and forensic imaging. It overwrites the of= target without asking for confirmation, so double-check device names before you run it.
# Clone a partition to an image file
sudo dd if=/dev/sda1 of=/backup/sda1.img bs=4M status=progress
# Clone a disk to another disk
sudo dd if=/dev/sda of=/dev/sdb bs=4M status=progress
# Backup MBR (first 512 bytes)
sudo dd if=/dev/sda of=/backup/mbr.bin bs=512 count=1
# Restore a partition image
sudo dd if=/backup/sda1.img of=/dev/sda1 bs=4M status=progress
# Compress dd output on the fly
sudo dd if=/dev/sda1 bs=4M | gzip > /backup/sda1.img.gz
# Create a 1000 MiB sparse file (count=0 writes no data; seek sets the size)
dd if=/dev/zero of=disk.img bs=1M count=0 seek=1000
Expected dd output (cloning a 10 GiB partition with bs=4M):
2560+0 records in
2560+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 45.7323 s, 235 MB/s
When to Use dd
- Full disk recovery — Clone an entire failing drive
- Forensic imaging — Bit-for-bit copies preserve deleted files
- Live USBs — Write ISO images to USB drives: dd if=ubuntu.iso of=/dev/sdb bs=4M
- Swap files — Create swap: dd if=/dev/zero of=/swapfile bs=1M count=4096
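Whichever use case applies, it is worth confirming that the image matches the source before relying on it. Here is a minimal sketch: image_and_verify is a hypothetical helper that copies with dd and then compares source and image byte by byte with cmp. It takes a device or a regular file, and it assumes the source does not change during the copy.

```shell
#!/bin/bash
# Sketch: image a source with dd, then confirm the image is byte-identical.
# image_and_verify is a hypothetical helper; pass a device or a regular file.
image_and_verify() {
    local src=$1 img=$2
    dd if="$src" of="$img" bs=4M status=none || return 2
    if cmp -s "$src" "$img"; then
        echo "image verified: $img"
    else
        echo "image differs from source: $img" >&2
        return 1
    fi
}

# Usage (needs root for devices; unmount the partition first):
# image_and_verify /dev/sda1 /backup/sda1.img
```

If the partition stays mounted and in use, the comparison can fail even though dd worked, because the source kept changing.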
dump and restore — Filesystem-Level Backups
The dump command backs up ext2/ext3/ext4 filesystems at the filesystem level, recording file metadata and inodes.
# Full dump of /dev/sda1 to a file
sudo dump -0uf /backup/rootfs.dump /dev/sda1
# Level 1 incremental (back up files changed since level 0)
sudo dump -1uf /backup/rootfs-inc1.dump /dev/sda1
# Restore a full dump into the current directory (cd to the target filesystem first)
sudo restore -rf /backup/rootfs.dump
# Restore a specific file
sudo restore -xf /backup/rootfs.dump /path/to/file
# Interactive restore shell
sudo restore -if /backup/rootfs.dump
Dump levels: 0 (full), 1-9 (incremental). Each level backs up files changed since the most recent dump at a lower level.
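The level rule is easier to follow with a worked example. The sketch below does not call dump at all. dump_base is a hypothetical helper that takes a new level plus the levels of earlier dumps (oldest first) and reports which earlier dump the new one would be taken relative to.

```shell
#!/bin/bash
# Sketch: find which earlier dump a new dump of level $1 is relative to.
# Remaining arguments are the levels of previous dumps, oldest first.
# dump_base is a hypothetical helper for illustration; it does not run dump(8).
dump_base() {
    local new=$1; shift
    local i base=""
    local -a hist=("$@")
    # Walk backwards to the most recent dump with a lower level.
    for (( i = ${#hist[@]} - 1; i >= 0; i-- )); do
        if (( hist[i] < new )); then
            base="dump #$((i + 1)) (level ${hist[i]})"
            break
        fi
    done
    echo "${base:-none (no lower-level dump recorded)}"
}

dump_base 1 0          # -> dump #1 (level 0)
dump_base 2 0 1        # -> dump #2 (level 1)
dump_base 1 0 1 2      # -> dump #1 (level 0)
dump_base 0 0 1 2      # -> none (no lower-level dump recorded)
```

The third call shows why repeating level 1 each day gives a differential scheme: every level 1 dump contains everything changed since the last level 0, regardless of the dumps in between.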
Automated Backup Script
Here’s an automated backup script with rotation, encryption, and remote upload. Treat it as a starting point and adapt the paths, recipient, and remote target to your environment:
#!/bin/bash
# autobackup.sh — Automated backup with rotation and encryption
# Requires: rsync, tar, gpg, aws CLI (or any remote storage)
BACKUP_NAME="${1:-server}"
BACKUP_DIR="/backups"
SOURCE_DIRS=(
"/etc"
"/var/www"
"/home"
"/opt/app/config"
)
EXCLUDE=(
"--exclude=*.log"
"--exclude=cache"
"--exclude=tmp"
)
REMOTE_PATH="s3://my-bucket/backups/"
RETENTION_DAYS=30
GPG_RECIPIENT="backup@dodatech.com"
DATE=$(date +%Y%m%d_%H%M%S)
TIMESTAMP_FILE="$BACKUP_DIR/$BACKUP_NAME/timestamp.txt"
# Ensure backup directory exists
mkdir -p "$BACKUP_DIR/$BACKUP_NAME"
# Record backup timestamp
echo "Backup started: $(date)" > "$TIMESTAMP_FILE"
# Step 1: Create tar archive
echo "Creating archive..."
tar -czf "$BACKUP_DIR/$BACKUP_NAME/data_$DATE.tar.gz" \
"${EXCLUDE[@]}" "${SOURCE_DIRS[@]}"
if [ $? -ne 0 ]; then
echo "ERROR: Archive creation failed!"
exit 1
fi
# Step 2: Encrypt the archive (optional)
if [ -n "$GPG_RECIPIENT" ]; then
    echo "Encrypting archive..."
    # Delete the plaintext archive only if encryption succeeded
    if gpg --encrypt --recipient "$GPG_RECIPIENT" \
        --output "$BACKUP_DIR/$BACKUP_NAME/data_$DATE.tar.gz.gpg" \
        "$BACKUP_DIR/$BACKUP_NAME/data_$DATE.tar.gz"; then
        rm "$BACKUP_DIR/$BACKUP_NAME/data_$DATE.tar.gz"
    else
        echo "ERROR: Encryption failed!"
        exit 1
    fi
    BACKUP_FILE="data_$DATE.tar.gz.gpg"
else
    BACKUP_FILE="data_$DATE.tar.gz"
fi
# Step 3: Create checksum (relative file name, so it verifies from any location)
echo "Creating checksum..."
(cd "$BACKUP_DIR/$BACKUP_NAME" && sha256sum "$BACKUP_FILE" > "$BACKUP_FILE.sha256")
# Step 4: Remote copy (if configured)
if [ -n "$REMOTE_PATH" ]; then
    echo "Uploading to remote storage..."
    # REMOTE_PATH is an s3:// URL, which rsync cannot write to, so use the AWS CLI
    aws s3 sync "$BACKUP_DIR/$BACKUP_NAME/" "$REMOTE_PATH" --exclude "tmp*"
    # For an SSH destination (user@host:/path), rsync works instead:
    # rsync -avz "$BACKUP_DIR/$BACKUP_NAME/" "$REMOTE_PATH"
fi
# Step 5: Rotation (remove backups and their checksums older than the retention period)
echo "Rotating old backups..."
find "$BACKUP_DIR/$BACKUP_NAME" -name "data_*.tar.gz*" -mtime +"$RETENTION_DAYS" -delete
# Step 6: Verify last backup
echo "Verifying last backup..."
# Use $BACKUP_FILE directly: ls -t would return the newer .sha256 file instead
if (cd "$BACKUP_DIR/$BACKUP_NAME" && sha256sum -c "$BACKUP_FILE.sha256"); then
    echo "Verification PASSED"
else
    echo "Verification FAILED!"
    exit 1
fi
echo "Backup completed: $(date)"
Expected output:
Creating archive...
Encrypting archive...
Creating checksum...
Uploading to remote storage...
Rotating old backups...
Verifying last backup...
data_20260620_100000.tar.gz.gpg: OK
Verification PASSED
Backup completed: Sat Jun 20 10:00:05 UTC 2026
Disaster Recovery Testing
A backup you never test is a backup you don’t have. Create a recovery test plan:
#!/bin/bash
# test_restore.sh — Test restoring from latest backup
# WARNING: Run in an isolated environment, not on production!
BACKUP_DIR="/backups/server"
RESTORE_DIR="/tmp/restore_test"
TEST_FILE="test_verify_$(date +%s).txt"
mkdir -p "$RESTORE_DIR"
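# Find latest backup (skip the .sha256 checksum files, which are newer than the archive)
LATEST=$(ls -t "$BACKUP_DIR"/data_*.tar.gz* 2>/dev/null | grep -v '\.sha256$' | head -1)
if [ -z "$LATEST" ]; then
    echo "No backup found in $BACKUP_DIR"
    exit 1
fi
echo "Testing restore from: $LATEST"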
# Decrypt if needed
if [[ "$LATEST" == *.gpg ]]; then
gpg --decrypt --output "$RESTORE_DIR/test_restore.tar.gz" "$LATEST"
else
cp "$LATEST" "$RESTORE_DIR/test_restore.tar.gz"
fi
# Extract
tar -xzf "$RESTORE_DIR/test_restore.tar.gz" -C "$RESTORE_DIR"
echo "Files restored: $(find "$RESTORE_DIR" -type f | wc -l)"
# Check critical files
for file in "/etc/passwd" "/etc/ssh/sshd_config"; do
if [ -f "$RESTORE_DIR/$file" ]; then
echo "OK: $file present"
else
echo "MISSING: $file — backup may be incomplete!"
fi
done
# Cleanup
rm -rf "$RESTORE_DIR"
Common Backup Mistakes
1. Not Testing Backups
The most common mistake. A backup job that runs for months but fails to restore is worthless. Schedule monthly restore drills.
2. One Backup Copy Only
If ransomware encrypts your live data and the only backup sits on the same server or network share, both are lost. Always maintain an offline or air-gapped copy.
3. Ignoring Open File Handles
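Tar and rsync copy each file as it looks at the moment it is read, so files being written during the backup (databases, active logs) can be captured in an inconsistent state. Use filesystem snapshots (LVM, ZFS) or database-aware backup tools for consistent backups.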
4. No Monitoring
A backup that silently fails is worse than no backup. Always add monitoring: check exit codes, verify backup sizes, and alert on failures.
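As a minimal sketch of such a check, assuming GNU stat and date, the helper below returns non-zero when the backup file is missing, smaller than expected, or older than expected. The name check_backup, its thresholds, and the path in the usage comment are hypothetical. Wire the non-zero exit into whatever alerting you already use.

```shell
#!/bin/bash
# Sketch: fail loudly when a backup is missing, too small, or too old.
# check_backup and its thresholds are hypothetical; tune them per system.
check_backup() {
    local file=$1 min_bytes=$2 max_age_hours=$3
    local size age_s
    if [ ! -f "$file" ]; then
        echo "ALERT: $file is missing" >&2; return 1
    fi
    size=$(stat -c %s "$file")                         # size in bytes
    age_s=$(( $(date +%s) - $(stat -c %Y "$file") ))   # seconds since last modification
    if [ "$size" -lt "$min_bytes" ]; then
        echo "ALERT: $file is only $size bytes" >&2; return 1
    fi
    if [ "$age_s" -gt $(( max_age_hours * 3600 )) ]; then
        echo "ALERT: $file is older than ${max_age_hours}h" >&2; return 1
    fi
    echo "OK: $file ($size bytes)"
}

# Usage (hypothetical path and thresholds): at least 1 MiB, at most 26 hours old
# check_backup /backups/server/latest.tar.gz.gpg 1048576 26 || echo "send alert here"
```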
5. Infinite Retention
Keeping every backup forever makes storage grow without bound. Define a retention policy: daily for 7 days, weekly for 4 weeks, monthly for 12 months.
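A policy like this can be expressed as a single keep-or-delete decision per backup. The sketch below assumes backups are named by date (YYYYMMDD) and requires GNU date. Treating Sunday backups as the weekly tier and first-of-month backups as the monthly tier is an assumption, as is the helper name keep_backup.

```shell
#!/bin/bash
# Sketch: daily/weekly/monthly keep decision for backups named by date.
# keep_backup is a hypothetical helper; requires GNU date (-d).
keep_backup() {
    local stamp=$1 today=$2      # both YYYYMMDD
    local s="${stamp:0:4}-${stamp:4:2}-${stamp:6:2}"
    local t="${today:0:4}-${today:4:2}-${today:6:2}"
    local age_days dow dom
    age_days=$(( ( $(date -u -d "$t" +%s) - $(date -u -d "$s" +%s) ) / 86400 ))
    dow=$(date -u -d "$s" +%u)   # 1=Monday ... 7=Sunday
    dom=${stamp:6:2}             # day of month, zero-padded
    [ "$age_days" -le 7 ] && return 0                        # daily tier
    [ "$dow" = 7 ] && [ "$age_days" -le 28 ] && return 0     # weekly tier (Sundays)
    [ "$dom" = 01 ] && [ "$age_days" -le 365 ] && return 0   # monthly tier (1st of month)
    return 1
}

# Usage: list what rotation would delete (hypothetical directory of YYYYMMDD folders)
# for d in /backups/server/*/; do
#     name=$(basename "$d")
#     keep_backup "$name" "$(date +%Y%m%d)" || echo "would delete $d"
# done
```

Printing the candidates first, as the usage comment does, makes it easy to review the policy before anything is deleted.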
6. Backing Up Temporary/Cache Data
Including /tmp, browser caches, or node_modules can sharply increase backup time and storage for no benefit. Exclude regeneratable data with --exclude.
7. No Encryption for Offsite Backups
Offsite backups cross network boundaries. Without encryption, anyone intercepting the data can read it. Use GPG to encrypt archives at rest and rsync over SSH to encrypt them in transit.
Practice Questions
1. What’s the difference between rsync -av and rsync -aAXv?
-a (archive) preserves permissions, timestamps, owner, group, and recurses. Adding -A preserves ACLs and -X preserves extended attributes.
2. How does --link-dest work in rsync? It creates hardlinks to files from a previous backup if they haven’t changed, saving disk space while providing a complete directory tree for each backup.
3. What backup strategy protects against ransomware? The 3-2-1 rule with an offline/immutable copy. Ransomware that encrypts your live data and mounted backups can’t touch an unmounted or air-gapped backup.
4. Why use dd instead of tar for backups? dd performs byte-for-byte copies including empty space, deleted files, and filesystem metadata. Use dd for full disk recovery and forensic imaging. Use tar for file-level archival.
5. Challenge: Design a backup strategy for a PostgreSQL database on a Linux server with the following requirements: point-in-time recovery within 15 minutes, maximum 1 hour of data loss, and 30-day retention. Answer: (1) Enable continuous WAL archiving and take a full base backup weekly with pg_basebackup. (2) Ship archived WAL segments offsite (for example, to S3 with the AWS CLI, since rsync cannot write to S3) at least every 5 minutes. (3) Daily tar of configuration files. (4) Automate with systemd timers. (5) Test recovery monthly using a staging environment.
Mini Project: Multi-Tier Backup System
Create a backup system that handles three tiers — immediate local backup, daily local archive, and weekly offsite:
#!/bin/bash
# multi_tier_backup.sh — Three-tier backup system
# Tier 1: Immediate rsync snapshot (every hour)
# Tier 2: Daily compressed archive (every day)
# Tier 3: Weekly encrypted offsite (every week)
SOURCE="/var/www/myapp"
BASE="/backups"
DATE=$(date +%Y%m%d)
DAY_OF_WEEK=$(date +%u) # 1=Monday, 7=Sunday
# Tier 1 — Hourly snapshot (keep 24)
HOUR_DIR="$BASE/tier1/$DATE/$(date +%H)"
mkdir -p "$HOUR_DIR" "$BASE/tier2" "$BASE/tier3" # tar and gpg below need these to exist
rsync -a --delete "$SOURCE/" "$HOUR_DIR/"
# Tier 2 — Daily archive (keep 30)
if [ ! -f "$BASE/tier2/$DATE.tar.gz" ]; then
tar -czf "$BASE/tier2/$DATE.tar.gz" "$SOURCE"
fi
# Tier 3 — Weekly offsite (keep 12 weeks)
if [ "$DAY_OF_WEEK" = "7" ]; then
WEEK_NUM=$(date +%V)
tar -czf "$BASE/tier3/week_$WEEK_NUM.tar.gz" "$SOURCE"
gpg --encrypt --recipient backup@dodatech.com \
--output "$BASE/tier3/week_$WEEK_NUM.tar.gz.gpg" \
"$BASE/tier3/week_$WEEK_NUM.tar.gz"
rm "$BASE/tier3/week_$WEEK_NUM.tar.gz"
# Upload to offsite
rsync -avz "$BASE/tier3/week_$WEEK_NUM.tar.gz.gpg" \
offsite-backup:/backups/myapp/
fi
# Cleanup old backups
# Tier 1: remove whole date directories more than a day old (never tier1 itself)
find "$BASE/tier1" -mindepth 1 -maxdepth 1 -type d -mtime +1 -exec rm -rf {} +
find "$BASE/tier2" -name "*.tar.gz" -mtime +30 -delete
find "$BASE/tier3" -name "*.tar.gz.gpg" -mtime +84 -delete # 12 weeks
Run via cron (hourly, on the hour): 0 * * * * /usr/local/bin/multi_tier_backup.sh
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.