Bash File Compression — Complete Guide to zip, tar, gzip & bzip2


DodaTech | Updated Jun 6, 2026 | 7 min read

Compressing files saves disk space and reduces transfer time. Several command-line tools, all usable from Bash, create and extract archives, each with different strengths depending on your need for speed, size, or compatibility.

What You’ll Learn

  • Compress and extract files with zip and unzip
  • Create archives with tar and combine with gzip, bzip2, xz
  • Compress single files with gzip and bzip2
  • Choose the right compression tool for each scenario
  • Automate backup compression with scripts

Why File Compression Matters

Disk space costs money. Network bandwidth is limited. Sending a 2GB uncompressed log file over the wire takes minutes; a 200MB compressed version takes seconds. DodaZIP is built around compression — it batch-processes archives, applies password protection, and splits large archives for email. Durga Antivirus Pro compresses quarantine files with gzip to isolate threats while preserving them for analysis.

Learning Path

Networking -> File Compression (you are here)
  
Prerequisites: You should know Bash basics (navigating files, running commands). Familiarity with Linux helps.

Compression Tools Overview

Different tools serve different purposes. Think of it like choosing a suitcase:

  • zip: All-in-one — compresses and archives. Universal (Windows, macOS, Linux).
  • tar + gzip: Standard on Linux. Fast compression, good ratios.
  • tar + bzip2: Better compression ratio, slower. For archiving old data.
  • tar + xz: Best compression, slowest. For long-term archival.
  • gzip alone: Compresses single files. Fast. Replaces the original.
  

zip / unzip — Universal Archives

zip is the most portable format — Windows users can open it without extra software. It combines compression and archiving in one step.

# Create a zip archive from multiple files
zip archive.zip file1.txt file2.txt

# Compress an entire directory (recursive)
zip -r project.zip project/

# Compression levels: 0 (none) to 9 (max), default 6
zip -9 archive.zip large-file.dat

# Password-protected archive
zip -e secure.zip secret.txt
# Enter password: ********

# List contents without extracting
unzip -l archive.zip

# Extract to current directory
unzip archive.zip

# Extract to a specific directory
unzip archive.zip -d /target/dir/

# Extract a single file from the archive
unzip archive.zip file1.txt
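
The commands above can be combined into a quick round trip. This is a minimal sketch, assuming the Info-ZIP zip and unzip programs are installed; the project/ layout and file names are invented for illustration:

```shell
# Work in a throwaway directory (all names below are illustrative)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p project/src
echo "print('hi')" > project/src/app.py
echo "debug output" > project/run.log

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # -q is quiet; -x skips anything matching the pattern
    zip -rq project.zip project/ -x "*.log"

    # -t tests every entry's checksum without writing files
    unzip -tq project.zip

    # Extract into a separate directory and list what arrived
    unzip -q project.zip -d restored/
    find restored -type f
else
    echo "zip/unzip not installed; skipping" >&2
fi
```

Testing with unzip -t before deleting the source files is a cheap way to catch a damaged archive early.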

tar — The Linux Standard

tar (Tape Archive) combines files into a single archive but does not compress by default. Think of it as packing a suitcase: tar puts everything in, then you use gzip to vacuum-seal it.

# Create an uncompressed tar archive
tar cf archive.tar folder/

# Extract a tar archive
tar xf archive.tar

# List contents
tar tf archive.tar

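The flag convention: c = create, x = extract, t = list, f = file. The f flag takes the next argument as the archive name, so put it last in the flag group. Order matters when the flags have a leading dash: tar -czf archive.tar.gz folder/ works, while tar -fcz archive.tar.gz folder/ makes tar read "cz" as the archive name and fail.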
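
Writing the flags separately makes the role of f easier to see. A small sketch, assuming GNU tar or bsdtar; the directory and archive names are made up for the demo:

```shell
# Work in a throwaway directory (names are illustrative only)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir folder
echo "hello" > folder/a.txt

# Bundled form: f comes last, so the next word is the archive name
tar -cf bundled.tar folder/

# Same operation with each flag spelled out separately
tar -c -f separate.tar folder/

# Both archives list the same entries
tar -tf bundled.tar
tar -tf separate.tar
```

The separated form is longer, but it removes any doubt about which word f treats as the filename.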

tar + Compression

# tar + gzip (most common — .tar.gz or .tgz)
tar czf archive.tar.gz folder/          # create
tar xzf archive.tar.gz                  # extract
tar tzf archive.tar.gz                  # list

# tar + bzip2 (better ratio — .tar.bz2)
tar cjf archive.tar.bz2 folder/         # create
tar xjf archive.tar.bz2                 # extract

# Verbose mode — shows files as they're processed
tar czvf archive.tar.gz folder/
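
The same pattern extends to xz with the J flag, and -C extracts into a different directory. A minimal sketch, assuming GNU tar or bsdtar and, for the xz part, that the xz program is installed; file names are illustrative:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir folder restore
echo "sample data" > folder/notes.txt

# tar + xz (.tar.xz): needs the xz program, so check for it first
if command -v xz >/dev/null 2>&1; then
    tar cJf archive.tar.xz folder/     # create
    tar tJf archive.tar.xz             # list
fi

# Extract into another directory with -C (the directory must already exist)
tar czf archive.tar.gz folder/
tar xzf archive.tar.gz -C restore
cat restore/folder/notes.txt
```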

Compression Format Comparison

Extension        tar Flag   Algorithm   Speed     Size
.tar             none       None        -         Largest
.tar.gz / .tgz   z          gzip        Fast      Medium
.tar.bz2         j          bzip2       Slow      Small
.tar.xz          J          xz          Slowest   Smallest
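
To see how the formats compare on your own data, you can compress the same file with each tool and compare byte counts. This is only a sketch: the sample file is synthetic, and bzip2 and xz are skipped if they are not installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a repetitive sample file (20,000 lines of text)
i=0
while [ "$i" -lt 20000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt

# -c writes to stdout, so sample.txt itself is left untouched
gzip -c sample.txt > sample.txt.gz
command -v bzip2 >/dev/null 2>&1 && bzip2 -c sample.txt > sample.txt.bz2
command -v xz >/dev/null 2>&1 && xz -c sample.txt > sample.txt.xz

# Print the byte size of each file for comparison
wc -c sample.txt*
```

Results vary with the input. Data that is already compressed, such as JPEG or MP4 files, barely shrinks with any of these tools.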

gzip / gunzip — Single File Compression

gzip compresses a single file and replaces it with a .gz version. Unlike zip or tar, it doesn’t preserve the original by default.

# Compress (replaces file.txt with file.txt.gz)
gzip file.txt

# Keep the original file
gzip -k file.txt

# Best compression (levels 1-9, default 6)
gzip -9 file.txt

# Decompress
gunzip file.txt.gz
# or
gzip -d file.txt.gz

# View contents without extracting
zcat file.txt.gz        # prints to stdout
zless file.txt.gz       # scrollable viewer

Why gzip replaces the original: This is the Unix philosophy — tools do one thing. gzip compresses data streams. When given a file, it compresses the content and replaces the file with the compressed version. Use -k if you want to keep the original.
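
Because gzip works on streams, it can also read standard input and write standard output, which leaves the source file alone even without -k. A small sketch with made-up file names:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\nbeta\n' > file.txt

# -c sends the compressed data to stdout; file.txt is not removed
gzip -c file.txt > file.txt.gz

# gzip also compresses whatever arrives on a pipe
printf 'from a pipe\n' | gzip > piped.gz

# Decompress to stdout to check the round trip
gzip -dc file.txt.gz
gzip -dc piped.gz
```

The -c form is also handy on older gzip versions that lack the -k option.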

bzip2 / bunzip2

bzip2 typically compresses better than gzip (roughly 20% smaller output) but runs about 5-10x slower; exact figures depend on the data.

# Compress (replaces file.txt with file.txt.bz2)
bzip2 file.txt

# Keep original
bzip2 -k file.txt

# Decompress
bunzip2 file.txt.bz2
# or
bzip2 -d file.txt.bz2

# View contents
bzcat file.txt.bz2

Common Patterns — Real-World Automation

Timestamped Backups

tar czf "backup-$(date +%Y%m%d).tar.gz" /home/you/project
# Creates: backup-20260606.tar.gz

The $(date +%Y%m%d) inserts today’s date into the filename. Run this daily and you get a dated archive every time.

Batch Extract Multiple Archives

for f in *.tar.gz; do
    tar xzf "$f" && echo "Extracted: $f"
done
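
A variation on the loop above extracts each archive into its own directory so the contents do not mix. This is a sketch under a few assumptions: the alpha and beta sample archives are invented, and each target directory is named after the archive with .tar.gz stripped.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create two sample archives to work with
for name in alpha beta; do
    echo "$name" > "$name.txt"
    tar czf "$name.tar.gz" "$name.txt"
    rm "$name.txt"
done

# Extract each archive into a directory named after it
for f in *.tar.gz; do
    [ -e "$f" ] || continue          # skip if the pattern matched nothing
    dest="${f%.tar.gz}"              # alpha.tar.gz -> alpha
    mkdir -p "$dest"
    tar xzf "$f" -C "$dest" && echo "Extracted: $f -> $dest/"
done
```

The [ -e "$f" ] guard matters because an unmatched glob is passed through literally, which would otherwise make tar complain about a file named *.tar.gz.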

Compress Old Log Files

find /var/log -name "*.log" -mtime +7 -exec gzip {} \;

This finds .log files older than 7 days and compresses them with gzip. The -mtime +7 flag means “modified more than 7 days ago.”
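
Before compressing anything for real, it helps to print the matches first. The sketch below uses a temporary directory and a backdated file instead of /var/log, so it is safe to run as a normal user; the file names are illustrative.

```shell
logdir=$(mktemp -d)
touch -t 202001010000 "$logdir/old.log"   # backdated to 1 Jan 2020
touch "$logdir/new.log"                   # modified just now

# Dry run: print what would be compressed
find "$logdir" -name "*.log" -mtime +7 -print

# Real run: "+" passes many files to one gzip call instead of one call per file
find "$logdir" -name "*.log" -mtime +7 -exec gzip {} +

ls "$logdir"
```

Only old.log is compressed; new.log is left alone because it was modified less than 7 days ago.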

Create Split Archives

# Split zip into 100MB chunks for email or upload
zip -r -s 100m large-archive.zip folder/

This creates files like large-archive.zip, large-archive.z01, large-archive.z02, etc.
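
The -s option is specific to zip. For tar archives, a common alternative (not covered by the zip command above) is to pipe the archive through the standard split utility and join the pieces with cat. A minimal sketch; the 4k chunk size and the part- prefix are chosen only to keep the demo small:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir folder
i=0
while [ "$i" -lt 5000 ]; do echo "record $i"; i=$((i + 1)); done > folder/data.txt

# Stream the archive into split; 4k chunks keep the demo small
# (use something like 100m for real uploads)
tar czf - folder/ | split -b 4k - large-archive.tar.gz.part-

ls large-archive.tar.gz.part-*

# Reassemble: the glob expands in sorted order, so cat restores the stream
cat large-archive.tar.gz.part-* | tar tzf -
```

Every part is required to rebuild the archive, so transfer all of them before deleting the source.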

Common Mistakes

1. Forgetting -r with zip for directories

zip archive.zip folder/    # Adds only the empty folder entry, not its contents
zip -r archive.zip folder/ # Correct

2. Confusing tar flag order

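The -f flag takes the next word as the archive name, so it must come last in the flag group. tar -czf archive.tar.gz folder/ is correct. tar -fcz archive.tar.gz folder/ fails because tar reads "cz" as the archive name and finds no operation to perform.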

3. Not using compression with tar

tar cf archive.tar big-folder/     # No compression — huge file
tar czf archive.tar.gz big-folder/ # With gzip compression

4. gzip replaces the original

gzip important.txt          # important.txt is replaced by important.txt.gz
gzip -k important.txt       # Alternative: -k keeps the original alongside the .gz

5. Not verifying archives after creation

tar czf backup.tar.gz data/
# Always check it:
tar tzf backup.tar.gz
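
In a script, the exit status is more useful than the printed listing. The sketch below checks a freshly made archive and then shows the same check failing on a deliberately truncated copy; all file names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
echo "important" > data/file.txt
tar czf backup.tar.gz data/

# gzip -t checks the compressed stream; tar t walks every archive entry
if gzip -t backup.tar.gz && tar tzf backup.tar.gz > /dev/null; then
    echo "backup.tar.gz: OK"
else
    echo "backup.tar.gz: FAILED verification" >&2
fi

# A truncated copy should fail the same check
head -c 20 backup.tar.gz > broken.tar.gz
gzip -t broken.tar.gz 2>/dev/null || echo "broken.tar.gz: corrupt, as expected"
```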

6. Using bzip2 when gzip is fast enough

bzip2 compresses ~20% better but is 5-10x slower. For everyday use, gzip is the right balance. Reserve bzip2 for long-term archival of cold data.

Practice Questions

  1. What is the difference between zip and tar.gz? zip combines archiving and compression in one tool. tar archives (combines files) and gzip compresses the result. .tar.gz is more common on Linux; .zip is cross-platform.

  2. How do you extract a .tar.gz file? tar xzf file.tar.gz.

  3. Which compression gives the smallest size? xz (tar cJf) gives the best compression but is the slowest. bzip2 sits in the middle. gzip is the fastest with decent compression.

  4. How do you compress a directory while keeping the original? tar czf archive.tar.gz folder/ or zip -r archive.zip folder/; both leave the original directory in place. For a single file, gzip -k file.txt keeps the original.

  5. How do you view archive contents without extracting? unzip -l archive.zip for zip, tar tzf archive.tar.gz for tar.

Challenge: Write a script that creates a timestamped backup of a directory, excludes .log files, compresses with maximum gzip compression, verifies the archive, and deletes backups older than 30 days.
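
One possible shape for a solution is sketched below. It is not the only answer: the function name backup_dir_snapshot, the backup- filename prefix, and the demo paths are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of one possible solution; names and paths are placeholders.
backup_dir_snapshot() {
    local src="$1"            # directory to back up
    local dest="$2"           # where backups are stored
    local stamp archive
    stamp=$(date +%Y%m%d_%H%M%S)
    archive="$dest/backup-$stamp.tar.gz"

    mkdir -p "$dest" || return 1

    # Exclude *.log and stream through gzip -9 for maximum compression
    tar -cf - --exclude='*.log' -C "$(dirname "$src")" "$(basename "$src")" \
        | gzip -9 > "$archive"

    # Verify before trusting the archive; remove it if the check fails
    if ! tar -tzf "$archive" > /dev/null; then
        echo "Verification failed: $archive" >&2
        rm -f "$archive"
        return 1
    fi

    # Delete backups older than 30 days
    find "$dest" -name 'backup-*.tar.gz' -mtime +30 -exec rm -f {} +
    echo "$archive"
}

# Demo on throwaway data
demo=$(mktemp -d)
mkdir -p "$demo/project"
echo "code" > "$demo/project/main.sh"
echo "noise" > "$demo/project/debug.log"
created=$(backup_dir_snapshot "$demo/project" "$demo/backups")
tar -tzf "$created"
```

Piping tar into gzip -9 sets the compression level without relying on tar-specific options. In Bash, adding set -o pipefail makes a tar failure in that pipeline visible to the script as well.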

FAQ

What is the difference between zip and tar.gz?
zip is both a compressor and archiver combined into one program. tar only archives (combines files), and gzip compresses the single archive file. .tar.gz is standard on Linux; .zip is universal across all platforms.
How do I extract a .tar.gz file?
tar xzf file.tar.gz. The flags: x = extract, z = gzip decompress, f = filename.
Which compression should I use?
Use gzip for daily work (fast, decent ratio). Use bzip2 or xz for long-term archiving where you want maximum space savings and speed doesn’t matter.
How do I compress a directory excluding certain files?
zip -r archive.zip folder/ -x "*.log" or tar czf archive.tar.gz --exclude='*.log' folder/.
How do I protect an archive with a password?
zip -e secure.zip file.txt prompts for a password. Note: Zip encryption is weak — use GPG for serious security.
How do I compress in a script without overwriting existing backups?
Include a timestamp: tar czf "backup-$(date +%Y%m%d_%H%M%S).tar.gz" folder/.

Try It Yourself

Open your terminal and experiment in /tmp:

# Create practice files
mkdir -p /tmp/compress-practice
cd /tmp/compress-practice
echo "This is a test file" > doc1.txt
echo "Another document" > doc2.txt

# Try different compression methods
zip arch.zip doc1.txt doc2.txt
tar czf arch.tar.gz doc1.txt doc2.txt
gzip -k doc1.txt

# Compare sizes
ls -lh

# Extract what you created
rm doc1.txt doc2.txt
unzip arch.zip
tar xzf arch.tar.gz

# Clean up
cd /tmp && rm -rf /tmp/compress-practice

What’s Next

Tutorial                 What You’ll Learn
Bash Reference           Quick reference for all Bash commands, syntax, and operators
Linux File Management    Advanced file system management, partitions, LVM
Python File Handling     Programmatic file compression and archive management with Python

