sort and uniq Commands in Linux — Sort & Deduplicate Data
The sort and uniq commands organize and deduplicate text data in Linux — from sorting log files to counting unique visitors in access logs. Combined in pipelines, they form one of the most powerful data-processing toolchains in the shell.
What You’ll Learn
You’ll master sorting by columns, numeric values, and months; extracting unique lines; counting duplicate occurrences; and combining sort with uniq to summarize structured data like logs and CSVs.
Why sort and uniq Matter
Unstructured data is noise — sorted data is signal. System administrators use sort+uniq to find the most frequent error codes, count unique IP addresses, and identify duplicate entries in configuration files. DodaZIP uses sort internally to organize archive manifests, and Durga Antivirus Pro ranks threat signatures by frequency using these same commands.
Learning Path
Essential Commands → Text Processing Tools → sort & uniq (you are here) → cut & tr, awk & sed
These commands work on any Linux distribution.
Syntax Overview
sort [options] [file...]
uniq [options] [input_file [output_file]]
sort Options Table
| Option | Description |
|---|---|
| -n | Numeric sort (89 before 123) |
| -r | Reverse (descending) order |
| -k | Sort by column / field |
| -t | Field separator (default: whitespace) |
| -u | Unique lines (same as sort \| uniq) |
| -M | Sort by month (Jan, Feb, …) |
| -h | Human-readable numbers (2K, 3G) |
| -s | Stable sort (preserve original order for ties) |
| -f | Case-insensitive sort |
uniq Options Table
| Option | Description |
|---|---|
| -c | Prefix lines by count of occurrences |
| -d | Only print duplicate lines |
| -u | Only print unique (non-duplicate) lines |
| -i | Case-insensitive comparison |
| -w N | Compare at most N characters per line |
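The -i and -w options do not appear in the examples that follow, so here is a minimal sketch of both. The hostnames are made-up sample data, and -w is assumed to be available (it is a GNU extension, not part of POSIX uniq).

```shell
# Hypothetical sample input: hostnames with inconsistent case.
input=$(printf '%s\n' 'web01' 'WEB01' 'web02' 'db01')

# -i ignores case when comparing adjacent lines; the first line of each group is kept.
ci_result=$(printf '%s\n' "$input" | uniq -i)

# -w 3 compares only the first 3 characters of each line, so web01 and web02
# are counted as one group. Assumes GNU uniq.
prefix_counts=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq -c -w 3)

printf '%s\n' "$ci_result"
printf '%s\n' "$prefix_counts"
```

LC_ALL=C is set on sort only to make the ordering predictable across systems; in that locale uppercase letters sort before lowercase ones.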
Examples
Example 1: Basic sort
$ cat fruits.txt
banana
apple
cherry
date
$ sort fruits.txt
apple
banana
cherry
date
Lines are sorted alphabetically (lexicographically) by default. The exact order depends on the current locale; setting LC_ALL=C gives plain byte order.
Example 2: Numeric Sort (-n)
$ cat numbers.txt
10
2
33
1
$ sort numbers.txt
1
10
2
33
$ sort -n numbers.txt
1
2
10
33
Without -n, 10 comes before 2 because sort compares character by character, and the character 1 sorts before 2.
Example 3: Reverse Sort (-r)
$ sort -r numbers.txt
33
2
10
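1
-r reverses any sort order, which is useful for top-N lists. The output above is reversed lexical order; use sort -rn for descending numeric order (33, 10, 2, 1).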
Example 4: Sort by Column (-k)
$ cat employees.txt
Alice 45000
Bob 38000
Carol 52000
$ sort -k2 -n employees.txt
Bob 38000
Alice 45000
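Carol 52000
Sorts by column 2 (salary) numerically. A bare -k2 key runs from field 2 to the end of the line; writing -k2,2n restricts the key to field 2 alone. The -t flag changes the field separator; for CSVs use -t','.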
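As a minimal sketch of the -t option mentioned above, the following sorts comma-separated data by its second field. The CSV rows are hypothetical and reuse the employee names from this tutorial.

```shell
# Hypothetical CSV: name,salary,department (no header row).
csv=$(printf '%s\n' \
  'Alice,45000,Marketing' \
  'Bob,38000,Engineering' \
  'Carol,52000,Marketing')

# -t',' sets the field separator; -k2,2n limits the key to field 2
# and compares it numerically.
by_salary=$(printf '%s\n' "$csv" | LC_ALL=C sort -t',' -k2,2n)

printf '%s\n' "$by_salary"
```

This simple approach assumes no field contains a quoted comma; files with quoted fields need a real CSV parser.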
Example 5: Unique Lines with sort -u
$ cat duplicates.txt
alpha
beta
alpha
gamma
beta
$ sort -u duplicates.txt
alpha
beta
gamma
sort -u sorts and removes duplicates in one pass.
Example 6: Sort by Month (-M)
$ cat months.txt
Mar
Jan
Dec
Apr
Nov
$ sort -M months.txt
Jan
Mar
Apr
Nov
Dec
-M understands three-letter month abbreviations.
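Month sorting is often combined with a second key. The sketch below orders hypothetical "month day" pairs chronologically by sorting field 1 as a month and field 2 as a number. It assumes a sort that supports -M, such as GNU sort, and English month names (forced here with LC_ALL=C).

```shell
# Hypothetical sample input: month abbreviation followed by day of month.
dates=$(printf '%s\n' 'Mar 12' 'Jan 25' 'Mar 3' 'Jan 7')

# -k1,1M sorts field 1 as a month name; -k2,2n breaks ties by day, numerically.
chronological=$(printf '%s\n' "$dates" | LC_ALL=C sort -k1,1M -k2,2n)

printf '%s\n' "$chronological"
```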
Example 7: Count Duplicates with uniq -c
$ cat access.log
192.168.1.1
192.168.1.2
192.168.1.1
192.168.1.3
192.168.1.1
$ sort access.log | uniq -c
3 192.168.1.1
1 192.168.1.2
1 192.168.1.3
uniq only compares adjacent lines, so pipe through sort first whenever duplicates may be scattered through the input.
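To see why the sort step matters, this sketch runs uniq on the same five addresses with and without sorting and compares the line counts. The variable names are illustrative only.

```shell
# The five addresses from access.log above, in their original order.
ips=$(printf '%s\n' '192.168.1.1' '192.168.1.2' '192.168.1.1' '192.168.1.3' '192.168.1.1')

# Without sort, no two identical lines are adjacent, so uniq removes nothing.
unsorted_lines=$(printf '%s\n' "$ips" | uniq | wc -l | tr -d ' ')

# After sort, identical lines are adjacent and collapse to one each.
sorted_lines=$(printf '%s\n' "$ips" | LC_ALL=C sort | uniq | wc -l | tr -d ' ')

echo "unsorted: $unsorted_lines lines, sorted: $sorted_lines lines"
```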
Example 8: Only Duplicates (uniq -d)
$ sort access.log | uniq -d
192.168.1.1
Only lines that appear more than once are shown.
Example 9: Pipe sort | uniq — Top Error Codes
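$ grep "ERROR" /var/log/syslog | sed 's/.*ERROR: //' | sort | uniq -c | sort -rn
127 Connection refused
83 Timeout occurred
12 Disk full
This pipeline strips everything up to and including "ERROR: " from each matching line (assuming messages follow that prefix), counts each distinct message, and sorts by frequency in descending order. Printing only the last field with awk '{print $NF}' would keep just the final word of a multi-word message such as "Connection refused".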
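The same counting pattern can be tried without touching a real log. This is a sketch on hypothetical log lines; it assumes each message follows an "ERROR: " prefix and uses sed to keep the whole message rather than a single field.

```shell
# Hypothetical syslog-style lines; the real format on your system may differ.
log=$(printf '%s\n' \
  'Jan 10 10:00:01 host app[1]: ERROR: Connection refused' \
  'Jan 10 10:00:02 host app[1]: INFO: Started' \
  'Jan 10 10:00:03 host app[1]: ERROR: Timeout occurred' \
  'Jan 10 10:00:04 host app[1]: ERROR: Connection refused')

# Keep ERROR lines, strip everything through "ERROR: ", then count and rank.
top_errors=$(printf '%s\n' "$log" \
  | grep 'ERROR' \
  | sed 's/.*ERROR: //' \
  | LC_ALL=C sort \
  | uniq -c \
  | LC_ALL=C sort -rn)

printf '%s\n' "$top_errors"
```

The final sort -rn works because uniq -c puts the count at the start of each line, so a numeric sort on the whole line ranks by frequency.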
Example 10: Stable Sort (-s)
$ cat data.csv
Alice 45000 Marketing
Bob 38000 Engineering
Carol 52000 Marketing
David 41000 Engineering
$ sort -s -k3 data.csv
Bob 38000 Engineering
David 41000 Engineering
Alice 45000 Marketing
Carol 52000 Marketing
Stable sort preserves the original input order for lines with the same key. Here, rows within each department stay in their input order (Bob before David, Alice before Carol).
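The effect of -s is easier to see when the input is not already alphabetical. In this sketch (hypothetical two-column data), the stable sort keeps Dave ahead of Bob because that is the input order, while the default sort falls back to comparing whole lines to break the tie.

```shell
# Hypothetical input: name and department, deliberately not in alphabetical order.
staff=$(printf '%s\n' \
  'Dave Engineering' \
  'Alice Marketing' \
  'Bob Engineering' \
  'Carol Marketing')

# Stable: ties on field 2 keep their input order (Dave before Bob).
stable=$(printf '%s\n' "$staff" | LC_ALL=C sort -s -k2,2)

# Default: ties on field 2 are broken by comparing the entire line (Bob before Dave).
unstable=$(printf '%s\n' "$staff" | LC_ALL=C sort -k2,2)

printf 'stable:\n%s\n\ndefault:\n%s\n' "$stable" "$unstable"
```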
Common Use Cases
| Use Case | Command |
|---|---|
| Sort files by size | ls -l \| sort -k5 -n |
| Find top 5 IP addresses | sort access.log \| uniq -c \| sort -rn \| head -5 |
| Remove duplicate lines | sort -u file.txt > cleaned.txt |
| Sort by human-readable size | du -sh * \| sort -h |
| Count unique logins | last \| awk '{print $1}' \| sort \| uniq -c |
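For the human-readable size case, the following sketch contrasts -h with -n on made-up du-style output. The file names and sizes are hypothetical, and -h is assumed to be available (it is provided by GNU sort and recent BSD sort).

```shell
# Hypothetical du-style output: size, a tab, then a file name.
sizes=$(printf '%s\t%s\n' '2K' 'notes.txt' '3G' 'backup.tar' '150M' 'video.mp4' '512' 'tiny.cfg')

# -h understands the K/M/G suffixes and orders by actual magnitude.
human_sorted=$(printf '%s\n' "$sizes" | LC_ALL=C sort -h)

# -n reads only the leading digits, so 3G (3) lands before 150M (150).
numeric_sorted=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)

printf 'sort -h:\n%s\n\nsort -n:\n%s\n' "$human_sorted" "$numeric_sorted"
```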
Common Errors
- uniq without sort: uniq only compares adjacent lines, so if the data isn't sorted, scattered duplicates won't be detected.
- sort -n on non-numeric data: sort -n reads only the leading numeric part of a field; a field that does not start with a number is treated as 0.
- Column sorting with wrong delimiter: Use -t to set the field separator (e.g., -t',' for CSV).
- Case sensitivity: By default both sort and uniq are case-sensitive. Add -f (sort) or -i (uniq) to ignore case.
- Large file performance: Sorting files larger than RAM can be slow because sort spills to temporary files; with GNU sort, use -S to specify the buffer size.
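The numeric-sort pitfall from the list above can be reproduced with a few mixed values. This is a small sketch on hypothetical input, not output from a real data set.

```shell
# Hypothetical input mixing numbers with non-numeric text.
mixed=$(printf '%s\n' '10' 'abc' '2' '5x')

# sort -n uses the leading number of each line: abc counts as 0, 5x counts as 5.
numeric=$(printf '%s\n' "$mixed" | LC_ALL=C sort -n)

printf '%s\n' "$numeric"
```

If non-numeric lines show up at the top of a numeric sort, that usually points to a header row or a wrong -k or -t setting.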
Practice Exercises
- Basic sort: Create a file with 10 random words and sort them alphabetically.
- Numeric + reverse: Sort a list of numbers from highest to lowest.
- Column sort: Sort a CSV by column 3 (numeric).
- Dedup + count: Count how many times each word appears in a text file.
- Log analysis: Find the top 10 most frequent IP addresses in access.log.
Challenge
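Write a one-liner that reads /var/log/auth.log, extracts failed login usernames, sorts them, counts occurrences, and displays only usernames that failed more than 5 times, sorted by most failures first. Durga Antivirus Pro uses similar patterns to detect brute-force attacks.
grep "Failed password" /var/log/auth.log | awk '{print $9}' | sort | uniq -c | awk '$1 > 5' | sort -rn
This assumes the traditional syslog line layout, where the username is field 9. For lines of the form "Failed password for invalid user NAME", field 9 is the word invalid and the name is field 11.
Real-World Task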
Analyze a web server access log to find:
- The busiest hour of the day
- The most requested URL
- The top 5 referrer domains
Use sort, uniq, cut, and head in a pipeline.
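One possible starting point for the first two items is sketched below. It assumes the log uses the Common Log Format with IPv4 client addresses; the three log lines are invented sample data, and the referrer question is left open because that field only exists in the combined format.

```shell
# Hypothetical access log lines in Common Log Format.
access=$(printf '%s\n' \
  '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326' \
  '10.0.0.2 - - [10/Oct/2023:13:58:01 +0000] "GET /about.html HTTP/1.1" 200 1024' \
  '10.0.0.1 - - [10/Oct/2023:14:02:11 +0000] "GET /index.html HTTP/1.1" 200 2326')

# Most requested URL: the request path is the 7th space-separated field.
top_url=$(printf '%s\n' "$access" | cut -d' ' -f7 \
  | LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn | head -n 1)

# Busiest hour: the hour is the 2nd colon-separated field of the timestamp.
busiest_hour=$(printf '%s\n' "$access" | cut -d: -f2 \
  | LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn | head -n 1)

printf 'top URL: %s\nbusiest hour: %s\n' "$top_url" "$busiest_hour"
```

Each result line starts with the count produced by uniq -c, followed by the URL or hour.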
What is the sort command?
The sort command arranges lines of text files in a specified order — alphabetical, numeric, by month, or by any column — and outputs the result to stdout.
What is the uniq command?
The uniq command filters adjacent duplicate lines from sorted input, optionally counting them or showing only duplicates.
Related Tutorials
- Essential Linux Commands — text processing tools overview
- cut and tr — extract and transform text jointly with sort+uniq
- Bash Scripting Guide — automate sort+uniq in scripts
- Linux Administration Basics — foundational admin skills
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro