Perl Programming Language Guide — Text Processing and System Administration
Perl is a highly expressive scripting language designed for text processing and system administration. It combines the power of shell scripting with C-like control structures and one of the most capable built-in regex engines of any mainstream language.
What You’ll Learn
- Perl syntax and the concept of context (scalar vs list)
- Built-in regular expressions and string processing
- One-liners and command-line usage
- CPAN — the Comprehensive Perl Archive Network
- File processing and CGI scripting
Why It Matters
Perl is often called the Swiss Army chainsaw of text processing: few languages handle pattern matching and report generation as concisely. Durga Antivirus Pro uses Perl for log analysis and threat pattern extraction where complex regex matching is essential. The Comprehensive Perl Archive Network (CPAN) is one of the largest library repositories ever created, with over 200,000 modules. Despite newer languages gaining mindshare, Perl remains widely used in bioinformatics, legacy enterprise systems, network administration, and one-liner data munging.
Learning Path
flowchart LR
A[Perl Basics & Context<br/>You are here] --> B[Regular Expressions]
B --> C[File Processing & One-Liners]
C --> D[CPAN & Modules]
D --> E[Real-World Scripting]
Your First Perl Program
#!/usr/bin/perl
use strict;
use warnings;
print "Hello, Perl!\n";
my $name = "World";
print "Hello, $name!\n";
# Perl interpolates variables in strings
my $count = 42;
print "The answer is $count\n";
perl hello.pl
# Hello, Perl!
# Hello, World!
# The answer is 42
Scalar vs List Context
Perl’s most distinctive feature: the same operation behaves differently depending on whether you expect a single value (scalar context) or multiple values (list context).
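Context can also be observed from inside your own subroutines. The sketch below uses the built-in wantarray to report how a sub was called; the helper name context_of is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context it was called in.
# wantarray is true in list context, false-but-defined in scalar
# context, and undef in void context.
sub context_of {
    return wantarray ? "list" : defined(wantarray) ? "scalar" : "void";
}

my @in_list   = context_of();   # array assignment => list context
my $in_scalar = context_of();   # scalar assignment => scalar context
my ($first)   = context_of();   # parentheses make this a list assignment

my @values = (7, 8, 9);
my $forced = scalar(@values);   # scalar() forces scalar context: element count

print "Array assignment:   $in_list[0]\n";
print "Scalar assignment:  $in_scalar\n";
print "Parenthesized my:   $first\n";
print "Count via scalar(): $forced\n";
```

Note the third call: `my ($first) = ...` looks like a scalar assignment but is a list assignment, which is a common source of surprises.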
use strict;
use warnings;
# Scalar context — returns a single value
my $now = localtime(); # scalar context: returns a date string
print "Scalar: $now\n"; # e.g., "Sun Jun  7 12:00:00 2026"
# List context: returns multiple values
my @time = localtime(); # list context: returns a 9-element list
print "List: @time\n"; # e.g., "0 0 12 7 5 126 0 157 0" (yday counts from 0)
# File read in different contexts
open my $fh, '<', 'data.txt' or die $!;
my $line = <$fh>; # scalar context: reads ONE line
my @lines = <$fh>; # list context: reads ALL remaining lines
close $fh;
# Array in scalar context
my @arr = (10, 20, 30, 40, 50);
my $size = @arr; # scalar context: array length = 5
print "Size: $size\n"; # 5
my $last = pop @arr; # pop returns scalar
print "Last: $last\n"; # 50
# Comma operator differs by context
my $s = (1, 2, 3); # scalar: comma operator returns the last value, 3 (expect a "Useless use of a constant" warning)
my @l = (1, 2, 3); # list: returns all values = (1, 2, 3)
print "Scalar comma: $s\n"; # 3
Scalar: Sun Jun  7 12:00:00 2026
List: 0 0 12 7 5 126 0 157 0
Size: 5
Last: 50
Scalar comma: 3
Regular Expressions
Regular expressions are part of Perl's core syntax rather than a library, and its engine remains one of the most capable among mainstream languages.
use strict;
use warnings;
my $text = "The quick brown fox jumps over the lazy dog.";
# Simple match: m//
if ($text =~ /quick/) {
print "Found 'quick'\n"; # Found 'quick'
}
# Capture groups: ()
if ($text =~ /(brown|red) (fox|dog)/) {
print "Match: $1 $2\n"; # Match: brown fox
}
# Substitution: s///
my $modified = $text;
$modified =~ s/dog/cat/;
print "Substitution: $modified\n"; # The quick brown fox jumps over the lazy cat.
# Global match: /g
my @words = $text =~ /(\w+)/g;
print "Words: @words\n"; # The quick brown fox jumps over the lazy dog
# Character classes and quantifiers (checks the dotted-quad shape only, not that each octet is 0-255)
my $ip = "192.168.1.1";
if ($ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/) {
print "Valid IP: $1.$2.$3.$4\n";
}
# Non-greedy match
my $html = "<b>Bold</b> and <b>strong</b>";
my ($greedy) = $html =~ /<b>(.*)<\/b>/; # greedy: runs to the last </b>
my ($nongreedy) = $html =~ /<b>(.*?)<\/b>/; # non-greedy: stops at the first </b>
print "Greedy: '$greedy'\n"; # 'Bold</b> and <b>strong'
print "Non-greedy: '$nongreedy'\n"; # 'Bold'
Found 'quick'
Match: brown fox
Substitution: The quick brown fox jumps over the lazy cat.
Words: The quick brown fox jumps over the lazy dog
Valid IP: 192.168.1.1
Greedy: 'Bold</b> and <b>strong'
Non-greedy: 'Bold'
Perl One-Liners
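Perl’s -e flag executes code given directly on the command line. Combined with -n (loop over input lines), -p (loop and print each line), and -i (edit files in place), it is ideal for quick text processing.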
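To make the one-liners easier to read, it helps to see roughly what the -n and -p switches expand to. The sketch below writes those loops out by hand; the sample log text is made-up data held in an in-memory filehandle so the script runs without a real file.

```perl
use strict;
use warnings;

# Assumed sample input, kept in memory so no log file is needed.
my $log = "ok: started\nerror: disk full\nok: retry\nerror: timeout\n";

# Roughly what `perl -ne 'print if /error/'` does:
# loop over the input and run the code once per line, with the line in $_.
open my $in, '<', \$log or die "Cannot open in-memory file: $!";
my @matched;
while (<$in>) {
    push @matched, $_ if /error/;
}
close $in;
print @matched;

# Roughly what `perl -pe 's/error/ERROR/'` does:
# the same loop, but $_ is printed after every iteration.
open $in, '<', \$log or die "Cannot open in-memory file: $!";
my $rewritten = '';
while (<$in>) {
    s/error/ERROR/;
    $rewritten .= $_;
}
close $in;
print $rewritten;
```

The real switches read from the files named on the command line (or standard input) instead of a string, but the loop structure is the same.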
# Find all lines containing "error" in a log file
perl -ne 'print if /error/' app.log
# Replace "foo" with "bar" in-place
perl -i -pe 's/foo/bar/g' data.txt
# Print first 10 lines of a file
perl -pe 'exit if $. > 10' data.txt
# Count lines, words, characters (like wc); -n is used so the input itself is not printed
perl -ne '$c += length; $w += split; END { print "$. lines, $w words, $c chars\n" }' data.txt
# Extract email addresses from a file
perl -ne 'print "$1\n" while /([\w\.\-]+@[\w\.\-]+\.\w+)/g' contacts.txt
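# Sum the second column of numbers (-a splits each line on whitespace into @F)
perl -lane '$sum += $F[1]; END { print "Sum: $sum" }' data.tsv
# Convert CSV to tab-separated (naive: breaks on quoted fields that contain commas)
perl -pe 's/,/\t/g' data.csv > data.tsv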
# Print unique lines, keeping first-occurrence order (unlike sort -u, the output is not sorted)
perl -ne 'print unless $seen{$_}++' data.txt
# Example: extract IPs from Apache log
perl -ne 'print "$1\n" while /(\d+\.\d+\.\d+\.\d+)/g' access.log
# 192.168.1.1
# 10.0.0.5
# 203.0.113.42
File Processing
use strict;
use warnings;
# Read file line by line
print "--- Reading file ---\n";
open my $fh, '<', 'data.txt' or die "Cannot open: $!";
while (my $line = <$fh>) {
chomp $line; # remove trailing newline
print "Line $.: $line\n"; # $. = current line number
}
close $fh;
# Write to a file
print "--- Writing file ---\n";
open my $out, '>', 'output.txt' or die "Cannot write: $!";
print $out "Line 1\n";
print $out "Line 2\n";
print $out "Line 3\n";
close $out;
# Read entire file into scalar
open my $fh2, '<', 'data.txt' or die "Cannot open: $!";
my $content = do { local $/; <$fh2> }; # undef $/ enables slurp mode, limited to this block
close $fh2;
print "Content length: " . length($content) . "\n";
# Process CSV (no module)
print "--- CSV Processing ---\n";
my $csv = "name,age,city\nAlice,30,New York\nBob,25,London\n";
my @lines = split /\n/, $csv;
my @headers = split /,/, shift @lines;
for my $line (@lines) {
my @fields = split /,/, $line;
for my $i (0..$#headers) {
print "$headers[$i]: $fields[$i]\n";
}
print "---\n";
}
CPAN: The Comprehensive Perl Archive Network
# Install modules (the cpan client takes module names directly)
cpan JSON::XS
cpan DBI
cpan LWP::Simple
cpan Moose
# Using cpanm (more modern)
cpanm Mojolicious
cpanm Dancer2
cpanm Text::CSV_XS
use strict;
use warnings;
# CPAN module example: JSON
use JSON::XS;
my $json = encode_json({name => "Alice", age => 30, skills => ["Perl", "Python"]});
print "$json\n";
# {"age":30,"name":"Alice","skills":["Perl","Python"]} (key order may vary; JSON::XS->new->canonical sorts keys)
my $decoded = decode_json($json);
print "Name: $decoded->{name}\n";
# Name: Alice
# CPAN module: LWP (web download; https URLs also need LWP::Protocol::https)
use LWP::Simple;
my $content = get("https://example.com");
die "Download failed\n" unless defined $content; # get returns undef on failure
print "Downloaded " . length($content) . " bytes\n";
# CPAN module: DBI (database; the SQLite driver is DBD::SQLite)
use DBI;
my $dbh = DBI->connect("dbi:SQLite:dbname=test.db", "", "", { RaiseError => 1 }) or die $DBI::errstr;
$dbh->do("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)");
$dbh->do("INSERT INTO users (name) VALUES (?)", undef, "Alice"); # placeholder instead of string interpolation
my $users = $dbh->selectall_arrayref("SELECT * FROM users");
for my $u (@$users) {
print "User: $u->[1]\n";
}
$dbh->disconnect();
Common Mistakes
1. Forgetting use strict; use warnings;
Without strict, typos create silent global variables. Without warnings, subtle bugs go unnoticed. Always start Perl scripts with these.
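As a rough illustration of what strict catches, the sketch below compiles a snippet containing a deliberate typo ($totl instead of $total). The snippet is compiled at run time through a string eval only so that the script can print the compile error instead of failing to start; in a normal script the error would simply stop compilation.

```perl
use strict;
use warnings;

# Code under test, held as a string. $totl is a deliberate typo.
my $code = q{
    use strict;
    my $total = 10;
    $totl = $total + 1;   # typo: should be $total
    1;
};

# String eval compiles the snippet now; a strict violation makes it
# return undef and leaves the message in $@.
my $ok    = eval $code;
my $error = $@;

print $ok ? "Compiled cleanly\n" : "strict caught it: $error";
```

Without strict, the misspelled name would quietly become a new package variable and $total would keep its old value.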
2. Confusing scalar @array with list @array
my $size = @array; # size (number of elements)
my @copy = @array; # list copy
my $elem = $array[0]; # single element (notice $ not @)
3. Not checking open return values
open my $fh, '<', $file; # silent failure!
open my $fh, '<', $file or die "Cannot open $file: $!";
Always check open with or die.
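A related trap is writing || instead of or. The sketch below shows both forms against a path that is assumed not to exist on the machine running it (the path is hypothetical); eval is used only to capture the die message for display.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist.
my $file = '/nonexistent/dir/data.txt';

# Low-precedence `or` applies to the result of the whole open call.
my $ok = eval {
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    close $fh;
    1;
};
my $message = $@;
print "Caught: $message" unless $ok;

# Pitfall: `||` binds tighter than the comma, so it tests $file
# (a non-empty, true string) rather than the result of open.
# The die can never fire, and the failure goes unreported.
my $opened = open(my $fh2, '<', $file || die "never reached\n");
print "open failed silently with ||\n" unless $opened;
```

Parenthesizing the call, as in `open(my $fh, '<', $file) || die ...`, also works, but `or die` is the more common idiom.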
4. Forgetting chomp after reading lines
<FH> includes the newline. chomp removes it. Without chomp, concatenated strings have unwanted newlines.
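A short sketch of the difference between chomp and chop, using made-up strings rather than file input:

```perl
use strict;
use warnings;

my $line    = "alpha\n";
my $removed = chomp $line;   # chomp returns the number of characters removed
print "After chomp: '$line' (removed $removed)\n";

my $again = chomp $line;     # no trailing newline left, so nothing changes
print "Second chomp: '$line' (removed $again)\n";

my $word = "alpha";
chop $word;                  # chop always removes the last character
print "After chop: '$word'\n";

# Joining unchomped lines keeps the newlines embedded in the result.
my @raw        = ("a\n", "b\n");
my $joined_raw = join ",", @raw;
chomp(my @clean = @raw);     # chomp accepts a list and cleans every element
my $joined_clean = join ",", @clean;
print "Clean join: $joined_clean\n";
```

Because chomp is safe to call twice, it is the usual choice for input lines; chop is rarely what you want.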
5. Using == instead of eq for string comparison
"hello" == "world" # numeric comparison: both become 0, so TRUE (plus an "isn't numeric" warning under use warnings)
"hello" eq "world" # string comparison: FALSE (correct)
6. Modifying $_ in loops without realizing it
Many Perl constructs implicitly use $_. A s/// inside a while (<FH>) modifies $_, which is the current line. Be explicit: while (my $line = <FH>).
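The same implicit $_ behavior appears in foreach loops, where $_ is an alias to each element rather than a copy. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my @names = ("  alice", "  bob");

# $_ aliases each element, so s/// edits @names in place.
for (@names) {
    s/^\s+//;
}
print "In place: @names\n";

# Copying into a lexical first leaves the source data untouched.
my @raw = ("  carol", "  dave");
my @trimmed;
for my $name (@raw) {
    (my $copy = $name) =~ s/^\s+//;   # copy, then modify only the copy
    push @trimmed, $copy;
}
print "First raw element: '$raw[0]'\n";
print "Trimmed: @trimmed\n";
```

Note that a named loop variable (`for my $name (@raw)`) is still an alias; it is the explicit copy that protects the original array.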
7. Not escaping special characters in regex
. matches any character. * is a quantifier. Match them literally with \. and \*.
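When the text to match comes from a variable, escaping by hand is impractical. One option is \Q...\E inside the pattern, or the quotemeta function, as in this sketch (the version string is just sample data):

```perl
use strict;
use warnings;

my $version = "1.5";

# Unescaped: "." matches any character, so "1x5" also matches.
my $loose = "1x5" =~ /^$version$/ ? 1 : 0;

# \Q...\E escapes regex metacharacters in the interpolated value.
my $strict = "1x5" =~ /^\Q$version\E$/ ? 1 : 0;
my $exact  = "1.5" =~ /^\Q$version\E$/ ? 1 : 0;

print "Unescaped matches 1x5: $loose\n";
print "Escaped matches 1x5:   $strict\n";
print "Escaped matches 1.5:   $exact\n";

# quotemeta returns the escaped string itself.
my $escaped = quotemeta("a.b*c");
print "quotemeta gives: $escaped\n";
```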
Practice Questions
1. What is “context” in Perl? Operations behave differently depending on whether the result is assigned to a scalar ($x = ...) or a list (@x = ...). localtime() returns a string in scalar context and a list in list context.
2. What is the difference between == and eq? == compares numerically (both operands are converted to numbers). eq compares as strings. "42" == "42.0" is true; "42" eq "42.0" is false.
3. What does chomp do? It removes the trailing record separator ($/, a newline by default) from a string. Unlike chop (which removes the last character unconditionally), chomp removes only the record separator.
4. What is CPAN? The Comprehensive Perl Archive Network, a repository of over 200,000 Perl modules. Install modules with cpan or cpanm. Think of it as Perl’s equivalent of Python’s PyPI.
5. What is the difference between my, our, and local? my declares a lexically scoped variable. our creates a lexically scoped alias to a package variable. local temporarily assigns a new value to a global variable and restores the original at scope exit.
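The difference between my, our, and local is easiest to see when one subroutine calls another. In this sketch the sub names show, with_local, and with_my are hypothetical and exist only for the demonstration.

```perl
use strict;
use warnings;

our $level = "global";            # package variable $main::level

sub show { return $level }        # reads the package variable at call time

sub with_local {
    local $level = "localized";   # temporary value, visible to called subs
    return show();
}

sub with_my {
    my $level = "lexical";        # separate lexical; show() cannot see it
    return show();
}

my $from_local = with_local();
my $from_my    = with_my();
my $after      = show();          # local's change was undone at scope exit

print "local: $from_local\n";
print "my:    $from_my\n";
print "after: $after\n";
```

local changes what called code sees for the duration of the scope, while my creates a variable that only the enclosing block can see.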
Challenge: Write a Perl script that reads an Apache access log, extracts all unique IP addresses, counts requests per IP, identifies the top 10 IPs by request count, and prints a formatted report sorted by frequency.
Mini Project — Log Analyzer
#!/usr/bin/perl
use strict;
use warnings;
# Apache log line parser
# LogFormat: "%h %l %u %t \"%r\" %>s %b" (Common Log Format)
my %ip_count;
my %status_count;
my %path_count;
my $total = 0;
while (my $line = <>) {
chomp $line;
$total++;
# Parse Apache common log format
if ($line =~ /^(\S+)\s+\S+\s+\S+\s+\[([^\]]+)\]\s+"([^"]+)"\s+(\d+)\s+(\S+)/) {
my ($ip, $date, $request, $status, $bytes) = ($1, $2, $3, $4, $5);
my ($method, $path, $protocol) = split / /, $request;
$ip_count{$ip}++;
$status_count{$status}++;
$path_count{$path}++;
}
}
print "\n" . "=" x 50 . "\n";
print "LOG ANALYSIS REPORT\n";
print "=" x 50 . "\n";
print "Total requests: $total\n\n";
# Top 10 IPs
print "--- Top 10 IPs ---\n";
my @top_ips = sort { $ip_count{$b} <=> $ip_count{$a} } keys %ip_count;
for my $i (0..9) {
last unless defined $top_ips[$i];
printf "%3d. %-15s %5d requests\n", $i+1, $top_ips[$i], $ip_count{$top_ips[$i]};
}
# Status code breakdown
print "\n--- Status Codes ---\n";
for my $status (sort keys %status_count) {
my $pct = sprintf "%.1f", $status_count{$status} / $total * 100;
printf " %s: %5d (%4s%%)\n", $status, $status_count{$status}, $pct;
}
# Top 10 paths
print "\n--- Top 10 Paths ---\n";
my @top_paths = sort { $path_count{$b} <=> $path_count{$a} } keys %path_count;
for my $i (0..9) {
last unless defined $top_paths[$i];
printf "%3d. %-30s %5d hits\n", $i+1, $top_paths[$i], $path_count{$top_paths[$i]};
}
print "\n" . "=" x 50 . "\n";
perl log_analyzer.pl access.log
# ==================================================
# LOG ANALYSIS REPORT
# ==================================================
# Total requests: 15230
#
# --- Top 10 IPs ---
# 1. 192.168.1.100 4521 requests
# 2. 10.0.0.25 2890 requests
# 3. 203.0.113.42 1567 requests
# ...
#
# --- Status Codes ---
# 200: 14230 (93.4%)
# 301: 120 ( 0.8%)
# 404: 520 ( 3.4%)
# 500: 360 ( 2.4%)
#
# --- Top 10 Paths ---
# 1. /index.html 3421 hits
# 2. /api/status 2100 hits
# 3. /login 1560 hits
# ...
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.