Regex for File Extension — Pattern Explained with Examples
This regex extracts the file extension from a filename by matching the portion after the last dot. It is commonly used in file upload systems, email attachment handling, and content management systems to validate allowed file types and organize uploaded files.
The Pattern
/\.([a-zA-Z0-9]+)$/
Pattern Breakdown
| Part | Meaning |
|---|---|
| `\.` | Literal dot (the last dot before the extension) |
| `(` | Start of capture group (extracts the extension) |
| `[a-zA-Z0-9]+` | One or more alphanumeric characters (the extension name) |
| `)` | End of capture group |
| `$` | End of string |
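The breakdown above can also be written directly into the pattern. The following is a minimal sketch in Python using `re.VERBOSE`, which allows whitespace and comments inside the pattern; the name `EXT_PATTERN` is chosen here for illustration and is not part of the original pattern.

```python
import re

# Same pattern as above, spread over several lines so each part carries a comment.
EXT_PATTERN = re.compile(
    r"""
    \.               # literal dot
    ([a-zA-Z0-9]+)   # capture group: one or more alphanumeric characters
    $                # end of string
    """,
    re.VERBOSE,
)

match = EXT_PATTERN.search("report.pdf")
print(match.group(0))  # .pdf (full match, including the dot)
print(match.group(1))  # pdf  (capture group only)
```

One Python-specific detail: `$` also matches just before a trailing newline, so `\Z` may be the stricter choice if filenames can arrive with a newline at the end.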
Matches
| Filename | Matched Extension |
|---|---|
| `report.pdf` | `.pdf` (captures `pdf`) |
| `image.jpg` | `.jpg` (captures `jpg`) |
| `archive.tar.gz` | `.gz` (captures `gz`) |
| `index.html` | `.html` (captures `html`) |
| `README.md` | `.md` (captures `md`) |
| `backup-2024.tar.gz` | `.gz` (captures `gz`) |
Does NOT Match
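- `README`: no dot, so no extension
- `file.`: trailing dot with no characters after it
- `/path/to/file`: no dot, so no extension
Edge cases that DO match, but may not behave as expected:
- `.bashrc`: matches and captures `bashrc`, even though this is a hidden file rather than a file with an extension
- `file.name.with.dots`: matches, but captures only the last segment (`dots`)
- `image.JPEG`: matches and captures `JPEG` as written, so the case may need normalization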
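A quick way to see how the pattern treats edge cases like these is to run them through it. The sketch below is illustrative only; the helper name `captured_extension` and the sample list are assumptions, not part of the original article.

```python
import re

EXT_REGEX = re.compile(r"\.([a-zA-Z0-9]+)$")

def captured_extension(filename):
    """Return the captured extension, or None when the pattern does not match."""
    match = EXT_REGEX.search(filename)
    return match.group(1) if match else None

samples = [
    "README",               # no dot
    "file.",                # trailing dot
    "/path/to/file",        # no dot
    ".bashrc",              # hidden file
    "file.name.with.dots",  # several dots
    "image.JPEG",           # uppercase extension
]

for name in samples:
    print(f"{name!r:25} -> {captured_extension(name)!r}")
```

The first three samples produce `None`; the last three produce `'bashrc'`, `'dots'`, and `'JPEG'`, which shows that hidden files and uppercase extensions are matched rather than rejected.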
Language Examples
JavaScript
const extRegex = /\.([a-zA-Z0-9]+)$/;

console.log('report.pdf'.match(extRegex)?.[1]); // pdf
console.log('archive.tar.gz'.match(extRegex)?.[1]); // gz (not tar.gz)
console.log('README'.match(extRegex)); // null

// Lowercase extension for comparison
function getExtension(filename) {
  const match = filename.match(/\.([a-zA-Z0-9]+)$/);
  return match ? match[1].toLowerCase() : null;
}

// Validate allowed types
const ALLOWED = ['pdf', 'jpg', 'png', 'docx'];
function isAllowed(filename) {
  const ext = getExtension(filename);
  // Compare against null explicitly so the function always returns a boolean
  return ext !== null && ALLOWED.includes(ext);
}

Python
import re

ext_regex = r'\.([a-zA-Z0-9]+)$'

match = re.search(ext_regex, 'report.pdf')
print(match.group(1) if match else None)  # pdf

match = re.search(ext_regex, 'archive.tar.gz')
print(match.group(1) if match else None)  # gz

match = re.search(ext_regex, 'README')
print(match.group(1) if match else None)  # None

# Lowercase extension for comparison
def get_extension(filename):
    match = re.search(r'\.([a-zA-Z0-9]+)$', filename)
    return match.group(1).lower() if match else None

# Validate allowed types
ALLOWED = {'pdf', 'jpg', 'png', 'docx'}

def is_allowed(filename):
    ext = get_extension(filename)
    return ext in ALLOWED

Common Pitfalls
- Double extensions (`.tar.gz` vs `.gz`): This regex captures only the segment after the final dot. For `archive.tar.gz`, it returns `gz`, not `tar.gz`. If you need to detect compound extensions (`.tar.gz`, `.tar.bz2`), maintain a list of known multi-part extensions and check against it.
- Hidden files (`.bashrc`, `.gitignore`): The pattern does not treat a leading dot specially. `.bashrc` matches because the dot is at position 0 and `bashrc` is alphanumeric, so `bashrc` is captured as if it were an extension. If you need to exclude hidden files, add a negative lookahead: `/^(?!\.).*\.([a-zA-Z0-9]+)$/`. This checks only the first character of the input, so it assumes a bare filename rather than a full path.
- Files without an extension: `README` contains no dot at all, so the match is `null` (`None` in Python). Always handle the no-extension case gracefully in your code.
- Case sensitivity: Case-insensitive file systems (the default on Windows and macOS) treat `JPG` and `jpg` as the same, but the regex captures the extension exactly as written. Always normalize to lowercase before comparing against an allowlist.
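The pitfalls above can be handled together in one helper. The following is a minimal sketch, not a definitive implementation: it uses the negative-lookahead pattern from the hidden-files note, and `COMPOUND_EXTENSIONS` is a hypothetical list that you would adapt to your own needs. It also assumes the input is a bare filename, not a path.

```python
import re

# Pattern from the hidden-files note: reject names that start with a dot.
NO_HIDDEN_REGEX = re.compile(r"^(?!\.).*\.([a-zA-Z0-9]+)$")

# Hypothetical list of known multi-part extensions; extend as needed.
COMPOUND_EXTENSIONS = ("tar.gz", "tar.bz2", "tar.xz")

def get_extension(filename):
    """Return the lowercase extension without the leading dot, or None."""
    lowered = filename.lower()  # normalize case before any comparison
    match = NO_HIDDEN_REGEX.search(lowered)
    if match is None:
        return None  # no extension, trailing dot, or hidden file
    # Prefer a known compound extension over the last segment alone.
    for compound in COMPOUND_EXTENSIONS:
        if lowered.endswith("." + compound):
            return compound
    return match.group(1)

print(get_extension("archive.tar.gz"))  # tar.gz
print(get_extension("image.JPEG"))      # jpeg
print(get_extension(".bashrc"))         # None
print(get_extension("README"))          # None
```

Checking the compound list with `str.endswith` keeps the regex itself unchanged, so an unknown double extension such as `file.name.with.dots` still falls back to the last segment.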
Real-World Use Cases
- File upload validation: Restricting uploads to image types (jpg, png, gif) or document types (pdf, docx)
- Email attachment processing: Extracting extensions to determine MIME types and scan for dangerous files
- Static site generation: Separating content files (`.md`) from assets (`.css`, `.js`, `.png`) for build pipeline routing
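As an illustration of the static site case, the sketch below groups filenames by extension. The `ROUTES` table and the stage names (`content`, `assets`, `other`) are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict

EXT_REGEX = re.compile(r"\.([a-zA-Z0-9]+)$")

# Hypothetical routing table: extension -> build pipeline stage.
ROUTES = {"md": "content", "css": "assets", "js": "assets", "png": "assets"}

def route_files(filenames):
    """Group filenames by stage; unknown or missing extensions go to 'other'."""
    groups = defaultdict(list)
    for name in filenames:
        match = EXT_REGEX.search(name)
        ext = match.group(1).lower() if match else None
        groups[ROUTES.get(ext, "other")].append(name)
    return dict(groups)

print(route_files(["index.md", "style.css", "app.js", "logo.PNG", "LICENSE"]))
# {'content': ['index.md'], 'assets': ['style.css', 'app.js', 'logo.PNG'], 'other': ['LICENSE']}
```

Lowercasing the captured extension before the lookup means `logo.PNG` is routed the same way as `logo.png`.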
Related Patterns
- Regex for Username/Slug
- Regex for Hex Color Codes
- Regex for Password Strength