Learn Regex for File Extension — Pattern Explained with Examples

DodaTech | Updated Jun 20, 2026

This regex extracts the file extension from a filename by matching the portion after the last dot. It is commonly used in file upload systems, email attachment handling, and content management systems to validate allowed file types and organize uploaded files.

The Pattern

/\.([a-zA-Z0-9]+)$/

Pattern Breakdown

Part	Meaning
`\.`	Literal dot (the last dot before the extension)
`(`	Start of capture group (extracts the extension)
`[a-zA-Z0-9]+`	One or more alphanumeric characters (the extension name)
`)`	End of capture group
`$`	End of string
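
To make the breakdown above concrete, here is a minimal sketch that writes the same pattern with Python's `re.VERBOSE` flag so each part carries its own comment. The helper name `extension_of` is hypothetical and chosen only for illustration.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE
# so that each part can be annotated in place.
EXT_PATTERN = re.compile(
    r"""
    \.              # literal dot
    ([a-zA-Z0-9]+)  # capture group: one or more alphanumeric characters
    $               # end of string
    """,
    re.VERBOSE,
)


def extension_of(filename):
    """Return the captured extension, or None when the pattern does not match."""
    match = EXT_PATTERN.search(filename)
    return match.group(1) if match else None


print(extension_of("report.pdf"))      # pdf
print(extension_of("archive.tar.gz"))  # gz
print(extension_of("README"))          # None
```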

Matches

Filename	Matched Extension
`report.pdf`	`.pdf` (captures `pdf`)
`image.jpg`	`.jpg` (captures `jpg`)
`archive.tar.gz`	`.gz` (captures `gz`)
`index.html`	`.html` (captures `html`)
`README.md`	`.md` (captures `md`)
`backup-2024.tar.gz`	`.gz` (captures `gz`)

Does NOT Match

`README`: no dot, so no extension
`file.`: trailing dot with no characters after it
`/path/to/file`: no dot anywhere in the string

Matches, with Caveats

`.bashrc`: matches and captures `bashrc`, because the leading dot is followed only by alphanumeric characters; hidden files need separate handling
`file.name.with.dots`: matches, but captures only the last segment (`dots`)
`image.JPEG`: matches and captures `JPEG` in its original case; normalize to lowercase before comparing
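
Edge cases like these are easiest to settle by running the pattern over them. The sketch below does that for the filenames in this section; the helper name `captured` is hypothetical.

```python
import re

EXT_REGEX = re.compile(r"\.([a-zA-Z0-9]+)$")

# Edge-case filenames discussed in this section.
SAMPLES = [
    "README",
    ".bashrc",
    "file.",
    "file.name.with.dots",
    "/path/to/file",
    "image.JPEG",
]


def captured(filename):
    """Return the captured group, or None when the pattern does not match."""
    match = EXT_REGEX.search(filename)
    return match.group(1) if match else None


results = {name: captured(name) for name in SAMPLES}
for name, ext in results.items():
    print(f"{name!r:25} -> {ext!r}")
```

Only `README`, `file.`, and `/path/to/file` come back as `None`; the other three produce a capture (`bashrc`, `dots`, and `JPEG`).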

Language Examples

JavaScript

const extRegex = /\.([a-zA-Z0-9]+)$/;

console.log('report.pdf'.match(extRegex)?.[1]);       // pdf
console.log('archive.tar.gz'.match(extRegex)?.[1]);   // gz (not tar.gz)
console.log('README'.match(extRegex));                 // null

// Lowercase extension for comparison
function getExtension(filename) {
  const match = filename.match(/\.([a-zA-Z0-9]+)$/);
  return match ? match[1].toLowerCase() : null;
}

// Validate allowed types
const ALLOWED = ['pdf', 'jpg', 'png', 'docx'];
function isAllowed(filename) {
  const ext = getExtension(filename);
  // Compare against null explicitly so the function always returns a boolean
  return ext !== null && ALLOWED.includes(ext);
}

Python

import re

ext_regex = r'\.([a-zA-Z0-9]+)$'

match = re.search(ext_regex, 'report.pdf')
print(match.group(1) if match else None)  # pdf

match = re.search(ext_regex, 'archive.tar.gz')
print(match.group(1) if match else None)  # gz

match = re.search(ext_regex, 'README')
print(match.group(1) if match else None)  # None

# Lowercase extension for comparison
def get_extension(filename):
    match = re.search(r'\.([a-zA-Z0-9]+)$', filename)
    return match.group(1).lower() if match else None

ALLOWED = {'pdf', 'jpg', 'png', 'docx'}
def is_allowed(filename):
    ext = get_extension(filename)
    return ext in ALLOWED
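
One Python-specific detail is worth a hedged note: in Python's `re` module, `$` also matches just before a single trailing newline, so a filename that ends in `\n` still passes the pattern above. If that matters for your input, `\Z` anchors to the very end of the string. A minimal sketch (the names `LOOSE` and `STRICT` are illustrative only):

```python
import re

LOOSE = re.compile(r"\.([a-zA-Z0-9]+)$")    # $ also matches just before a final "\n"
STRICT = re.compile(r"\.([a-zA-Z0-9]+)\Z")  # \Z matches only at the very end

name = "report.pdf\n"

print(bool(LOOSE.search(name)))   # True
print(bool(STRICT.search(name)))  # False
print(bool(STRICT.search("report.pdf")))  # True
```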

Common Pitfalls

Double extensions (.tar.gz vs .gz): This regex captures only the extension after the final dot. For archive.tar.gz it returns gz, not tar.gz. If you need to detect compound extensions (.tar.gz, .tar.bz2), maintain a list of known multi-part extensions and check against it.
Hidden files (.bashrc, .gitignore): A name that consists of a leading dot followed only by alphanumeric characters matches the pattern, so .bashrc yields bashrc even though the file has no real extension. To exclude names that start with a dot, add a negative lookahead: /^(?!\.).*\.([a-zA-Z0-9]+)$/. Note that this also rejects hidden files that do have an extension, such as .config.json.
Files without an extension: README contains no dot at all, so the match is null in JavaScript and None in Python. Always handle the no-extension case in your code.
Case sensitivity: Windows and default macOS file systems treat names case-insensitively (JPG = jpg), while most Linux file systems are case-sensitive. The regex preserves whatever case the filename uses, so normalize the captured extension to lowercase before comparing it against an allowlist.
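
The hidden-file and case-sensitivity points can be combined in one helper. This is a sketch under assumptions: the `skip_hidden` parameter is hypothetical, and the function expects a bare filename rather than a full path, because the lookahead only inspects the first character of the string.

```python
import re

ANY_FILE = re.compile(r"\.([a-zA-Z0-9]+)$")
# Negative lookahead from the text: reject names whose first character is a dot.
NOT_HIDDEN = re.compile(r"^(?!\.).*\.([a-zA-Z0-9]+)$")


def get_extension(filename, skip_hidden=False):
    """Return the lowercased extension, or None when there is no match.

    With skip_hidden=True, any name starting with a dot yields None.
    """
    pattern = NOT_HIDDEN if skip_hidden else ANY_FILE
    match = pattern.search(filename)
    return match.group(1).lower() if match else None


print(get_extension(".bashrc"))                         # bashrc
print(get_extension(".bashrc", skip_hidden=True))       # None
print(get_extension("photo.JPG", skip_hidden=True))     # jpg
print(get_extension(".config.json", skip_hidden=True))  # None
```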

Real-World Use Cases

File upload validation — Restricting uploads to image types (jpg, png, gif) or document types (pdf, docx)
Email attachment processing — Extracting extensions to determine MIME types and scan for dangerous files
Static site generation — Separating content files (.md) from assets (.css, .js, .png) for build pipeline routing

FAQ

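To handle compound extensions such as .tar.gz, maintain a set of known compound extensions and check the filename against it before falling back to the last-dot regex, for example ['.tar.gz', '.tar.bz2', '.tar.xz']. Alternatively, use a regex that matches the full dotted tail: /\.([a-zA-Z0-9]+(\.[a-zA-Z0-9]+)*)$/. Be aware that this pattern captures the longest run of dot-separated alphanumeric segments at the end of the name, so report.v1.2.pdf yields v1.2.pdf rather than pdf.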
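
A minimal sketch of the known-list approach, assuming a small hypothetical tuple `COMPOUND_EXTENSIONS` that you would extend for your own needs. It also runs the dotted-tail regex on one name to show how much that pattern can capture.

```python
import re

# Hypothetical list of compound extensions; extend it as required.
COMPOUND_EXTENSIONS = ("tar.gz", "tar.bz2", "tar.xz")

LAST_DOT = re.compile(r"\.([a-zA-Z0-9]+)$")
DOTTED_TAIL = re.compile(r"\.([a-zA-Z0-9]+(\.[a-zA-Z0-9]+)*)$")


def get_full_extension(filename):
    """Prefer a known compound extension, else fall back to the last-dot regex."""
    lowered = filename.lower()
    for compound in COMPOUND_EXTENSIONS:
        if lowered.endswith("." + compound):
            return compound
    match = LAST_DOT.search(lowered)
    return match.group(1) if match else None


print(get_full_extension("archive.tar.gz"))   # tar.gz
print(get_full_extension("report.v1.2.pdf"))  # pdf
print(get_full_extension("README"))           # None

# The dotted-tail regex captures every trailing alphanumeric segment:
print(DOTTED_TAIL.search("report.v1.2.pdf").group(1))  # v1.2.pdf
```

The list-based version stays predictable for names with version numbers or other dots, at the cost of having to maintain the list.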

File extensions can be misleading or spoofed, so an extension check alone is not enough. For security-critical applications such as file uploads, validate the actual file type on the server by reading the file header (magic bytes) rather than relying on the filename extension.
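
As a rough illustration of a magic-byte check, the sketch below compares the first bytes of a file against a tiny hand-written signature table. The `MAGIC_BYTES` table and the function name are hypothetical and cover only three formats; a production system would more likely rely on a dedicated file-type detection library.

```python
# Well-known leading byte signatures for a few formats (illustrative subset).
MAGIC_BYTES = {
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
    "jpg": b"\xff\xd8\xff",
}


def content_matches_extension(data, extension):
    """Return True if the bytes start with the signature expected for extension.

    Extensions missing from MAGIC_BYTES are treated as not verifiable (False).
    """
    signature = MAGIC_BYTES.get(extension.lower())
    if signature is None:
        return False
    return data.startswith(signature)


print(content_matches_extension(b"%PDF-1.7\n", "pdf"))    # True
print(content_matches_extension(b"MZ\x90\x00", "pdf"))    # False
print(content_matches_extension(b"\xff\xd8\xff\xe0", "JPG"))  # True
```

In practice `data` would be the first few bytes read from the uploaded file on the server, checked after the extension has passed the allowlist.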