Learn Regex for HTML Tags — Pattern Explained with Examples

DodaTech Updated Jun 20, 2026 3 min read

Matching HTML tags with regex is common in web scraping, content sanitization, and template processing. Regex is not an HTML parser, but it can match simple opening, closing, and self-closing tags with basic attributes, which is enough for quick extraction and cleanup tasks.

The Pattern

/<[^>]*>/g

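For a stricter pattern that only allows word characters, whitespace, and a few common attribute characters inside the tag (the character class does not include / or :, so self-closing tags such as <br /> and attribute values containing URLs will not match):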

/<\/?[\w\s="'.%-]+>/

Pattern Breakdown

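Part	Meaning
`<`	Literal opening angle bracket
`[^>]*`	Zero or more characters other than `>`; covers the tag name, attributes, whitespace, and any `/`
`>`	Literal closing angle bracket
`<\/?`	(Detailed pattern) Opening angle bracket followed by an optional forward slash, so closing tags also match
`[\w\s="'.%-]+`	(Detailed pattern) One or more word characters, whitespace, `=`, quotes, `.`, `%`, or `-`; `/` and `:` are not included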
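
The difference between the two patterns is easiest to see by running both against the same inputs. The following is a minimal sketch in Python; the sample strings and the names SIMPLE, DETAILED, and results are illustrative assumptions, and the URL sample is not taken from the article.

```python
import re

# The two patterns from this section as Python regexes
# (the / delimiters and the g flag are JavaScript syntax and are dropped;
# "/" needs no escaping in Python, so <\/? is written as </?).
SIMPLE = re.compile(r'<[^>]*>')
DETAILED = re.compile(r"""</?[\w\s="'.%-]+>""")

samples = [
    '<div>',
    '</p>',
    '<a href="link" class="btn">',
    '<img src="img.png" />',           # has "/" before ">"
    '<a href="https://example.com">',  # has ":" and "/" in the attribute value
]

# For each sample, record whether (simple, detailed) match the whole string.
results = {
    s: (bool(SIMPLE.fullmatch(s)), bool(DETAILED.fullmatch(s)))
    for s in samples
}

for sample, (simple_ok, detailed_ok) in results.items():
    print(f'{sample!r}: simple={simple_ok} detailed={detailed_ok}')
```

The simple pattern accepts all five samples, while the detailed pattern rejects the last two because its character class contains neither / nor :.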

Matches

<div>
<img src="img.png" />
</p>
<a href="link" class="btn">
<br>

Does NOT Match

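<div (missing closing bracket)
<p>unclosed (only the <p> tag itself matches; the text after it is not part of the match)
div> (missing opening bracket)
Nested structure (each tag matches on its own, but regex cannot pair an opening tag with its closing tag or track nesting depth)

Note: < div> (space between < and the tag name) is not a valid HTML tag, but both patterns above do match it, because a space is accepted by [^>]* and by \s.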
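
A stricter variant can make the tag-name requirement explicit by demanding a letter directly after < or </. The sketch below is an assumption for illustration, not one of the article's patterns; the name TAG_WITH_NAME and the candidate list are hypothetical.

```python
import re

# Hypothetical stricter variant: a letter must directly follow "<" or "</".
TAG_WITH_NAME = re.compile(r'</?[A-Za-z][^>]*>')

candidates = ['<div>', '</p>', '<br>', '<div', '< div>', 'div>']

# Map each candidate to whether the pattern finds a tag anywhere in it.
found = {c: bool(TAG_WITH_NAME.search(c)) for c in candidates}

for candidate, matched in found.items():
    print(f'{candidate!r}: {matched}')
```

With this variant the first three candidates match and the last three do not.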

Language Examples

JavaScript

const htmlTagRegex = /<[^>]*>/g;
const html = '<div><p>Hello</p></div>';
console.log(html.match(htmlTagRegex));
// ['<div>', '<p>', '</p>', '</div>']

Python

import re
pattern = r'<[^>]*>'
html = '<div><p>Hello</p></div>'
matches = re.findall(pattern, html)
print(matches)  # ['<div>', '<p>', '</p>', '</div>']

PHP

$html = '<div><p>Hello</p></div>';
preg_match_all('/<[^>]*>/', $html, $matches);
print_r($matches[0]);
// Array ( [0] => <div> [1] => <p> [2] => </p> [3] => </div> )

Common Pitfalls

Regex cannot parse arbitrary HTML — it fails on nested tags of the same type, malformed markup, and complex attribute values containing > characters
Script and style tags contain content with < and > that will be incorrectly matched by simple patterns
HTML comments (<!-- ... -->) have special syntax that requires its own handling to avoid false positives
Attribute values can contain > if quoted, but [^>]* would stop at the first > inside the attribute
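
Two of these pitfalls can be reproduced in a few lines. The sketch below shows the simple pattern stopping early at a quoted >, then a quote-aware variant and a comment-stripping pre-pass as possible workarounds. QUOTE_AWARE and COMMENT are illustrative patterns, not part of the article, and they still do not handle script or style content.

```python
import re

SIMPLE = re.compile(r'<[^>]*>')

# Illustrative quote-aware variant: quoted attribute values are consumed
# as a unit, so a ">" inside quotes does not end the tag.
QUOTE_AWARE = re.compile(r'''<(?:[^>"']|"[^"]*"|'[^']*')*>''')

# Illustrative comment matcher; DOTALL lets comments span several lines.
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)

link = '<a title="a > b" href="x">link</a>'
simple_tags = SIMPLE.findall(link)       # first tag is cut at the quoted ">"
quoted_tags = QUOTE_AWARE.findall(link)  # first tag is captured whole

doc = '<!-- if a > b --><p>text</p>'
raw_tags = SIMPLE.findall(doc)                      # includes a comment fragment
clean_tags = SIMPLE.findall(COMMENT.sub('', doc))   # comments removed first

print(simple_tags)
print(quoted_tags)
print(raw_tags)
print(clean_tags)
```

Removing comments before matching is a pre-processing choice; whether it is acceptable depends on whether the comments themselves matter to the task.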

Real-World Use Cases

Web scraping: pull tags out of a page as a first pass before data parsing and content extraction
Content sanitization: strip HTML tags from user input for plain-text display; regex stripping alone is easy to bypass, so XSS prevention still needs output escaping or a dedicated sanitizer
Template processing: identify and replace custom template tags or directives within HTML markup
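
As an illustration of the content-cleanup use case, the sketch below strips anything that looks like a tag and then escapes the remaining text before it is placed in HTML output. The function names to_plain_text and render_safely are hypothetical, and this is a sketch of the idea rather than a complete sanitizer.

```python
import html
import re

TAG = re.compile(r'<[^>]*>')

def to_plain_text(markup: str) -> str:
    """Replace anything that looks like a tag with a space, then collapse whitespace."""
    text = TAG.sub(' ', markup)
    return ' '.join(text.split())

def render_safely(user_input: str) -> str:
    """Strip tags, then HTML-escape what is left before embedding it in a page."""
    return html.escape(to_plain_text(user_input))

preview = to_plain_text('<div><p>Hello <b>world</b></p></div>')
escaped = render_safely('Tom & Jerry <script>alert(1)</script>')

print(preview)
print(escaped)
```

The escaping step is what protects the rendered output here; the regex only removes markup that the simple pattern happens to recognise.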

FAQ

When should I NOT use regex for HTML?

Never use regex to parse or validate HTML structure. Use a DOM parser (like DOMParser in JS or BeautifulSoup in Python) when you need to extract content, traverse the tree, or handle nested elements.
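
For comparison, here is a minimal sketch of the parser-based approach using html.parser from the Python standard library. It is used here in place of the BeautifulSoup library mentioned above only so the example has no third-party dependency; the TagCollector class and the sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collect opening tags (with attributes) and non-empty text nodes."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.open_tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

collector = TagCollector()
collector.feed('<div><a title="a > b" href="/x">Hello</a></div>')
collector.close()

print(collector.open_tags)
print(collector.text)
```

The parser reads the quoted > as part of the title attribute, which is the case where the simple regex breaks.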

How do I match only opening tags and not closing tags?

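Use the pattern <[^>/]+>. The character class [^>/] excludes the forward slash, so </tagname> does not match. The trade-off is that any tag containing a slash is also excluded, including self-closing tags like <br /> and tags whose attribute values contain a slash, such as <a href="/home">.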
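
If self-closing tags and slashes inside attributes should still count as opening tags, one alternative is to require a letter right after < instead of banning the slash everywhere. The following sketch compares the two; OPENING is an illustrative pattern and not part of the article, and FAQ_PATTERN is the pattern from the answer above.

```python
import re

# Pattern from the answer above: no "/" allowed anywhere inside the tag.
FAQ_PATTERN = re.compile(r'<[^>/]+>')

# Illustrative alternative: "<" must be followed by a letter, which skips
# closing tags but still allows "/" later in the tag.
OPENING = re.compile(r'<[A-Za-z][^>]*>')

markup = '<div><a href="/home">Home</a><br /></div>'

faq_matches = FAQ_PATTERN.findall(markup)
opening_matches = OPENING.findall(markup)

print(faq_matches)
print(opening_matches)
```

On this input the first pattern finds only <div>, while the alternative also finds the link and the self-closing <br />.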
