XML Validation: DTD vs XSD vs RelaxNG
XML validation ensures that an XML document follows the rules defined by a schema — either a DTD (Document Type Definition), XSD (XML Schema Definition), or RelaxNG schema. Each approach has different strengths in expressiveness, portability, and ease of use.
Learning Path
flowchart LR
A["XML Basics<br/>Elements & Attributes"] --> B["XML Validation<br/>DTD vs XSD vs RelaxNG"]
B --> C["XSLT Transformations<br/>Advanced"]
C --> D["XPath Functions<br/>Complete Reference"]
style B fill:#f90,color:#fff,stroke-width:2px
DTD — Document Type Definition
DTD is the oldest and simplest validation language. It’s defined inline within the XML or in a separate .dtd file.
DTD Syntax
<!ELEMENT library (book+)>
<!ELEMENT book (title, author, year, price?)>
<!ATTLIST book category CDATA #REQUIRED isbn ID #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST price currency (USD|EUR|GBP) "USD">
Key rules:
- + = one or more, * = zero or more, ? = optional
- (#PCDATA) = text content
- #REQUIRED = attribute must be present
- (USD|EUR|GBP) = enumerated attribute values
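One way to read a content model such as (title, author, year, price?) is as a regular expression over the names of the child elements. The sketch below is illustrative only and is not how a real DTD validator is implemented; it uses Python's standard xml.etree.ElementTree, and the names BOOK_MODEL and matches_book_model are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The content model (title, author, year, price?) written as a regex over
# child tag names, each followed by a single space.
BOOK_MODEL = re.compile(r"title author year (price )?")


def matches_book_model(book):
    """Return True if the children of `book` follow (title, author, year, price?)."""
    sequence = "".join(child.tag + " " for child in book)
    return BOOK_MODEL.fullmatch(sequence) is not None


good = ET.fromstring(
    "<book><title>T</title><author>A</author><year>1937</year></book>"
)
bad = ET.fromstring(
    "<book><author>A</author><title>T</title><year>1937</year></book>"
)

print(matches_book_model(good))  # True: price is optional
print(matches_book_model(bad))   # False: title must come before author
```

The comma in a content model fixes the order of the children, which is why the second book is rejected even though it contains the same elements.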
An XML document that references this DTD through its DOCTYPE declaration:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE library SYSTEM "library.dtd">
<library>
  <book category="fiction" isbn="b001">
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <year>1937</year>
    <price currency="USD">12.99</price>
  </book>
</library>
DTD Limitations
- No data type support (everything is text)
- No namespace awareness
- Limited cardinality expressions (only ?, *, and +; no numeric bounds such as "2 to 5")
- DTD syntax is NOT XML (requires separate parser)
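Because a DTD treats element content as plain text, a document with `<year>nineteen</year>` is still valid, and type checks fall to the application. The following is a minimal sketch of such hand-written checks, assuming the library/book structure shown above; check_book_types is a hypothetical helper, and the two regular expressions are simplified stand-ins for "four-digit year" and "decimal number".

```python
import re
import xml.etree.ElementTree as ET

YEAR_RE = re.compile(r"[0-9]{4}")
DECIMAL_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def check_book_types(book):
    """Return a list of type problems that a DTD cannot detect in a <book>."""
    problems = []
    year = (book.findtext("year") or "").strip()
    if not YEAR_RE.fullmatch(year):
        problems.append(f"year is not a four-digit number: {year!r}")
    price = book.findtext("price")  # None when the optional element is absent
    if price is not None and not DECIMAL_RE.fullmatch(price.strip()):
        problems.append(f"price is not a decimal number: {price!r}")
    return problems


book = ET.fromstring(
    "<book category='fiction'><title>The Hobbit</title>"
    "<author>J.R.R. Tolkien</author><year>nineteen</year>"
    "<price>cheap</price></book>"
)
for problem in check_book_types(book):
    print(problem)
```

With XSD or RelaxNG these checks move into the schema itself, so the application no longer has to repeat them.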
XSD — XML Schema Definition
XSD is the W3C standard schema language. It’s XML itself, supports strong typing, namespaces, and complex type hierarchies.
XSD Simple Types
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://doda.tech/library"
           xmlns="http://doda.tech/library"
           elementFormDefault="qualified">
  <!-- Simple type with restriction -->
  <xs:simpleType name="ISBNType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{13}|[0-9]{10}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="YearType">
    <xs:restriction base="xs:gYear">
      <xs:minInclusive value="1900"/>
      <xs:maxInclusive value="2099"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="CurrencyType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="USD"/>
      <xs:enumeration value="EUR"/>
      <xs:enumeration value="GBP"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
XSD Complex Types
<xs:complexType name="BookType">
  <xs:sequence>
    <xs:element name="title" type="xs:string"/>
    <xs:element name="author" type="xs:string"/>
    <xs:element name="year" type="YearType"/>
    <xs:element name="price" minOccurs="0">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:decimal">
            <xs:attribute name="currency" type="CurrencyType" default="USD"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
  <xs:attribute name="category" type="xs:string" use="required"/>
  <xs:attribute name="isbn" type="ISBNType"/>
</xs:complexType>
<xs:complexType name="LibraryType">
  <xs:sequence>
    <xs:element name="book" type="BookType" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
<xs:element name="library" type="LibraryType"/>
XSD Features vs DTD
| Feature | DTD | XSD |
|---|---|---|
| Syntax | Non-XML | XML |
| Data types | None | 40+ types (string, int, date, etc.) |
| Namespaces | Not supported | Full support |
| Inheritance | Not supported | extension, restriction |
| Import/Include | ENTITY references | xs:import, xs:include |
| Patterns | None (enumerated attribute values only) | Regular expressions via xs:pattern |
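To make the ISBNType and YearType restrictions more concrete, here is a rough Python equivalent of what those facets accept. It is a sketch rather than a substitute for an XSD validator: the function names are hypothetical, and the year check is simplified to four plain digits, ignoring the negative years and timezone suffixes that xs:gYear also allows. Note that an xs:pattern must match the whole value, which corresponds to re.fullmatch rather than re.search.

```python
import re

# Same expression as the xs:pattern facet of ISBNType.
ISBN_PATTERN = re.compile(r"[0-9]{13}|[0-9]{10}")


def is_valid_isbn(value):
    """Mirror ISBNType: exactly 13 or exactly 10 digits."""
    # XSD patterns are implicitly anchored, so the whole value must match.
    return ISBN_PATTERN.fullmatch(value) is not None


def is_valid_year(value):
    """Mirror YearType (simplified): a four-digit year from 1900 to 2099."""
    if re.fullmatch(r"[0-9]{4}", value) is None:
        return False
    return 1900 <= int(value) <= 2099


print(is_valid_isbn("9780261102217"))  # True: 13 digits
print(is_valid_isbn("b001"))           # False: not digits
print(is_valid_year("1937"))           # True
print(is_valid_year("1899"))           # False: below minInclusive
```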
RelaxNG
RelaxNG is simpler than XSD while being more expressive. It uses a compact syntax (.rnc) or an XML syntax (.rng).
RelaxNG Compact Syntax
namespace library = "http://doda.tech/library"
start = element library { book+ }
book = element book {
  attribute category { text },
  attribute isbn { xsd:string { pattern = "[0-9]{13}|[0-9]{10}" } }?,
  element title { text },
  element author { text },
  element year { xsd:gYear },
  element price {
    attribute currency { "USD" | "EUR" | "GBP" },
    xsd:decimal
  }?
}
The compact syntax is dramatically cleaner than XSD for the same schema.
RelaxNG XML Syntax (.rng)
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="library">
      <oneOrMore>
        <ref name="book"/>
      </oneOrMore>
    </element>
  </start>
  <define name="book">
    <element name="book">
      <attribute name="category"/>
      <optional>
        <attribute name="isbn">
          <data type="string">
            <param name="pattern">[0-9]{13}|[0-9]{10}</param>
          </data>
        </attribute>
      </optional>
      <element name="title"><text/></element>
      <element name="author"><text/></element>
      <element name="year"><data type="gYear"/></element>
      <optional>
        <element name="price">
          <attribute name="currency">
            <choice>
              <value>USD</value>
              <value>EUR</value>
              <value>GBP</value>
            </choice>
          </attribute>
          <data type="decimal"/>
        </element>
      </optional>
    </element>
  </define>
</grammar>
When to Use Each
| Schema Type | Best For | Avoid When |
|---|---|---|
| DTD | Simple XML, legacy systems, document-centric XML | You need data types, namespaces, or modular schemas |
| XSD | Enterprise systems, web services (WSDL/SOAP), complex validation | You want simplicity and readability |
| RelaxNG | Open formats (DocBook, TEI), when readability matters | You need W3C compliance, or your tools only support XSD |
Validation with xmllint
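xmllint, the command-line tool that ships with libxml2, can validate against all three schema types. For RelaxNG it expects the XML syntax (.rng), not the compact syntax: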
# DTD validation
xmllint --valid --noout library.xml
# XSD validation
xmllint --schema library.xsd --noout library.xml
# RelaxNG validation
xmllint --relaxng library.rng --noout library.xml
The same checks can be scripted from Python by calling xmllint:
import subprocess

def validate_xml(xml_file, schema_file, schema_type="xsd"):
    """Validate an XML file with xmllint; return True if it is valid."""
    cmd = ["xmllint", "--noout"]
    if schema_type == "dtd":
        # The DTD comes from the document's DOCTYPE, so schema_file is unused.
        cmd.extend(["--valid", xml_file])
    elif schema_type == "xsd":
        cmd.extend(["--schema", schema_file, xml_file])
    elif schema_type == "rng":
        cmd.extend(["--relaxng", schema_file, xml_file])
    else:
        raise ValueError(f"unknown schema_type: {schema_type!r}")
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        print(f"{xml_file}: VALID ✓")
    else:
        print(f"{xml_file}: INVALID ✗")
        print(result.stderr[:300])  # first 300 characters of xmllint's messages
    return result.returncode == 0

# Test
validate_xml("library.xml", "library.xsd", "xsd")
validate_xml("invalid_book.xml", "library.xsd", "xsd")
Expected output:
library.xml: VALID ✓
invalid_book.xml: INVALID ✗
library.xsd:2: element element: Schema parsing error: ...
Editor Validation
Most XML editors provide real-time validation:
- VS Code: Install “XML Tools” or “Red Hat XML” extension
- IntelliJ IDEA: Built-in XML validation with XSD/DTD
- Eclipse: Built-in XML editor with validation
- Oxygen XML Editor: Professional-grade validation with all three schema types
- XMLSpy: Enterprise XML editor with visual schema designer
Common Validation Errors
- DTD not found: The DOCTYPE declaration points to a missing .dtd file. Use a path relative to the XML file’s location, or use an inline DTD for simple cases.
- XSD namespace mismatch: The XML uses xmlns="http://example.com" but the XSD uses targetNamespace="http://other.com". Always verify that the namespace URIs match exactly.
- Element not declared in DTD: Adding a <summary> element when the DTD only allows <title>, <author>, <year>, <price>. Always update the DTD when the XML structure changes.
- xs:dateTime format error: xs:dateTime requires the ISO 8601 form, for example 2026-06-20T10:30:00Z. The timezone suffix is optional, but a missing T separator, missing seconds, or different separators fail validation.
- maxOccurs/minOccurs wrong: Setting maxOccurs="1" (also the default when the attribute is omitted) when multiple elements are expected. A common mistake in copy-pasted XSDs.
- RelaxNG namespace confusion: A schema in the XML syntax (.rng) must declare the RelaxNG namespace http://relaxng.org/ns/structure/1.0 on its root element, not the XSD namespace. The compact syntax (.rnc) is not XML and has no such declaration.
- Mixed content model: DTD mixed content is declared as (#PCDATA|child1|child2)*; note the * at the end. Forgetting it makes the model invalid.
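The namespace mismatch in particular is easy to check before running a validator. The sketch below compares the namespace of a document's root element with the schema's targetNamespace using Python's standard xml.etree.ElementTree; the function names are hypothetical, and the two inline strings reuse the example URIs from the list above.

```python
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag  # ElementTree uses the form "{uri}local"
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def target_namespace(xsd_text):
    """Return the schema's targetNamespace attribute, or None if absent."""
    return ET.fromstring(xsd_text).get("targetNamespace")


def namespaces_match(xml_text, xsd_text):
    """True when the document root is in the namespace the schema targets."""
    return root_namespace(xml_text) == target_namespace(xsd_text)


xml_doc = '<library xmlns="http://example.com"/>'
xsd_doc = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://other.com"/>'
)
print(namespaces_match(xml_doc, xsd_doc))  # False: the URIs differ
```

The comparison is an exact string match, which is also how namespace URIs are compared during validation: a trailing slash or a different letter case counts as a different namespace.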
Practice Questions
1. What’s the main advantage of XSD over DTD?
Data type support: XSD has 40+ built-in types (string, integer, date, decimal) while DTD treats everything as text.
2. When would you choose RelaxNG over XSD?
When readability matters, when you need more flexible content models (for example, child elements in any order), or when working with document-centric XML formats like DocBook.
3. What does maxOccurs="unbounded" mean in XSD?
The element can appear any number of times (no upper limit).
4. How do you validate XML from the command line?
Use xmllint --schema schema.xsd --noout file.xml or xmllint --valid --noout file.xml for DTD validation.
5. Challenge: Convert a DTD to XSD
Take this DTD and write the equivalent XSD:
<!ELEMENT catalog (product+)>
<!ELEMENT product (name, price, description?)>
<!ATTLIST product id ID #REQUIRED category CDATA #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST price currency CDATA #REQUIRED>
<!ELEMENT description (#PCDATA)>
Mini Project: Multi-Format Schema Generator
def generate_dtd():
    """Generate a simple library DTD."""
    return """<!ELEMENT library (book+)>
<!ELEMENT book (title, author, year, price?)>
<!ATTLIST book category CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ATTLIST price currency (USD|EUR|GBP) "USD">
"""
def generate_xsd():
    """Generate a library XSD."""
    return """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="library">
<xs:complexType>
<xs:sequence>
<xs:element name="book" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="author" type="xs:string"/>
<xs:element name="year" type="xs:gYear"/>
<xs:element name="price" minOccurs="0" type="xs:decimal"/>
</xs:sequence>
<xs:attribute name="category" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
"""
def generate_relaxng():
    """Generate a library RelaxNG schema."""
    return """
namespace lib = "http://doda.tech/library"
start = element library {
element book {
attribute category { text },
element title { text },
element author { text },
element year { xsd:gYear },
element price { xsd:decimal }?
}+
}
"""
print("=== DTD ===")
print(generate_dtd())
print("=== XSD ===")
print(generate_xsd()[:200] + "...")
print("=== RelaxNG ===")
print(generate_relaxng())
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.