XPath Functions: Complete Reference with Examples
XPath functions transform, filter, and compute values directly within your XPath expressions, which often removes the need for post-processing in a host language such as Python or JavaScript. This reference covers the most commonly used XPath 1.0, 2.0, and 3.0 functions with examples. Functions marked 2.0+ or 3.0 need a processor that supports that version.
Learning Path
flowchart LR
A["XPath Basics<br/>Path Expressions"] --> B["XPath Functions<br/>Complete Reference"]
B --> C["XML Validation<br/>DTD & XSD"]
C --> D["SOAP APIs<br/>XML Web Services"]
style B fill:#f90,color:#fff,stroke-width:2px
Sample XML Document
We’ll use this catalog XML for all examples:
<catalog>
  <book id="b001" category="fiction">
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <price currency="USD">12.99</price>
    <rating>4.8</rating>
    <pubdate>1937-09-21</pubdate>
  </book>
  <book id="b002" category="fiction">
    <title>The Fellowship of the Ring</title>
    <author>J.R.R. Tolkien</author>
    <price currency="USD">14.99</price>
    <rating>4.9</rating>
    <pubdate>1954-07-29</pubdate>
  </book>
  <book id="b003" category="non-fiction">
    <title>A Brief History of Time</title>
    <author>Stephen Hawking</author>
    <price currency="GBP">9.99</price>
    <rating>4.5</rating>
    <pubdate>1988-03-01</pubdate>
  </book>
</catalog>
String Functions
concat()
Combines two or more strings:
concat(//book[1]/title, ' by ', //book[1]/author)
Result: "The Hobbit by J.R.R. Tolkien"
substring()
Extracts a portion of a string:
substring(//book[1]/title, 5, 5)
substring(//book[1]/@id, 2)
Result: "Hobbi", "001" (positions are 1-based, so position 5 of "The Hobbit" is "H")
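XPath counts character positions from 1, not 0, which is easy to trip over when moving between XPath and a host language. The sketch below mimics that rule in Python with a hypothetical helper, xpath_substring (not part of any library). It assumes integer arguments only; the real function also rounds fractional arguments and handles NaN and infinities.

```python
def xpath_substring(text, start, length=None):
    """Mimic XPath substring() for integer arguments (1-based positions)."""
    # XPath keeps the characters at positions p where start <= p < start + length.
    first = max(start, 1)              # positions below 1 do not exist
    if length is None:
        return text[first - 1:]
    last = start + length              # exclusive upper bound, still 1-based
    if last <= first:
        return ""
    return text[first - 1:last - 1]


print(xpath_substring("The Hobbit", 5, 5))   # Hobbi
print(xpath_substring("b001", 2))            # 001
print(xpath_substring("12345", 0, 3))        # 12
```

The last call shows a start position below 1: the window still covers positions 0 to 2, but only positions 1 and 2 exist, so two characters come back.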
string-length()
Returns character count:
string-length(//book[1]/title)
Result: 10
contains()
Checks if a string contains a substring:
//book[contains(title, 'Hobbit')]/title
//book[contains(author, 'Hawking')]/author
Result: "The Hobbit", "Stephen Hawking"
starts-with() and ends-with()
//book[starts-with(title, 'The')]/title
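//book[ends-with(@id, '003')]/title
Result: "The Hobbit" and "The Fellowship of the Ring"; "A Brief History of Time"
starts-with() is available in XPath 1.0; ends-with() requires XPath 2.0 or later.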
normalize-space()
Removes leading/trailing whitespace and collapses internal whitespace:
normalize-space('  Extra    spaces  ')
Result: "Extra spaces"
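For comparison, the same trimming and collapsing can be sketched in Python. The helper name normalize_space is hypothetical. It treats only space, tab, carriage return, and line feed as whitespace, which is the XML definition, so characters such as the non-breaking space are left alone.

```python
import re


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse whitespace runs, then trim."""
    # XML whitespace is limited to space, tab, carriage return and line feed.
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    return collapsed.strip(" ")


print(repr(normalize_space("  Extra \t  spaces \n")))   # 'Extra spaces'
```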
string-join()
Joins a sequence of strings with a separator:
string-join(//book/author, ', ')
Result: "J.R.R. Tolkien, J.R.R. Tolkien, Stephen Hawking"
translate()
Character-by-character replacement:
translate(//book[1]/title, 'obB', 'OBb')
Result: "The HOBBit" (o→O and every b→B; the B→b rule has no effect because the title contains no uppercase B)
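Two rules of translate() are easy to miss: a character listed in the second argument with no counterpart in the third is deleted, and if a character is listed twice, only its first mapping counts. A minimal Python sketch of those rules, using a hypothetical helper named xpath_translate:

```python
def xpath_translate(text, from_chars, to_chars):
    """Mimic XPath translate(): per-character mapping with deletion."""
    mapping = {}
    for index, char in enumerate(from_chars):
        if ord(char) in mapping:
            continue                   # the first occurrence wins
        # Characters without a counterpart map to None, which deletes them.
        mapping[ord(char)] = to_chars[index] if index < len(to_chars) else None
    return text.translate(mapping)


print(xpath_translate("The Hobbit", "obB", "OBb"))   # The HOBBit
print(xpath_translate("1937-09-21", "-", ""))        # 19370921
```

The second call uses the deletion rule to strip the hyphens from a date string.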
import xml.etree.ElementTree as ET


def run_xpath_functions_demo():
    """Demonstrate XPath string functions using Python."""
    xml_str = """<catalog>
<book id="b001"><title>The Hobbit</title><author>J.R.R. Tolkien</author></book>
<book id="b002"><title>The Fellowship of the Ring</title><author>J.R.R. Tolkien</author></book>
</catalog>"""
    root = ET.fromstring(xml_str)
    # ElementTree supports only a small XPath subset, so each function is simulated in Python
    titles = [b.find('title').text for b in root.findall('book')]
    authors = [b.find('author').text for b in root.findall('book')]
    print(f"concat: {titles[0]} by {authors[0]}")
    # XPath positions are 1-based: substring(s, 5, 5) corresponds to s[4:9]
    print(f"substring(,5,5): {titles[0][4:9]}")
    print(f"contains('Hobbit'): {'Hobbit' in titles[0]}")
    print(f"string-join: {', '.join(authors)}")
    print(f"string-length: {len(titles[0])}")
    # Same mapping as translate(title, 'obB', 'OBb')
    print(f"translate: {titles[0].translate(str.maketrans('obB', 'OBb'))}")


run_xpath_functions_demo()
Expected output:
concat: The Hobbit by J.R.R. Tolkien
substring(,5,5): Hobbi
contains('Hobbit'): True
string-join: J.R.R. Tolkien, J.R.R. Tolkien
string-length: 10
translate: The HOBBit
Number Functions
number()
Converts a string to a number:
number(//book[1]/price) * 2
Result: 25.98
ceiling(), floor(), round()
ceiling(12.99)
floor(12.99)
round(12.49)
round(12.50)
Result: 13, 12, 12, 13
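The half-way case is where XPath and some host languages disagree: XPath's round() sends .5 toward positive infinity, while Python's built-in round() rounds halves to the nearest even integer. A minimal sketch, assuming a hypothetical helper named xpath_round and ignoring NaN, infinities, and values within one floating-point step of a half:

```python
import math


def xpath_round(value):
    """Round halves toward positive infinity, as XPath round() does."""
    return math.floor(value + 0.5)


print(xpath_round(12.5), round(12.5))    # 13 12
print(xpath_round(2.5), round(2.5))      # 3 2
print(xpath_round(-2.5), round(-2.5))    # -2 -2
```

When porting an XPath expression to Python, replacing round() with a helper like this keeps results such as round(12.50) = 13 consistent.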
sum(), avg(), min(), max()
sum(//book/price)
avg(//book/price)
min(//book/price)
max(//book/price)
Result: 37.97, 12.656..., 9.99, 14.99
sum() is available in XPath 1.0; avg(), min(), and max() require XPath 2.0 or later.
count()
count(//book)
count(//book[author = 'J.R.R. Tolkien'])
Result: 3, 2
abs()
Absolute value (XPath 2.0+):
abs(-42)
abs(//book[1]/price - //book[2]/price)
Result: 42, 2
import math


def number_functions_demo():
    """Demonstrate XPath number functions."""
    prices = [12.99, 14.99, 9.99]
    # Simulate XPath number functions
    # The sum is formatted to two decimals to hide binary floating-point noise
    print(f"sum: {sum(prices):.2f}")
    print(f"avg: {sum(prices)/len(prices):.4f}")
    print(f"min: {min(prices)}")
    print(f"max: {max(prices)}")
    print(f"count: {len(prices)}")
    print(f"ceiling(12.99): {math.ceil(12.99)}")
    print(f"floor(12.99): {math.floor(12.99)}")
    # XPath rounds halves up; Python's round() rounds halves to even,
    # so floor(x + 0.5) is used to match XPath
    print(f"round(12.49): {math.floor(12.49 + 0.5)}")
    print(f"round(12.50): {math.floor(12.50 + 0.5)}")


number_functions_demo()
Expected output:
sum: 37.97
avg: 12.6567
min: 9.99
max: 14.99
count: 3
ceiling(12.99): 13
floor(12.99): 12
round(12.49): 12
round(12.50): 13
Date Functions (XPath 2.0+)
current-date(), current-time(), current-dateTime()
current-date()
current-time()
current-dateTime()
Result (example values; the actual output depends on when and where the expression runs): 2026-06-20, 10:30:00+00:00, 2026-06-20T10:30:00+00:00
year-from-date(), month-from-date(), day-from-date()
year-from-date(//book[1]/pubdate)
month-from-date(//book[1]/pubdate)
day-from-date(//book[1]/pubdate)
Result: 1937, 9, 21
format-date()
Custom date formatting (XPath 3.0; also available in XSLT 2.0):
format-date(//book[1]/pubdate, '[D] [MNn], [Y]')
format-date(//book[1]/pubdate, '[Y0001]-[M01]-[D01]')
Result: "21 September, 1937", "1937-09-21"
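A picture string is literal text with bracketed markers that are replaced by date components. To make that concrete, here is a rough Python sketch that renders a small, hand-picked subset of markers. The helper name format_date_sketch and the choice of supported markers are assumptions for illustration. Real processors support many more markers, languages, and escaped brackets.

```python
import re
from datetime import date

# English month names are hard-coded so the output does not depend on locale.
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def format_date_sketch(d, picture):
    """Render a small subset of format-date() picture markers."""
    values = {
        "Y": str(d.year),
        "Y0001": f"{d.year:04d}",
        "M": str(d.month),
        "M01": f"{d.month:02d}",
        "D": str(d.day),
        "D01": f"{d.day:02d}",
        "MNn": MONTHS[d.month - 1],           # title-case month name
        "MNn,*-3": MONTHS[d.month - 1][:3],   # at most three characters
    }

    def render(match):
        marker = match.group(1)
        if marker not in values:
            raise ValueError(f"unsupported marker: [{marker}]")
        return values[marker]

    # Everything outside square brackets is copied through unchanged.
    return re.sub(r"\[([^\]]+)\]", render, picture)


pubdate = date(1937, 9, 21)
print(format_date_sketch(pubdate, "[D] [MNn], [Y]"))        # 21 September, 1937
print(format_date_sketch(pubdate, "[Y0001]-[M01]-[D01]"))   # 1937-09-21
```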
days-from-duration()
Difference between dates:
days-from-duration(current-date() - xs:date(//book[1]/pubdate))
Result: 32414 (days between 1937-09-21 and 2026-06-20)
from datetime import date


def date_functions_demo():
    """Demonstrate XPath date functions."""
    # A fixed "today" keeps the output reproducible
    today = date(2026, 6, 20)
    pubdate = date(1937, 9, 21)
    print(f"current-date: {today}")
    print(f"year-from-date: {pubdate.year}")
    print(f"month-from-date: {pubdate.month}")
    print(f"day-from-date: {pubdate.day}")
    print(f"format-date([D] [MNn], [Y]): {pubdate.strftime('%d %B, %Y')}")
    print(f"days-difference: {(today - pubdate).days}")


date_functions_demo()
Expected output:
current-date: 2026-06-20
year-from-date: 1937
month-from-date: 9
day-from-date: 21
format-date([D] [MNn], [Y]): 21 September, 1937
days-difference: 32414
Boolean Functions
not()
//book[not(rating > 4.8)]/title
//book[not(@category = 'fiction')]/title
Result: "The Hobbit" and "A Brief History of Time"; "A Brief History of Time"
boolean()
boolean(//book[1]/title)
boolean(//book[100])
Result: true, false
true() and false()
//book[rating > 4.5 and true()]
//book[false()]
Result: First two books, empty sequence
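The Boolean examples can be checked against the catalog in Python. The sketch below is a simulation with xml.etree.ElementTree and list comprehensions, not an XPath evaluation, and it uses a trimmed copy of the catalog that keeps only the fields these predicates touch.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <book category="fiction"><title>The Hobbit</title><rating>4.8</rating></book>
  <book category="fiction"><title>The Fellowship of the Ring</title><rating>4.9</rating></book>
  <book category="non-fiction"><title>A Brief History of Time</title><rating>4.5</rating></book>
</catalog>"""

books = ET.fromstring(CATALOG).findall("book")

# //book[not(rating > 4.8)]/title
not_above = [b.findtext("title") for b in books
             if not (float(b.findtext("rating")) > 4.8)]

# //book[not(@category = 'fiction')]/title
non_fiction = [b.findtext("title") for b in books
               if b.get("category") != "fiction"]

# boolean(//book[1]/title): a node sequence is true when it is non-empty.
# Test against None rather than relying on the Element's own truth value.
has_first_title = books[0].find("title") is not None

# boolean(//book[100]): there is no hundredth book, so the sequence is empty.
has_hundredth_book = len(books) >= 100

print(not_above)            # ['The Hobbit', 'A Brief History of Time']
print(non_fiction)          # ['A Brief History of Time']
print(has_first_title)      # True
print(has_hundredth_book)   # False
```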
Sequence Functions
distinct-values()
distinct-values(//book/author)
Result: ("J.R.R. Tolkien", "Stephen Hawking")
subsequence()
subsequence(//book, 2, 2)
Result: Books 2 and 3
insert-before(), remove(), reverse()
insert-before(('a', 'b'), 2, 'x')
remove(('a', 'b', 'c'), 2)
reverse(//book/@id)
Result: ("a", "x", "b"), ("a", "c"), ("b003", "b002", "b001")
index-of()
index-of(//book/author, 'J.R.R. Tolkien')
Result: (1, 2)
empty() and exists()
empty(//book[rating > 5.0])
exists(//book[rating > 4.0])
Result: true, true
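As with the string and number sections, the sequence functions can be simulated in Python to check the results. The helper names below are hypothetical, and plain lists stand in for XPath sequences. Note that XPath leaves the order of distinct-values() up to the implementation, while this sketch keeps first occurrences in order.

```python
authors = ["J.R.R. Tolkien", "J.R.R. Tolkien", "Stephen Hawking"]
ids = ["b001", "b002", "b003"]


def distinct_values(seq):
    """Drop duplicates, keeping the first occurrence of each value."""
    return list(dict.fromkeys(seq))


def subsequence(seq, start, length=None):
    """Mimic subsequence() for integer arguments (1-based start)."""
    begin = max(start, 1) - 1
    end = None if length is None else max(start + length - 1, 0)
    return seq[begin:end]


def index_of(seq, value):
    """Return every 1-based position at which value occurs."""
    return [pos for pos, item in enumerate(seq, start=1) if item == value]


print(distinct_values(authors))             # ['J.R.R. Tolkien', 'Stephen Hawking']
print(subsequence(ids, 2, 2))               # ['b002', 'b003']
print(list(reversed(ids)))                  # ['b003', 'b002', 'b001']
print(index_of(authors, "J.R.R. Tolkien"))  # [1, 2]
print(len(subsequence(ids, 4)) == 0)        # True, like empty()
```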
Complete Python XPath Runner
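lxml evaluates XPath 1.0 only, so 2.0 functions such as string-join(), avg(), max(), and distinct-values() raise an error there. The runner below instead assumes the third-party elementpath package (pip install elementpath), which implements XPath 2.0 on top of the standard library's ElementTree.
import xml.etree.ElementTree as ET

import elementpath  # assumption: third-party XPath 2.0+ engine, not part of the standard library


def run_xpath(xml_content, xpath_expr):
    """Evaluate an XPath expression and print the result."""
    root = ET.fromstring(xml_content)
    result = elementpath.select(root, xpath_expr)
    # Show element nodes by their text so the output stays readable
    if isinstance(result, list):
        result = [item.text if isinstance(item, ET.Element) else item for item in result]
    print(f"XPath: {xpath_expr}")
    print(f"Result: {result}")
    print()


xml = """<catalog>
  <book id="b001" category="fiction">
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <price>12.99</price>
    <rating>4.8</rating>
  </book>
  <book id="b002" category="fiction">
    <title>The Fellowship of the Ring</title>
    <author>J.R.R. Tolkien</author>
    <price>14.99</price>
    <rating>4.9</rating>
  </book>
  <book id="b003" category="non-fiction">
    <title>A Brief History of Time</title>
    <author>Stephen Hawking</author>
    <price>9.99</price>
    <rating>4.5</rating>
  </book>
</catalog>"""

# Test all function categories
run_xpath(xml, "string-join(//book/title, ' | ')")
run_xpath(xml, "sum(//book/price)")
run_xpath(xml, "round(avg(//book/rating) * 10) div 10")
run_xpath(xml, "count(//book[@category='fiction'])")
run_xpath(xml, "distinct-values(//book/author)")
run_xpath(xml, "//book[rating = max(//book/rating)]/title")
run_xpath(xml, "//book[starts-with(author, 'J.')]/title")
Expected output (illustrative; the exact formatting of numbers and lists depends on the engine and its version):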
XPath: string-join(//book/title, ' | ')
Result: The Hobbit | The Fellowship of the Ring | A Brief History of Time
XPath: sum(//book/price)
Result: 37.97
XPath: round(avg(//book/rating) * 10) div 10
Result: 4.7
XPath: count(//book[@category='fiction'])
Result: 2
XPath: distinct-values(//book/author)
Result: ['J.R.R. Tolkien', 'Stephen Hawking']
XPath: //book[rating = max(//book/rating)]/title
Result: ['The Fellowship of the Ring']
XPath: //book[starts-with(author, 'J.')]/title
Result: ['The Hobbit', 'The Fellowship of the Ring']
Common XPath Function Errors
- Wrong argument types: sum() expects numeric values. In XPath 1.0, non-numeric text makes the result NaN; in XPath 2.0+, a value that cannot be converted to a number raises an error. Use number() to convert explicitly where needed.
- Forgetting predicates: //book/price selects ALL price elements. Use //book[1]/price for the first, or //book[@category='fiction']/price to filter.
- Path with no matches: //book[100]/title returns an empty sequence, not an error. Check with exists() before relying on the result.
- Date format mismatches: format-date() requires picture strings built from bracketed markers ([Y], [M], [D]). Python-style patterns such as %Y-%m-%d are not interpreted; they are copied to the output as literal text.
- XPath 1.0 vs 2.0 function availability: current-date() is XPath 2.0+, and format-date() is XPath 3.0 (or XSLT 2.0). Using them in an XPath 1.0 processor raises errors.
- Case sensitivity: write starts-with(), not startsWith() or startswith(). XPath function names are case-sensitive, lowercase, and hyphenated.
- Namespace issues in function queries: if your XML uses namespaces, prefix elements in the path, as in //ns:book[ns:rating > 4.5]. Without the namespace prefix, the path returns nothing.
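The namespace pitfall is easy to reproduce with the standard library. In the sketch below, the namespace URI http://example.com/books and the prefix ns are made up for illustration. ElementTree's limited XPath subset has no numeric comparisons, so the rating filter is done in Python.

```python
import xml.etree.ElementTree as ET

XML = """<catalog xmlns="http://example.com/books">
  <book><title>The Hobbit</title><rating>4.8</rating></book>
  <book><title>A Brief History of Time</title><rating>4.5</rating></book>
</catalog>"""

root = ET.fromstring(XML)
# The prefix is arbitrary; only the URI has to match the document.
ns = {"ns": "http://example.com/books"}

unprefixed = root.findall(".//book")        # no match: the elements are namespaced
prefixed = root.findall(".//ns:book", ns)   # matches both books

high_rated = [b.findtext("ns:title", namespaces=ns) for b in prefixed
              if float(b.findtext("ns:rating", namespaces=ns)) > 4.5]

print(len(unprefixed))   # 0
print(len(prefixed))     # 2
print(high_rated)        # ['The Hobbit']
```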
Practice Questions
1. Which XPath function removes extra whitespace?
normalize-space() — it removes leading/trailing whitespace and collapses internal whitespace sequences to single spaces.
2. How do you find the most expensive book using XPath functions?
//book[price = max(//book/price)] — this selects the book whose price equals the maximum price across all books.
3. What’s the difference between concat() and string-join()?
concat() joins individual string arguments. string-join() joins a SEQUENCE of strings with a separator — much more useful when working with node sets.
4. How do you format a date as “15-Mar-2026”?
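format-date(xs:date('2026-03-15'), '[D01]-[MNn,*-3]-[Y]'). The [MNn,*-3] marker requests the title-case month name limited to at most 3 characters, which gives "Mar". A bare width such as [MN,3] only sets a minimum width and uses upper case, so it would produce "MARCH".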
5. Challenge: XPath expression builder
Write a single XPath expression that returns the title of the cheapest book by J.R.R. Tolkien. Then modify it to return the average price of non-fiction books published after 1950.
Mini Project: XPath Query Tool
def xpath_query_tool():
    """Run a batch of XPath queries and report each result or error."""
    from lxml import etree
    xml_content = """<catalog>
<book cat="fiction"><title>The Hobbit</title><price>12.99</price></book>
<book cat="fiction"><title>1984</title><price>10.99</price></book>
<book cat="non-fiction"><title>Sapiens</title><price>15.99</price></book>
</catalog>"""
    root = etree.fromstring(xml_content)
    # lxml implements XPath 1.0 only, so string-join() and distinct-values()
    # end up in the except branch; they are kept here to show the error handling.
    queries = [
        "count(//book)",
        "string-join(//book/title, ', ')",
        "//book[price > 12]/title",
        "sum(//book/price)",
        "//book[not(title = '1984')]/title",
        "distinct-values(//book/@cat)",
    ]
    for q in queries:
        try:
            result = root.xpath(q)
            # An empty list (no matches) is reported with the empty-set mark
            print(f"{'✓' if result else '∅'} {q}")
            print(f" → {result}")
        except Exception as e:
            print(f"✗ {q}")
            print(f" → Error: {e}")
        print()


xpath_query_tool()
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.