
Advanced XPath – Nested predicates

Extensible Markup Language (XML) is a standard way to store and share information. It is designed to describe data and to focus on what the data is.

XPath is a language that makes it possible to find information in an XML document. It plays a major role in the W3C’s XSLT standard and is the basis of XQuery and XPointer. With XPath we can navigate through elements and their attributes. The wide use of XML and the importance of XPath make it fundamental knowledge in the IT world.

In this post I’ll answer a friend who asked me an advanced question. However, I will try to give some basic explanations for those who don’t want to study XML and XPath from the beginning.

If you want to learn XPath I recommend the XPath tutorial from W3Schools.

Let’s start by showing the data that I will use in the answer. The information shown below is only an example.

<?xml version="1.0" encoding="UTF-8"?>
<store>
    <product>
        <identification>
            <full_name>Laptop</full_name>
            <model>XGSB-2</model>
        </identification>
        <price>1000</price>
        <available value="yes" />
    </product>

    <product>
        <identification>
            <full_name>Desktop</full_name>
            <model>DRHD-2</model>
        </identification>
        <price>500</price>
        <available value="yes" />
    </product>

    <product>
        <identification>
            <full_name>Desktop</full_name>
            <model>DRHD-3</model>
        </identification>
        <price>600</price>
        <available value="no" />
    </product>
</store>

The XML content shown above is structured as a tree of nodes. XML has several kinds of nodes, such as elements, attributes and text.

Now it is time to look at my friend’s question. He wants to get the identification information of every product that is available.

Having seen the XML content, let’s check how to navigate through it and get the information that we want. XPath has some expressions to select nodes, like:
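To make the node kinds concrete, here is a small sketch that asks PHP's DOM extension for the type of an element, a text node and an attribute. The one-product fragment and the variable names are my own, chosen only for this illustration.

```php
<?php
// Assumption: a one-product fragment modelled on the sample data, inlined for the demo.
$doc = new DOMDocument();
$doc->loadXML('<product><price>1000</price><available value="yes" /></product>');

$price     = $doc->getElementsByTagName('price')->item(0);
$available = $doc->getElementsByTagName('available')->item(0);

// <price> is an element node.
$isElement = ($price->nodeType === XML_ELEMENT_NODE);

// The "1000" inside <price> is a separate text node, child of the element.
$isText = ($price->firstChild->nodeType === XML_TEXT_NODE);

// value="yes" is an attribute node attached to <available>.
$isAttribute = ($available->getAttributeNode('value')->nodeType === XML_ATTRIBUTE_NODE);

var_dump($isElement, $isText, $isAttribute);
```

All three checks should report true, which is the point: the price, its text and the value attribute are three different nodes in the tree.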

Expression   Description
nodename     Selects the child elements of the current node that have this name
/            Selects from the root node
//           Selects matching nodes at any depth, no matter where they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
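The expressions listed above can be tried one by one with PHP's DOMXPath class. This is only a sketch: the two-product document is a trimmed copy of the sample data, and the variable names are mine.

```php
<?php
// Assumption: a trimmed, two-product copy of the sample store, inlined so the script runs alone.
$source = <<<'XML'
<store>
  <product>
    <identification><full_name>Laptop</full_name><model>XGSB-2</model></identification>
    <price>1000</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-3</model></identification>
    <price>600</price>
    <available value="no" />
  </product>
</store>
XML;

$doc = new DOMDocument();
$doc->loadXML($source);
$xpath = new DOMXPath($doc);

// "/" starts at the root: the product elements directly under store.
$fromRoot = $xpath->query('/store/product')->length;

// "//" matches at any depth: every model element in the document.
$anywhere = $xpath->query('//model')->length;

// "@" selects attributes: the value attribute of each available element.
$attributes = $xpath->query('//available/@value')->length;

// ".." steps up to the parent: from a model to its identification element.
$parentName = $xpath->query('//model/..')->item(0)->nodeName;

// "." is the current node; here the current node is the first product.
$firstProduct = $xpath->query('/store/product')->item(0);
$price = $xpath->evaluate('string(./price)', $firstProduct);

echo "$fromRoot $anywhere $attributes $parentName $price\n";
// prints "2 2 2 identification 1000"
```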

With the expressions above we can build the path to the identification information that we want.

/store/product/identification

This expression returns all product identification nodes. The same result can be obtained with other expressions.
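As a quick check of that claim, the sketch below runs the path above and two alternative spellings against the sample data and prints the models each one finds. The alternative paths and the $results array are my own picks for the illustration.

```php
<?php
// Assumption: the sample store document, inlined so the script does not need products.xml.
$source = <<<'XML'
<store>
  <product>
    <identification><full_name>Laptop</full_name><model>XGSB-2</model></identification>
    <price>1000</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-2</model></identification>
    <price>500</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-3</model></identification>
    <price>600</price>
    <available value="no" />
  </product>
</store>
XML;

$doc = new DOMDocument();
$doc->loadXML($source);
$xpath = new DOMXPath($doc);

// Three ways to reach the same identification elements.
$paths = [
    '/store/product/identification',
    '//identification',
    '/store//identification',
];

$results = [];
foreach ($paths as $path) {
    $models = [];
    foreach ($xpath->query($path) as $identification) {
        // Read the model text relative to each identification node.
        $models[] = $xpath->evaluate('string(model)', $identification);
    }
    $results[$path] = $models;
    echo $path, ' => ', implode(', ', $models), "\n";
}
```

With this data, every path lists the same three models. On a bigger document the // forms could also match identification elements that live elsewhere, so the full path is the safest one.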

Now we use a predicate to select only the products that are available. This information is in an attribute of the available element. The basic use of a predicate is to apply a condition to the element it is attached to, or to one of that element’s children. The difficulty here is that the attribute we want to test belongs to an element that isn’t on the direct navigation path: available is a sibling of identification, at the same level. So we name the available element in a predicate on product and check its attribute with another predicate.

Can we put a predicate inside another one? Yes, and we will use that in this example.

A predicate filters the element it is attached to, and the expression inside it is evaluated relative to that element, so its position in the query determines what the inner expression refers to. In this case, we must be aware that the expression in the predicate shouldn’t be a full path. With a full path we would only check whether there is some available element anywhere, including in another product, with its value attribute set to yes, and every product would pass the test.

Then, using two predicates, we have the following query: /store/product[available[@value="yes"]]/identification

XPath has functions that can be used to solve some problems. In this case, a predicate inside another one solves it without having to use any function.
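The difference between a relative and a full path inside the predicate is easier to see by running both. This is a sketch against the sample data (which stores yes and no in the value attribute); the helper modelsFor() is a name I made up for the demo.

```php
<?php
// Assumption: the sample store document, inlined so the script runs on its own.
$source = <<<'XML'
<store>
  <product>
    <identification><full_name>Laptop</full_name><model>XGSB-2</model></identification>
    <price>1000</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-2</model></identification>
    <price>500</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-3</model></identification>
    <price>600</price>
    <available value="no" />
  </product>
</store>
XML;

$doc = new DOMDocument();
$doc->loadXML($source);
$xpath = new DOMXPath($doc);

// Hypothetical helper: returns the model text of every identification node a query selects.
function modelsFor(DOMXPath $xpath, string $query): array
{
    $models = [];
    foreach ($xpath->query($query) as $identification) {
        $models[] = $xpath->evaluate('string(model)', $identification);
    }
    return $models;
}

// Relative path in the predicate: tested against each product's own available child.
$relative = modelsFor($xpath, '/store/product[available[@value="yes"]]/identification');

// Full path in the predicate: true for every product as soon as ANY product is available.
$absolute = modelsFor($xpath, '/store/product[/store/product/available[@value="yes"]]/identification');

echo 'relative: ', implode(', ', $relative), "\n"; // prints "relative: XGSB-2, DRHD-2"
echo 'absolute: ', implode(', ', $absolute), "\n"; // prints "absolute: XGSB-2, DRHD-2, DRHD-3"
```

The full path lets DRHD-3 through even though that product is not available, which is exactly the mistake to avoid.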
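For comparison only, here is one way the same filter could be written with a function. The count() version is my own illustration, not something my friend asked for; the sketch just checks that it selects the same products as the nested predicate on the sample data.

```php
<?php
// Assumption: the sample store document, inlined so the script runs on its own.
$source = <<<'XML'
<store>
  <product>
    <identification><full_name>Laptop</full_name><model>XGSB-2</model></identification>
    <price>1000</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-2</model></identification>
    <price>500</price>
    <available value="yes" />
  </product>
  <product>
    <identification><full_name>Desktop</full_name><model>DRHD-3</model></identification>
    <price>600</price>
    <available value="no" />
  </product>
</store>
XML;

$doc = new DOMDocument();
$doc->loadXML($source);
$xpath = new DOMXPath($doc);

// Nested predicate, no function involved.
$nestedQuery = '/store/product[available[@value="yes"]]/identification';

// Illustrative alternative: count the matching available children with count().
$countQuery = '/store/product[count(available[@value="yes"]) > 0]/identification';

$nested  = [];
$counted = [];
foreach ($xpath->query($nestedQuery) as $node) {
    $nested[] = $xpath->evaluate('string(model)', $node);
}
foreach ($xpath->query($countQuery) as $node) {
    $counted[] = $xpath->evaluate('string(model)', $node);
}

echo implode(', ', $nested), "\n";
echo implode(', ', $counted), "\n";
```

Both lines should list the same models; the nested predicate is simply the shorter way to say it.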

To check that this example works I used PHP and its DOM extension, which gives a very easy way to navigate and find information in an XML document. Check the source code below:

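<?php
echo "Starting check of available products <br />";

// Load the sample document shown earlier, saved as products.xml.
$xml = new DOMDocument();
$xml->load('products.xml');

$xpath = new DOMXPath($xml);
// The nested predicate keeps only products whose available child has value="yes".
$products = $xpath->query('/store/product[available[@value="yes"]]/identification');

foreach ($products as $identification)
{
    // Skip identification elements that have no content at all.
    if (!$identification->hasChildNodes()) continue;

    // Read the children by name, not by position, so whitespace text nodes don't matter.
    $fullName = $xpath->evaluate('string(full_name)', $identification);
    $model = $xpath->evaluate('string(model)', $identification);

    echo "<br />Full name: ".$fullName."<br />";
    echo "Model: ".$model."<br />";
}
?>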