[xml] How to use XPath contains() here?

I'm trying to learn XPath. I looked at the other contains() examples around here, but nothing that uses an AND operator. I can't get this to work:

//ul[@class='featureList' and contains(li, 'Model')]

On:

...
<ul class="featureList">

<li><b>Type:</b> Clip Fan</li><li><b>Feature:</b> Air Moved: 65 ft.
    Amps: 1.1
    Clip: Grips any surface up to 1.63"
    Plug: 3 prong grounded plug on heavy duty model
    Usage: Garage, Workshop, Dorm, Work-out room, Deck, Office & more.</li><li><b>Speed Setting:</b> 2 speeds</li><li><b>Color:</b> Black</li><li><b>Power Consumption:</b> 62 W</li><li><b>Height:</b> 14.5"</li><li><b>Width:</b> Grill Diameter: 9.5"</li><li><b>Length:</b> 11.5"</li>

<li><b>Model #: </b>CR1-0081-06</li>
<li><b>Item #: </b>N82E16896817007</li>
<li><b>Return Policy: </b></li>
</ul>
...

This question is related to xml xpath

The answer is


You are only looking at the first li child in the query you have instead of looking for any li child element that may contain the text, 'Model'. What you need is a query like the following:

//ul[@class='featureList' and ./li[contains(.,'Model')]]

This query will give you the elements that have a class of featureList with one or more li children that contain the text, 'Model'.


Paste my contains example here:

//table[contains(@class, "EC_result")]/tbody

This is a new answer to an old question about a common misconception about contains() in XPath...

Summary: contains() means contains a substring, not contains a node.

Detailed Explanation

This XPath is often misinterpreted:

//ul[contains(li, 'Model')]

Wrong interpretation: Select those ul elements that contain an li element with Model in it.

This is wrong because

  1. contains(x,y) expects x to be a string, and
  2. the XPath rule for converting multiple elements to a string is this:

    A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.

Right interpretation: Select those ul elements whose first li child has a string-value that contains a Model substring.

Examples

XML

<r>
  <ul id="one">
    <li>Model A</li>
    <li>Foo</li>
  </ul>
  <ul id="two">
    <li>Foo</li>
    <li>Model A</li>
  </ul>
</r> 

XPaths

  • //ul[contains(li, 'Model')] selects the one ul element.

    Note: The two ul element is not selected because the string-value of the first li child of the two ul is Foo, which does not contain the Model substring.

  • //ul[li[contains(.,'Model')]] selects the one and two ul elements.

    Note: Both ul elements are selected because contains() is applied to each li individually. (Thus, the tricky multiple-element-to-string conversion rule is avoided.) Both ul elements do have an li child whose string value contains the Model substring -- position of the li element no longer matters.

See also


//ul[@class="featureList" and li//text()[contains(., "Model")]]

I already gave my +1 to Jeff Yates' solution.

Here is a quick explanation why your approach does not work. This:

//ul[@class='featureList' and contains(li, 'Model')]

encounters a limitation of the contains() function (or any other string function in XPath, for that matter).

The first argument is supposed to be a string. If you feed it a node list (giving it "li" does that), a conversion to string must take place. But this conversion is done for the first node in the list only.

In your case the first node in the list is <li><b>Type:</b> Clip Fan</li> (converted to a string: "Type: Clip Fan") which means that this:

//ul[@class='featureList' and contains(li, 'Type')]

would actually select a node!