This is a new answer to an old question about a common misconception about contains() in XPath...
Summary: contains() means contains a substring, not contains a node.
This XPath is often misinterpreted:
//ul[contains(li, 'Model')]
Wrong interpretation: Select those ul elements that contain an li element with Model in it.
This is wrong because contains(x,y) expects x to be a string, and the XPath rule for converting multiple elements to a string is this:
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
Right interpretation: Select those ul elements whose first li child has a string-value that contains a Model substring.
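The conversion rule quoted above can be sketched in Python. This is only an illustration of the rule, not an XPath engine: the helper name node_set_to_string is made up for this example, and the standard library's xml.etree.ElementTree is used merely to hold the nodes.

```python
import xml.etree.ElementTree as ET


def node_set_to_string(nodes):
    """Hypothetical helper mimicking the XPath 1.0 node-set-to-string rule:
    return the string-value of the first node in document order,
    or an empty string if the node-set is empty."""
    if not nodes:
        return ""
    # The string-value of an element is the concatenation of all its text.
    return "".join(nodes[0].itertext())


ul = ET.fromstring("<ul><li>Foo</li><li>Model A</li></ul>")
li_nodes = ul.findall("li")  # child li elements, in document order

first_only = node_set_to_string(li_nodes)
print(repr(first_only))       # 'Foo'
print("Model" in first_only)  # False: the second li is never consulted
```

Only the first li contributes to the string, which is why a later li that contains Model goes unnoticed.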
XML
<r>
<ul id="one">
<li>Model A</li>
<li>Foo</li>
</ul>
<ul id="two">
<li>Foo</li>
<li>Model A</li>
</ul>
</r>
XPaths
//ul[contains(li, 'Model')]
selects only the ul element whose id is "one".
Note: The ul element whose id is "two" is not selected because the string-value of its first li child is Foo, which does not contain the Model substring.
//ul[li[contains(.,'Model')]]
selects both ul elements, "one" and "two".
Note: Both ul elements are selected because contains() is applied to each li individually. (Thus, the tricky multiple-element-to-string conversion rule is avoided.) Both ul elements do have an li child whose string-value contains the Model substring, so the position of the li element no longer matters.
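To see the difference between the two XPaths side by side, here is a minimal Python sketch that emulates their semantics by hand on the XML above. The function names first_li_contains and any_li_contains are invented for this example. The emulation is needed because the XPath subset in the standard library's xml.etree.ElementTree does not support contains(); a full XPath 1.0 engine (lxml, for instance) should evaluate the original expressions directly.

```python
import xml.etree.ElementTree as ET

XML = """<r>
<ul id="one">
<li>Model A</li>
<li>Foo</li>
</ul>
<ul id="two">
<li>Foo</li>
<li>Model A</li>
</ul>
</r>"""


def string_value(elem):
    """String-value of an element: all of its text, concatenated."""
    return "".join(elem.itertext())


def first_li_contains(root, needle):
    """Emulates //ul[contains(li, needle)]: only the FIRST li child counts."""
    hits = []
    for ul in root.iter("ul"):
        lis = ul.findall("li")
        text = string_value(lis[0]) if lis else ""
        if needle in text:
            hits.append(ul.get("id"))
    return hits


def any_li_contains(root, needle):
    """Emulates //ul[li[contains(., needle)]]: each li child is tested."""
    hits = []
    for ul in root.iter("ul"):
        if any(needle in string_value(li) for li in ul.findall("li")):
            hits.append(ul.get("id"))
    return hits


root = ET.fromstring(XML)
print(first_li_contains(root, "Model"))  # ['one']
print(any_li_contains(root, "Model"))    # ['one', 'two']
```

The first function looks at a single string per ul, while the second tests every li on its own, which mirrors moving contains() inside the li predicate.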