Using XPATH to search text containing &nbsp;

Question

I use XPather Browser to check my XPATH expressions on an HTML page. My end goal is to use these expressions in Selenium for the testing of my user interfaces.

I have an HTML file with content similar to this:

    <tr> <td>abc</td> <td>&nbsp;</td> </tr>

I want to select a node with a text containing the string "&nbsp;". With a normal string like "abc" there is no problem: I use an XPATH similar to //td[text()="abc"]. When I try with an XPATH like //td[text()="&nbsp;"] it returns nothing. Is there a special rule concerning texts with "&"?

User · Answer

Search for &nbsp; or only nbsp. Did you try this?

User · Answer

As per the HTML you have provided:

    <tr> <td>abc</td> <td>&nbsp;</td> </tr>

To locate the node with the string &nbsp; you can use either of the following XPath-based solutions (the \u00A0 escape is expanded by the client language's string literal, for example in Java or Python, before the expression reaches the XPath engine):

Using text(): "//td[text()='\u00A0']"
Using contains(): "//td[contains(., '\u00A0')]"

However, ideally you may like to avoid the NO-BREAK SPACE character and use one of the following Locator Strategies:

Using the parent <tr> node and following-sibling: "//tr//following-sibling::td[2]" (this form matches only when a whitespace text node precedes the first <td>, as in the sample markup)
Using last(): "//tr//td[last()]"
Using the preceding <td> node and the following axis: "//td[text()='abc']//following::td[1]"

Reference: You can find a relevant detailed discussion in "How to find an element which contains &nbsp; using Selenium".

tl;dr: Unicode Character 'NO-BREAK SPACE' (U+00A0)
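The position-based locators can be tried outside the browser. What follows is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module rather than Selenium. Its XPath subset supports positional predicates such as [last()] but not the following-sibling or following axes, so only the position-based forms are shown. The wrapping <table> element and the numeric reference &#160; (used so the XML parser accepts the markup without a DTD) are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

# The row from the question; &#160; is the numeric form of &nbsp;.
ROW = "<table><tr><td>abc</td><td>&#160;</td></tr></table>"
root = ET.fromstring(ROW)

# Locate the cell by its position instead of by its invisible content.
last_cell = root.find(".//tr/td[last()]")
second_cell = root.find(".//tr/td[2]")

print(repr(last_cell.text))      # the single character U+00A0
print(last_cell is second_cell)  # both locators reach the same element
```

Locating by position keeps the no-break space out of the test source entirely, at the cost of depending on the column order.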

User · Answer

I cannot get a match using Xpather, but the following worked for me with plain XML and XSL files in Microsoft's XML Notepad:

    <xsl:value-of select="count(//td[text()='&nbsp;'])" />

The value returned is 1, which is the correct value in my test case.

However, I did have to declare nbsp as an entity within my XML and XSL using the following:

    <!DOCTYPE xsl:stylesheet [ <!ENTITY nbsp "&#160;"> ]>

I'm not sure if that helps you, but I was able to actually find nbsp using an XPath expression.

Edit: My code sample actually contains the characters '&nbsp;' but the JavaScript syntax highlight converts it to the space character. Don't be misled!
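The same idea, declaring nbsp in an internal DTD subset and then counting the matching cells, can be sketched in Python. This is an illustration under assumptions, not the XML Notepad setup above: it uses the standard xml.etree.ElementTree module, a made-up <table> wrapper, and ElementTree's [.='text'] predicate (available in Python 3.7 and later) in place of text()=.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset that declares nbsp, mirroring the answer above.
DOC = (
    '<!DOCTYPE table [<!ENTITY nbsp "&#160;">]>'
    "<table><tr><td>abc</td><td>&nbsp;</td></tr></table>"
)
root = ET.fromstring(DOC)

# By the time the query runs, &nbsp; has become the character U+00A0,
# so the expression compares against that character.
matches = root.findall(".//td[.='\u00a0']")
print(len(matches))
```

With the declaration in place the parser expands the entity itself, so the query only ever sees the character and never the entity name.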

User · Answer

It seems that OpenQA (the guys behind Selenium) have already addressed this problem. They defined some variables to explicitly match whitespace. In my case, I need to use an XPATH similar to //td[text()="${nbsp}"].

I reproduced here the text from OpenQA concerning this issue (found here):

HTML automatically normalizes whitespace within elements, ignoring leading/trailing spaces and converting extra spaces, tabs and newlines into a single space. When Selenium reads text out of the page, it attempts to duplicate this behavior, so you can ignore all the tabs and newlines in your HTML and do assertions based on how the text looks in the browser when rendered. We do this by replacing all non-visible whitespace (including the non-breaking space "&nbsp;") with a single space. All visible newlines (<br>, <p>, and <pre> formatted new lines) should be preserved.

We use the same normalization logic on the text of HTML Selenese test case tables. This has a number of advantages. First, you don't need to look at the HTML source of the page to figure out what your assertions should be; "&nbsp;" symbols are invisible to the end user, and so you shouldn't have to worry about them when writing Selenese tests. (You don't need to put "&nbsp;" markers in your test case to assertText on a field that contains "&nbsp;".) You may also put extra newlines and spaces in your Selenese <td> tags; since we use the same normalization logic on the test case as we do on the text, we can ensure that assertions and the extracted text will match exactly.

This creates a bit of a problem on those rare occasions when you really want/need to insert extra whitespace in your test case. For example, you may need to type text in a field like this: "foo   ". But if you simply write <td>foo   </td> in your Selenese test case, we'll replace your extra spaces with just one space.

This problem has a simple workaround. We've defined a variable in Selenese, ${space}, whose value is a single space. You can use ${space} to insert a space that won't be automatically trimmed, like this: <td>foo${space}${space}${space}</td>. We've also included a variable ${nbsp}, that you can use to insert a non-breaking space.

Note that XPaths do not normalize whitespace the way we do. If you need to write an XPath like //div[text()="hello world"] but the HTML of the link is really "hello&nbsp;world", you'll need to insert a real "&nbsp;" into your Selenese test case to get it to match, like this: //div[text()="hello${nbsp}world"].
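To make the normalization and the two variables concrete, here is a rough Python sketch. It is not Selenium's implementation: the function names and the SELENESE_VARS mapping are hypothetical, the preservation of visible newlines (<br>, <p>, <pre>) is not modeled, and string.Template is used only because its ${name} syntax happens to match the Selenese variables.

```python
import re
from string import Template

# Hypothetical stand-ins for the Selenese variables described above.
SELENESE_VARS = {"space": " ", "nbsp": "\u00a0"}


def normalize_whitespace(text: str) -> str:
    """Collapse whitespace runs (including U+00A0) to one space and trim."""
    # In Python 3, \s in a str pattern also matches the no-break space.
    return re.sub(r"\s+", " ", text).strip()


def expand_variables(cell: str) -> str:
    """Normalize first, then substitute, so inserted whitespace survives."""
    return Template(normalize_whitespace(cell)).substitute(SELENESE_VARS)


print(repr(normalize_whitespace("  hello\u00a0\n world ")))
print(repr(expand_variables('//div[text()="hello${nbsp}world"]')))
print(repr(expand_variables("foo${space}${space}${space}")))
```

Substituting after normalization is the design point: whitespace written through a variable is never seen by the collapsing step, which is why ${space} and ${nbsp} are not trimmed.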

User · Answer

I found I can make the match when I input a hard-coded non-breaking space (U+00A0) by typing Alt+0160 on Windows between the two quotes:

    //table[@id='TableID']//td[text()=' ']

(the character between the quotes is U+00A0, not an ordinary space). This worked for me with the special char.

From what I understood, the XPath 1.0 standard doesn't handle escaping Unicode chars. There seem to be functions for that in XPath 2.0, but it looks like Firefox doesn't support it (or I misunderstood something). So you have to make do with the local codepage. Ugly, I know.

Actually, it looks like the standard is relying on the programming language using XPath to provide the correct Unicode escape sequence. So, somehow, I did the right thing.
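The point about the host language providing the escape can be illustrated with plain Python strings. This is a small sketch with no XPath engine involved; the variable names are made up, and the expression is the one from this answer. XPath 1.0 string literals have no escape syntax, so the character has to be in the string before it is handed to the engine.

```python
# Python expands \u00a0 inside a normal string literal, so the XPath
# engine would receive the real no-break space character.
escaped = "//table[@id='TableID']//td[text()='\u00a0']"

# In a raw string the six characters \u00a0 are kept literally; XPath 1.0
# would then compare against that literal text, not against U+00A0.
raw = r"//table[@id='TableID']//td[text()='\u00a0']"

print("\u00a0" in escaped)
print("\u00a0" in raw)
print(len(raw) - len(escaped))
```

Writing the escape in the client code gives the same expression as typing Alt+0160, but the character stays visible in the test source.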

User · Answer

Try using the decimal entity &#160; instead of the named entity. If that doesn't work, you should be able to simply use the unicode character for a non-breaking space instead of the &nbsp; entity.

(Note: I did not try this in XPather, but I did try it in Oxygen.)

User · Answer

Bear in mind that a standards-compliant XML processor will have replaced any entity references other than XML's five standard ones (&amp;, &gt;, &lt;, &apos;, &quot;) with the corresponding character in the target encoding by the time XPath expressions are evaluated. Given that behavior, PhiLho's and jsulak's suggestions are the way to go if you want to work with XML tools. When you enter &#160; in the XPath expression, it should be converted to the corresponding byte sequence before the XPath expression is applied.
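A short Python sketch of that parsing step, assuming the standard xml.etree.ElementTree module (the sample markup is invented for the illustration): references the parser knows are already plain characters when the tree is queried, while a named entity with no declaration stops the parse before any XPath could run.

```python
import xml.etree.ElementTree as ET

# Character references and predefined entities are resolved by the parser.
root = ET.fromstring("<tr><td>a &amp; b</td><td>&#160;</td></tr>")
texts = [td.text for td in root.findall("td")]
print(texts)

# An entity without a declaration is rejected at parse time.
try:
    ET.fromstring("<tr><td>&nbsp;</td></tr>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

HTML parsers in browsers know &nbsp; without a DTD, which is why the question's page loads fine even though the strict XML parse above fails.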
