[xml] Represent space and tab in XML tag

How do I represent a space and a tab in an XML tag? Are there any special characters that represent them?

This question is related to xml tags

The answer is


I think you can use an actual space or tab directly in an XML document, but if you are looking for special characters to represent them so that text processors can't mess them up, use these character references:

space = &#32;
tab   = &#9;

You cannot have spaces or tabs in the tag (i.e., the name) of an XML element; see the spec: http://www.w3.org/TR/REC-xml/#NT-STag. Besides alphanumeric characters, the colon, underscore, dash, and dot characters are allowed in a name, and the first character cannot be a digit, a dash, or a dot. Certain Unicode characters are also permitted; without actually double-checking, I'd say that these are international letters.
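To see the rule in practice, here is a minimal sketch that asks Python's standard-library parser whether a few candidate documents are well-formed. The helper name `is_well_formed` is made up for this illustration; other parsers should reject the same inputs, but only this one is exercised here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# An underscore is fine in a name.
print(is_well_formed("<my_tag>x</my_tag>"))
# A space ends the name, so "tag" is read as a malformed attribute.
print(is_well_formed("<my tag>x</my tag>"))
# A name may not start with a dash or a digit.
print(is_well_formed("<-tag>x</-tag>"))
print(is_well_formed("<1tag>x</1tag>"))
```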


Characters that are illegal in XML tag names can be encoded using their Unicode UCS-2 code points. This works very nicely. I am using it to create XML that gets turned into JSON (JPath is weak compared to XPath). Notice the handling of the space, (, and ) characters. Unicode UCS-2 code chart: http://www.columbia.edu/kermit/ucs2.html

        tag.Name = tag.Name.Replace(" ", "_x005F_x0020_");
        tag.Name = tag.Name.Replace("(", "_x005F_x0028_");
        tag.Name = tag.Name.Replace(")", "_x005F_x0029_");

XML:

  <Internal_x005F_x0020_Chargeback_x005F_x0020_ID>{CHARGEBACKCODE}</Internal_x005F_x0020_Chargeback_x005F_x0020_ID>
  <Bill_x005F_x0020_To>{CHARGEBACKCODE}</Bill_x005F_x0020_To>
  <Operator_x005F_x0020_or_x005F_x0020_Directly_x005F_x0020_Responsible_x005F_x0020_Individual_x005F_x0020__x005F_x0028_DRI_x005F_x0029_>[email protected]</Operator_x005F_x0020_or_x005F_x0020_Directly_x005F_x0020_Responsible_x005F_x0020_Individual_x005F_x0020__x005F_x0028_DRI_x005F_x0029_>

Transformed to JSON via Json.NET:

    "Internal Chargeback ID": "{CHARGEBACKCODE}",
    "Bill To": "{CHARGEBACKCODE}",
    "Operator or Directly Responsible Individual (DRI)": "[email protected]",
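The same idea can be sketched generically in Python. This is an illustration only, not the code used above: `encode_name` and `decode_name` are hypothetical helpers, they emit the single-level `_xHHHH_` form (the snippet above additionally escapes the leading underscore as `_x005F_`, since U+005F is the underscore), and they treat only a conservative ASCII subset as safe.

```python
import re

# Conservative ASCII-only subset of the characters XML allows in names.
_SAFE_CHAR = re.compile(r"[A-Za-z0-9_.\-]")
# Matches one escape of the form _xHHHH_ (four hex digits).
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")

def encode_name(name: str) -> str:
    """Replace every character outside the safe subset with _xHHHH_.

    Assumption: input stays within the Basic Multilingual Plane, so
    four hex digits are enough. A leading digit, dash, or dot is not
    handled here and would still need escaping in a real name.
    """
    return "".join(
        ch if _SAFE_CHAR.fullmatch(ch) else f"_x{ord(ch):04X}_"
        for ch in name
    )

def decode_name(encoded: str) -> str:
    """Turn each _xHHHH_ escape back into the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), encoded)

print(encode_name("Bill To"))
print(decode_name(encode_name("Individual (DRI)")))
```

Note that a name which already contains text shaped like `_x0020_` would be decoded ambiguously by this sketch, which is presumably why the underscore itself gets escaped in the answer above.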

For me, to make it work I needed to put the hex character reference for a space inside a CDATA section, so that after parsing it is passed through unchanged into the HTML web page and, when viewed in a browser, simply displays as a space. (All of the ideas and answers above are useful.)

<my-xml-element><![CDATA[&#x20;]]></my-xml-element>
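A quick check with Python's standard-library parser illustrates what the CDATA wrapper changes: inside CDATA the reference is not expanded by the XML parser and survives as literal text, which a later HTML renderer can then interpret. The element name `e` is just a placeholder for this sketch.

```python
import xml.etree.ElementTree as ET

# Inside CDATA the text is taken literally, so the reference is not expanded.
inside_cdata = ET.fromstring("<e><![CDATA[&#x20;]]></e>").text
# Outside CDATA the XML parser expands the reference to a real space.
outside_cdata = ET.fromstring("<e>&#x20;</e>").text

print(repr(inside_cdata))
print(repr(outside_cdata))
```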

If you are talking about the issue where multiple and non-space whitespace characters are stripped specifically from attribute values, then yes, encoding them as character references such as &#9; will fix it.
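As a small sketch of that difference, the following compares a literal tab with a character reference inside an attribute value, using Python's standard-library parser. The element and attribute names are placeholders.

```python
import xml.etree.ElementTree as ET

# A literal tab in an attribute value is normalized to a space by the parser.
literal = ET.fromstring('<a v="x\ty"/>').get("v")
# A tab written as a character reference survives normalization.
referenced = ET.fromstring('<a v="x&#9;y"/>').get("v")

print(repr(literal))
print(repr(referenced))
```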


New, expanded answer to an old, commonly asked question...

Whitespace in XML Component Names

Summary: Whitespace characters are not permitted in XML element or attribute names.

Here are the main Unicode code points related to whitespace:

  • #x0009 CHARACTER TABULATION
  • #x0020 SPACE
  • #x000A LINE FEED (LF)
  • #x000D CARRIAGE RETURN (CR)
  • #x00A0 NO-BREAK SPACE
  • [#x2002-#x200A] EN SPACE through HAIR SPACE
  • #x205F MEDIUM MATHEMATICAL SPACE
  • #x3000 IDEOGRAPHIC SPACE

None of these code points are permitted by the W3C XML BNF for XML names:

NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] |
                  [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
                  [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
                  [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
                  [#x10000-#xEFFFF]
NameChar      ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] |
                  [#x203F-#x2040]
Name          ::= NameStartChar (NameChar)*
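As a sanity check, the productions above can be transcribed into a regular expression and tested against the whitespace code points listed earlier. This is a sketch in Python; the names `NAME_RE`, `is_name`, and `WHITESPACE` are invented for the example.

```python
import re

# Character ranges copied from the NameStartChar production.
NAME_START = (
    ":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    "\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    "\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, #xB7 and two more ranges.
NAME_CHAR = NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"
NAME_RE = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")

def is_name(candidate: str) -> bool:
    """Return True if the whole string matches the Name production."""
    return NAME_RE.fullmatch(candidate) is not None

# The whitespace code points from the list above.
WHITESPACE = [0x9, 0x20, 0xA, 0xD, 0xA0, *range(0x2002, 0x200B), 0x205F, 0x3000]

for cp in WHITESPACE:
    # Each whitespace character is embedded between two valid name characters.
    print(f"U+{cp:04X}", is_name("a" + chr(cp) + "b"))
```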

Whitespace in XML Content (Not Component Names)

Summary: Whitespace characters are, of course, permitted in XML content.

All of the above whitespace codepoints are permitted in XML content by the W3C XML BNF for Char:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

Unicode code points can be inserted as character references. Both decimal &#decimal; and hexadecimal &#xhex; forms are supported.
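For example, the two forms of a reference to NO-BREAK SPACE (decimal 160, hexadecimal A0) produce the same character after parsing. A minimal sketch with Python's standard-library parser, using a placeholder element name:

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to the same code point, U+00A0.
decimal_form = ET.fromstring("<t>a&#160;b</t>").text
hex_form = ET.fromstring("<t>a&#xA0;b</t>").text

print(repr(decimal_form))
print(repr(hex_form))
```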


These work for me:

\n = &#xA;
\r = &#xD;
\t = &#x9;
space = &#x20;

Here is an example of how to use them in XML:

<KeyWord name="hello&#x9;" />
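To confirm what a parser hands back for that attribute, here is a short sketch using Python's standard-library parser; the second document simply packs all four references from the table into one value.

```python
import xml.etree.ElementTree as ET

# The example element: the attribute value ends with a real tab after parsing.
name_value = ET.fromstring('<KeyWord name="hello&#x9;" />').get("name")

# All four references in one attribute value.
all_four = ET.fromstring('<KeyWord name="&#xA;&#xD;&#x9;&#x20;" />').get("name")

print(repr(name_value))
print(repr(all_four))
```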

I had the same issue and none of the above answers solved the problem, so I tried something very straightforward: I just put \n\t in my strings.xml.

The complete string looks like this: <string name="premium_features_listing_3">- Automatische Aktualisierung der\n\tDatenbank</string>

Results in:

  • Automatische Aktualisierung der

    Datenbank

(with no extra line in between)

Maybe it will help others. Regards