[xml] What Content-Type value should I send for my XML sitemap?

I thought I should send "text/xml", but then I read that I should send "application/xml". Does it matter? Can someone explain the difference?

This question is related to xml mime-types sitemap xml-sitemap

The answer is


text/xml is for documents that would be meaningful to a human if presented as text without further processing, application/xml is for everything else

Every XML entity is suitable for use with the application/xml media type without modification. But this does not exploit the fact that XML can be treated as plain text in many cases. MIME user agents (and web user agents) that do not have explicit support for application/xml will treat it as application/octet-stream, for example, by offering to save it to a file.

To indicate that an XML entity should be treated as plain text by default, use the text/xml media type. This restricts the encoding used in the XML entity to those that are compatible with the requirements for text media types as described in [RFC-2045] and [RFC-2046], e.g., UTF-8, but not UTF-16 (except for HTTP).

http://www.ietf.org/rfc/rfc2376.txt


Other answers here address the general question of what the proper Content-Type for an XML response is, and conclude (as with What's the difference between text/xml vs application/xml for webservice response) that both text/xml and application/xml are permissible. However, none address whether there are any rules specific to sitemaps.

Answer: there aren't. The sitemap spec is https://www.sitemaps.org, and using Google site: searches you can confirm that it does not contain the words or phrases mime, mimetype, content-type, application/xml, or text/xml anywhere. In other words, it is entirely silent on the topic of what Content-Type should be used for serving sitemaps.

In the absence of any commentary in the sitemap spec directly addressing this question, we can safely assume that the same rules apply as when choosing the Content-Type of any other XML document - i.e. that it may be either text/xml or application/xml.


both are fine.

text/xxx means that in case the program does not understand xxx it makes sense to show the file to the user as plain text. application/xxx means that it is pointless to show it.

Please note that those content-types were originally defined for E-Mail attachment before they got later used in Web world.


As a rule of thumb, the safest bet towards making your document be treated properly by all web servers, proxies, and client browsers, is probably the following:

  1. Use the application/xml content type
  2. Include a character encoding in the content type, probably UTF-8
  3. Include a matching character encoding in the encoding attribute of the XML document itself.

In terms of the RFC 3023 spec, which some browsers fail to implement properly, the major difference in the content types is in how clients are supposed to treat the character encoding, as follows:

For application/xml, application/xml-dtd, application/xml-external-parsed-entity, or any one of the subtypes of application/xml such as application/atom+xml, application/rss+xml or application/rdf+xml, the character encoding is determined in this order:

  1. the encoding given in the charset parameter of the Content-Type HTTP header
  2. the encoding given in the encoding attribute of the XML declaration within the document,
  3. utf-8.

For text/xml, text/xml-external-parsed-entity, or a subtype like text/foo+xml, the encoding attribute of the XML declaration within the document is ignored, and the character encoding is:

  1. the encoding given in the charset parameter of the Content-Type HTTP header, or
  2. us-ascii.

Most parsers don't implement the spec; they ignore the HTTP Context-Type and just use the encoding in the document. With so many ill-formed documents out there, that's unlikely to change any time soon.