[c#] Why does C# XmlDocument.LoadXml(string) fail when an XML header is included?

Does anyone have any idea why the following code sample fails with an XmlException "Data at the root level is invalid. Line 1, position 1."

var body = "<?xml version="1.0" encoding="utf-16"?><Report> ......"
XmlDocument bodyDoc = new XmlDocument();            
bodyDoc.LoadXml(body);

This question is related to c# .net xml

The answer is


I figured it out. Read the MSDN documentation and it says to use .Load instead of LoadXml when reading from strings. Found out this works 100% of time. Oddly enough using StringReader causes problems. I think the main reason is that this is a Unicode encoded string and that could cause problems because StringReader is UTF-8 only.

MemoryStream stream = new MemoryStream();
            byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
            stream.Write(data, 0, data.Length);
            stream.Seek(0, SeekOrigin.Begin);

            XmlTextReader reader = new XmlTextReader(stream);

            // MSDN reccomends we use Load instead of LoadXml when using in memory XML payloads
            bodyDoc.Load(reader);

This worked for me:

var xdoc = new XmlDocument { XmlResolver = null };  
xdoc.LoadXml(xmlFragment);

This really saved my day.

I have written a extension method based on Zach's answer, also I have extended it to use the encoding as a parameter, allowing for different encodings beside from UTF-8 to be used, and I wrapped the MemoryStream in a 'using' statement.

public static class XmlHelperExtentions
{
    /// <summary>
    /// Loads a string through .Load() instead of .LoadXml()
    /// This prevents character encoding problems.
    /// </summary>
    /// <param name="xmlDocument"></param>
    /// <param name="xmlString"></param>
    public static void LoadString(this XmlDocument xmlDocument, string xmlString, Encoding encoding = null) {

        if (encoding == null) {
            encoding = Encoding.UTF8;
        }

        // Encode the XML string in a byte array
        byte[] encodedString = encoding.GetBytes(xmlString);

        // Put the byte array into a stream and rewind it to the beginning
        using (var ms = new MemoryStream(encodedString)) {
            ms.Flush();
            ms.Position = 0;

            // Build the XmlDocument from the MemorySteam of UTF-8 encoded bytes
            xmlDocument.Load(ms);
        }
    }
}

Simple and effective solution: Instead of using the LoadXml() method use the Load() method

For example:

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("sample.xml");

I had the same issue because the XML file I was uploading was encoded using UTF-8-BOM (UTF-8 byte-order mark).

Switched the encoding to UTF-8 in Notepad++ and was able to load the XML file in code.


Simple and effective solution: Instead of using the LoadXml() method use the Load() method

For example:

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("sample.xml");

Background

Although your question does have the encoding set as UTF-16, you don't have the string properly escaped so I wasn't sure if you did, in fact, accurately transpose the string into your question.

I ran into the same exception:

System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.

However, my code looked like this:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);

The Problem

The problem is that strings are stored internally as UTF-16 in .NET however the encoding specified in the XML document header may be different. E.g.:

<?xml version="1.0" encoding="utf-8"?>

From the MSDN documentation for String here:

Each Unicode character in a string is defined by a Unicode scalar value, also called a Unicode code point or the ordinal (numeric) value of the Unicode character. Each code point is encoded using UTF-16 encoding, and the numeric value of each element of the encoding is represented by a Char object.

This means that when you pass XmlDocument.LoadXml() your string with an XML header, it must say the encoding is UTF-16. Otherwise, the actual underlying encoding won't match the encoding reported in the header and will result in an XmlException being thrown.

The Solution

The solution for this problem is to make sure the encoding used in whatever you pass the Load or LoadXml method matches what you say it is in the XML header. In my example above, either change your XML header to state UTF-16 or to encode the input in UTF-8 and use one of the XmlDocument.Load methods.

Below is sample code demonstrating how to use a MemoryStream to build an XmlDocument using a string which defines a UTF-8 encode XML document (but of course, is stored a UTF-16 .NET string).

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";

// Encode the XML string in a UTF-8 byte array
byte[] encodedString = Encoding.UTF8.GetBytes(xml);

// Put the byte array into a stream and rewind it to the beginning
MemoryStream ms = new MemoryStream(encodedString);
ms.Flush();
ms.Position = 0;

// Build the XmlDocument from the MemorySteam of UTF-8 encoded bytes
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(ms);

I figured it out. Read the MSDN documentation and it says to use .Load instead of LoadXml when reading from strings. Found out this works 100% of time. Oddly enough using StringReader causes problems. I think the main reason is that this is a Unicode encoded string and that could cause problems because StringReader is UTF-8 only.

MemoryStream stream = new MemoryStream();
            byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
            stream.Write(data, 0, data.Length);
            stream.Seek(0, SeekOrigin.Begin);

            XmlTextReader reader = new XmlTextReader(stream);

            // MSDN reccomends we use Load instead of LoadXml when using in memory XML payloads
            bodyDoc.Load(reader);

Background

Although your question does have the encoding set as UTF-16, you don't have the string properly escaped so I wasn't sure if you did, in fact, accurately transpose the string into your question.

I ran into the same exception:

System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.

However, my code looked like this:

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);

The Problem

The problem is that strings are stored internally as UTF-16 in .NET however the encoding specified in the XML document header may be different. E.g.:

<?xml version="1.0" encoding="utf-8"?>

From the MSDN documentation for String here:

Each Unicode character in a string is defined by a Unicode scalar value, also called a Unicode code point or the ordinal (numeric) value of the Unicode character. Each code point is encoded using UTF-16 encoding, and the numeric value of each element of the encoding is represented by a Char object.

This means that when you pass XmlDocument.LoadXml() your string with an XML header, it must say the encoding is UTF-16. Otherwise, the actual underlying encoding won't match the encoding reported in the header and will result in an XmlException being thrown.

The Solution

The solution for this problem is to make sure the encoding used in whatever you pass the Load or LoadXml method matches what you say it is in the XML header. In my example above, either change your XML header to state UTF-16 or to encode the input in UTF-8 and use one of the XmlDocument.Load methods.

Below is sample code demonstrating how to use a MemoryStream to build an XmlDocument using a string which defines a UTF-8 encode XML document (but of course, is stored a UTF-16 .NET string).

string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";

// Encode the XML string in a UTF-8 byte array
byte[] encodedString = Encoding.UTF8.GetBytes(xml);

// Put the byte array into a stream and rewind it to the beginning
MemoryStream ms = new MemoryStream(encodedString);
ms.Flush();
ms.Position = 0;

// Build the XmlDocument from the MemorySteam of UTF-8 encoded bytes
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(ms);

Try this:

XmlDocument bodyDoc = new XmlDocument();
bodyDoc.XMLResolver = null;
bodyDoc.Load(body);

I had the same problem when switching from absolute to relative path for my xml file. The following solves both loading and using relative source path issues. Using a XmlDataProvider, which is defined in xaml (should be possible in code too) :

    <Window.Resources>
    <XmlDataProvider 
        x:Name="myDP"
        x:Key="MyData"
        Source=""
        XPath="/RootElement/Element"
        IsAsynchronous="False"
        IsInitialLoadEnabled="True"                         
        debug:PresentationTraceSources.TraceLevel="High"  /> </Window.Resources>

The data provider automatically loads the document once the source is set. Here's the code :

        m_DataProvider = this.FindResource("MyData") as XmlDataProvider;
        FileInfo file = new FileInfo("MyXmlFile.xml");

        m_DataProvider.Document = new XmlDocument();
        m_DataProvider.Source = new Uri(file.FullName);

Simple line:

bodyDoc.LoadXml(new MemoryStream(Encoding.Unicode.GetBytes(body)));


This really saved my day.

I have written a extension method based on Zach's answer, also I have extended it to use the encoding as a parameter, allowing for different encodings beside from UTF-8 to be used, and I wrapped the MemoryStream in a 'using' statement.

public static class XmlHelperExtentions
{
    /// <summary>
    /// Loads a string through .Load() instead of .LoadXml()
    /// This prevents character encoding problems.
    /// </summary>
    /// <param name="xmlDocument"></param>
    /// <param name="xmlString"></param>
    public static void LoadString(this XmlDocument xmlDocument, string xmlString, Encoding encoding = null) {

        if (encoding == null) {
            encoding = Encoding.UTF8;
        }

        // Encode the XML string in a byte array
        byte[] encodedString = encoding.GetBytes(xmlString);

        // Put the byte array into a stream and rewind it to the beginning
        using (var ms = new MemoryStream(encodedString)) {
            ms.Flush();
            ms.Position = 0;

            // Build the XmlDocument from the MemorySteam of UTF-8 encoded bytes
            xmlDocument.Load(ms);
        }
    }
}

I figured it out. Read the MSDN documentation and it says to use .Load instead of LoadXml when reading from strings. Found out this works 100% of time. Oddly enough using StringReader causes problems. I think the main reason is that this is a Unicode encoded string and that could cause problems because StringReader is UTF-8 only.

MemoryStream stream = new MemoryStream();
            byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
            stream.Write(data, 0, data.Length);
            stream.Seek(0, SeekOrigin.Begin);

            XmlTextReader reader = new XmlTextReader(stream);

            // MSDN reccomends we use Load instead of LoadXml when using in memory XML payloads
            bodyDoc.Load(reader);

I had the same issue because the XML file I was uploading was encoded using UTF-8-BOM (UTF-8 byte-order mark).

Switched the encoding to UTF-8 in Notepad++ and was able to load the XML file in code.


I figured it out. Read the MSDN documentation and it says to use .Load instead of LoadXml when reading from strings. Found out this works 100% of time. Oddly enough using StringReader causes problems. I think the main reason is that this is a Unicode encoded string and that could cause problems because StringReader is UTF-8 only.

MemoryStream stream = new MemoryStream();
            byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
            stream.Write(data, 0, data.Length);
            stream.Seek(0, SeekOrigin.Begin);

            XmlTextReader reader = new XmlTextReader(stream);

            // MSDN reccomends we use Load instead of LoadXml when using in memory XML payloads
            bodyDoc.Load(reader);

Try this:

XmlDocument bodyDoc = new XmlDocument();
bodyDoc.XMLResolver = null;
bodyDoc.Load(body);

I had the same problem when switching from absolute to relative path for my xml file. The following solves both loading and using relative source path issues. Using a XmlDataProvider, which is defined in xaml (should be possible in code too) :

    <Window.Resources>
    <XmlDataProvider 
        x:Name="myDP"
        x:Key="MyData"
        Source=""
        XPath="/RootElement/Element"
        IsAsynchronous="False"
        IsInitialLoadEnabled="True"                         
        debug:PresentationTraceSources.TraceLevel="High"  /> </Window.Resources>

The data provider automatically loads the document once the source is set. Here's the code :

        m_DataProvider = this.FindResource("MyData") as XmlDataProvider;
        FileInfo file = new FileInfo("MyXmlFile.xml");

        m_DataProvider.Document = new XmlDocument();
        m_DataProvider.Source = new Uri(file.FullName);

This worked for me:

var xdoc = new XmlDocument { XmlResolver = null };  
xdoc.LoadXml(xmlFragment);

Simple line:

bodyDoc.LoadXml(new MemoryStream(Encoding.Unicode.GetBytes(body)));


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to .net

You must add a reference to assembly 'netstandard, Version=2.0.0.0 How to use Bootstrap 4 in ASP.NET Core No authenticationScheme was specified, and there was no DefaultChallengeScheme found with default authentification and custom authorization .net Core 2.0 - Package was restored using .NetFramework 4.6.1 instead of target framework .netCore 2.0. The package may not be fully compatible Update .NET web service to use TLS 1.2 EF Core add-migration Build Failed What is the difference between .NET Core and .NET Standard Class Library project types? Visual Studio 2017 - Could not load file or assembly 'System.Runtime, Version=4.1.0.0' or one of its dependencies Nuget connection attempt failed "Unable to load the service index for source" Token based authentication in Web API without any user interface

Examples related to xml

strange error in my Animation Drawable How do I POST XML data to a webservice with Postman? PHP XML Extension: Not installed How to add a Hint in spinner in XML Generating Request/Response XML from a WSDL Manifest Merger failed with multiple errors in Android Studio How to set menu to Toolbar in Android How to add colored border on cardview? Android: ScrollView vs NestedScrollView WARNING: Exception encountered during context initialization - cancelling refresh attempt