I am generating an XML document from a StringBuilder, basically something like:
string.Format("<text><row>{0}</row><col>{1}</col><textHeight>{2}</textHeight><textWidth>{3}</textWidth><data>{4}</data><rotation>{5}</rotation></text>
Later, something like:
XmlDocument document = new XmlDocument();
document.LoadXml(xml);
XmlNodeList labelSetNodes = document.GetElementsByTagName("labels");
for (int index = 0; index < labelSetNodes.Count; index++)
{
//do something
}
All the data comes from a database. Recently I've had a few issues with the error:
Hexadecimal value 0x00 is a invalid character, line 1, position nnnnn
But it's not consistent. Sometimes some 'blank' data will work. The 'faulty' data works on some PCs, but not others.
In the database, the data is always a blank string; it is never null. In the XML, it comes out as <data></data>, i.e. with no character between the opening and closing tags (though I'm not sure this can be relied on, as I am copying it from the Immediate window in Visual Studio and pasting it into TextPad).
There are possibly differences in the versions of SQL Server (2008 is where it would fail, 2005 would work) and in collation too. I'm not sure whether any of these are likely causes.
But exactly the same code and data will sometimes fail. Any ideas where the problem lies?
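One way to narrow this down is to scan the generated string for characters that XML 1.0 does not allow before handing it to the parser. The sketch below is in Python rather than C#, and find_invalid_xml_chars is a hypothetical helper name; it only reports positions and does not fix anything.

```python
import re

# Anything outside the XML 1.0 "Char" production is illegal in a document.
_INVALID_XML_CHAR = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (index, hex code) pairs for every character XML 1.0 forbids."""
    return [
        (match.start(), hex(ord(match.group())))
        for match in _INVALID_XML_CHAR.finditer(text)
    ]
```

Running it on the string taken straight from the StringBuilder, rather than on text copied out of a debugger window, avoids losing the offending character in the copy and paste.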
I'm using IronPython here (same .NET API), and reading the file as UTF-8 so that the BOM is handled properly fixed the problem for me:
xmlFile = Path.Combine(directory_str, 'file.xml')
doc = XPathDocument(XmlTextReader(StreamReader(xmlFile.ToString(), Encoding.UTF8)))
It would work as well with XmlDocument:
doc = XmlDocument()
doc.Load(XmlTextReader(StreamReader(xmlFile.ToString(), Encoding.UTF8)))
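As for why the encoding matters: if UTF-16 text is decoded as a single-byte encoding, every ASCII character is followed by a 0x00 byte, which is one way this error can appear even when the data looks clean. A minimal sketch in plain Python (not the .NET API used above) that illustrates the effect with made-up sample text:

```python
xml_text = "<data></data>"

# UTF-16 LE with a BOM, which is what an editor's "Unicode" option usually writes.
raw = b"\xff\xfe" + xml_text.encode("utf-16-le")

# Wrong: reading the bytes as a single-byte encoding leaves NUL characters in the text.
misread = raw.decode("latin-1")

# Right: the "utf-16" codec consumes the BOM and restores the original text.
correct = raw.decode("utf-16")
```

Here misread contains a NUL after every character of the markup, while correct equals the original string.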
To add to Sonz's answer, the following worked for us.
// Instead of
XmlString = XmlString.Replace("&#x0;", "[0x00]");
// use this (Replace returns a new string, so assign the result)
XmlString = XmlString.Replace("\x00", "[0x00]");
In my case, it took some digging, but found it.
My Context
I'm looking at exception/error logs from the website using Elmah. Elmah returns the state of the server at the time of the exception, in the form of a large XML document. For our reporting engine I pretty-print the XML with XmlWriter.
During a website attack, I noticed that some XML documents weren't parsing and I was receiving this exception: '.', hexadecimal value 0x00, is an invalid character.
NON-RESOLUTION: I converted the document to a byte[] and sanitized it of 0x00, but it found none.
When I scanned the xml document, I found the following:
...
<form>
    ...
    <item name="SomeField">
        <value string="C:\boot.ini&#x0;.htm" />
    </item>
    ...
There was the NUL byte, encoded as the HTML entity &#x0;!
RESOLUTION: To fix the encoding, I replaced the &#x0; value before loading it into my XmlDocument, because loading it will create the NUL byte and it will be difficult to sanitize it from the object. Here's my entire process:
XmlDocument xml = new XmlDocument();
details.Xml = details.Xml.Replace("&#x0;", "[0x00]"); // in my case I want to see it, otherwise just replace with ""
xml.LoadXml(details.Xml);
string formattedXml = null;
// I have this in a helper function, but for this example I have put it in-line
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings {
    OmitXmlDeclaration = true,
    Indent = true,
    IndentChars = "\t",
    NewLineHandling = NewLineHandling.None,
};
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
    xml.Save(writer);
    formattedXml = sb.ToString();
}
LESSON LEARNED: if your incoming data is HTML-encoded on entry, sanitize for illegal bytes using the associated HTML entity.
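The same idea can be sketched so that it covers both forms at once: the literal NUL character and a numeric character reference to code point 0 (such as &#x0; or &#0;). This is a Python illustration, not the Elmah or .NET code from this answer; sanitize_nul and parses are hypothetical helper names, and the "[0x00]" placeholder is simply the one used above.

```python
import re
import xml.etree.ElementTree as ET

# A literal NUL, or a character reference to code point 0 in decimal or hex.
_NUL_PATTERN = re.compile(r"\x00|&#0+;|&#[xX]0+;")


def sanitize_nul(xml_text, placeholder="[0x00]"):
    """Replace literal NULs and NUL character references before parsing."""
    return _NUL_PATTERN.sub(placeholder, xml_text)


def parses(xml_text):
    """Return True if ElementTree accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

With these helpers, a fragment like the value element above is rejected as-is and accepted after sanitize_nul has been applied to the raw string.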
As kind of a late answer:
I've had this problem with SSRS ReportService2005.asmx when uploading a report.
Public Shared Sub CreateReport(ByVal strFileNameAndPath As String, ByVal strReportName As String, ByVal strReportingPath As String, Optional ByVal bOverwrite As Boolean = True)
    Dim rs As SSRS_2005_Administration_WithFOA = New SSRS_2005_Administration_WithFOA
    rs.Credentials = ReportingServiceInterface.GetMyCredentials(strCredentialsURL)
    rs.Timeout = ReportingServiceInterface.iTimeout
    rs.Url = ReportingServiceInterface.strReportingServiceURL
    rs.UnsafeAuthenticatedConnectionSharing = True

    Dim btBuffer As Byte() = Nothing
    Dim rsWarnings As Warning() = Nothing

    Try
        Dim fstrStream As System.IO.FileStream = System.IO.File.OpenRead(strFileNameAndPath)
        btBuffer = New Byte(fstrStream.Length - 1) {}
        fstrStream.Read(btBuffer, 0, CInt(fstrStream.Length))
        fstrStream.Close()
    Catch ex As System.IO.IOException
        Throw New Exception(ex.Message)
    End Try

    Try
        rsWarnings = rs.CreateReport(strReportName, strReportingPath, bOverwrite, btBuffer, Nothing)
        If Not (rsWarnings Is Nothing) Then
            Dim warning As Warning
            For Each warning In rsWarnings
                Log(warning.Message)
            Next warning
        Else
            Log("Report: {0} created successfully with no warnings", strReportName)
        End If
    Catch ex As System.Web.Services.Protocols.SoapException
        Log(ex.Detail.InnerXml.ToString())
    Catch ex As Exception
        Log("Error at creating report. Invalid server name/timeout?" + vbCrLf + vbCrLf + "Error Description: " + vbCrLf + ex.Message)
        Console.ReadKey()
        System.Environment.Exit(1)
    End Try
End Sub ' End Sub CreateReport
The problem occurs when you allocate a byte array that is at least 1 byte larger than the RDL (XML) file.
Specifically, I used a C#-to-VB.NET converter, which converted
btBuffer = new byte[fstrStream.Length];
into
btBuffer = New Byte(fstrStream.Length) {}
But in C# the number denotes the NUMBER OF ELEMENTS in the array, whereas in VB.NET it denotes the UPPER BOUND of the array, so I had an excess byte (left at its default value of 0x00), causing this error.
So the problem's solution is simply:
btBuffer = New Byte(fstrStream.Length - 1) {}
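The effect of the extra byte can be reproduced outside SSRS. The following Python sketch is only an illustration: the rdl bytes are a made-up stand-in for a real report file, and parses is a hypothetical helper. It builds one buffer that is a byte too large and one that is exactly sized, then hands each to an XML parser.

```python
import xml.etree.ElementTree as ET

rdl = b"<Report></Report>"  # stand-in for the bytes of the RDL file

# One slot too many, as with New Byte(fstrStream.Length) {} in VB.NET:
# the unused last slot keeps its default value of 0x00.
oversized = bytearray(len(rdl) + 1)
oversized[:len(rdl)] = rdl

# Exactly as many slots as the file has bytes.
exact = bytearray(len(rdl))
exact[:] = rdl


def parses(data):
    """Return True if ElementTree accepts the bytes, False on a parse error."""
    try:
        ET.fromstring(bytes(data))
        return True
    except ET.ParseError:
        return False
```

The oversized buffer ends in a 0x00 byte and fails to parse, while the exact one parses cleanly.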
I also got the same error in an ASP.NET application when I put some Unicode data (Hindi) in the Web.config file and saved it with "Unicode" encoding.
Saving the Web.config file with "UTF-8" encoding fixed the error for me.
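Re-saving a file as UTF-8 can also be scripted. The Python sketch below assumes the file really is UTF-16 with a BOM, which is what Windows editors usually write for "Unicode"; resave_as_utf8 is a hypothetical helper name. It does not touch any encoding attribute in the XML declaration, which would need to agree with the new encoding.

```python
from pathlib import Path


def resave_as_utf8(path):
    """Rewrite a BOM-prefixed UTF-16 text file in place as UTF-8 without a BOM."""
    file_path = Path(path)
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    text = file_path.read_bytes().decode("utf-16")
    file_path.write_bytes(text.encode("utf-8"))
```

Keeping a backup copy of the original file before running something like this would be prudent, since the rewrite happens in place.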
Source: Stackoverflow.com