I faced a similar issue where XML was containing control characters. After looking into the code, I found that a deprecated class,StringBufferInputStream, was used for reading string content.
http://docs.oracle.com/javase/7/docs/api/java/io/StringBufferInputStream.html
This class does not properly convert characters into bytes. As of JDK 1.1, the preferred way to create a stream from a string is via the StringReader class.
I changed it to ByteArrayInputStream and it worked fine.