[javascript] Unescape HTML entities in Javascript?

I have some Javascript code that communicates with an XML-RPC backend. The XML-RPC returns strings of the form:

&lt;img src='myimage.jpg'&gt;

However, when I use the Javascript to insert the strings into HTML, they render literally. I don't see an image, I literally see the string:

<img src='myimage.jpg'>

My guess is that the HTML is being escaped over the XML-RPC channel.

How can I unescape the string in Javascript? I tried the techniques on this page, unsuccessfully: http://paulschreiber.com/blog/2008/09/20/javascript-how-to-unescape-html-entities/

What are other ways to diagnose the issue?


I tried everything to remove &amp; from a JSON array. None of the above examples worked for me, but https://stackoverflow.com/users/2030321/chris gave a great solution that led me to fix my problem.

var stringtodecode="<B>Hello</B> world<br>";
document.getElementById("decodeIt").innerHTML=stringtodecode;
stringtodecode=document.getElementById("decodeIt").innerText

I did not use it directly, because I did not understand how to insert it into a modal window that was pulling JSON data into an array, but I tried the following based on the example, and it worked:

var modal = document.getElementById('demodal');
$('#ampersandcontent').text(replaceAll(data[0],"&amp;", "&"));

I like it because it is simple and it works, but I am not sure why it is not widely used. I searched high and low for a simple solution. I am still trying to understand the syntax and whether there is any risk in using this; I have not found anything yet.
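The snippet above calls replaceAll without showing its definition. A minimal sketch, assuming it is a three-argument helper that replaces every literal occurrence of one substring with another (the name and argument order come from the call above; the body is only a guess at what such a helper might look like):

```javascript
// Hypothetical helper: the original post does not show how replaceAll is defined.
// Replaces every literal occurrence of `search` in `text` with `replacement`.
function replaceAll(text, search, replacement) {
  // split/join treats `search` as plain text, so "." or "$" need no regex escaping
  return String(text).split(search).join(replacement);
}

console.log(replaceAll('Tom &amp; Jerry &amp; Spike', '&amp;', '&'));
// Tom & Jerry & Spike
```

This only decodes the one entity that is passed in; &lt;, &#39; and the rest are left as they are. Because the result goes through .text() rather than .html(), it is inserted as text and not parsed as markup.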


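A JavaScript solution that catches the common ones:

var map = {amp: '&', lt: '<', gt: '>', quot: '"', '#039': "'"};
// Match one entity at a time; anything not in the map is left unchanged
// instead of being replaced with "undefined".
str = str.replace(/&(#?\w+);/g, (m, c) => Object.prototype.hasOwnProperty.call(map, c) ? map[c] : m);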

this is the reverse of https://stackoverflow.com/a/4835406/2738039
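To illustrate what "the reverse" means here, the following sketch pairs an escape function with its inverse and checks that a round trip returns the input. The escape table is an assumption about what the linked answer does (it is not reproduced on this page), and the function names are made up for the example.

```javascript
// Assumed escape direction; the linked answer is not shown here.
const ESCAPE_MAP = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;' };

// Inverse table, built from ESCAPE_MAP so the two cannot drift apart.
const UNESCAPE_MAP = Object.fromEntries(
  Object.entries(ESCAPE_MAP).map(([ch, entity]) => [entity, ch])
);

function escapeHtml(text) {
  return text.replace(/[&<>"']/g, ch => ESCAPE_MAP[ch]);
}

function unescapeHtml(text) {
  // Single pass: each entity is decoded once, so "&amp;lt;" becomes "&lt;", not "<".
  return text.replace(/&(?:amp|lt|gt|quot|#039);/g, entity => UNESCAPE_MAP[entity]);
}

const original = `<img src='myimage.jpg'> & "more"`;
console.log(escapeHtml(original));
console.log(unescapeHtml(escapeHtml(original)) === original); // true
```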


I use this in my project: inspired by other answers but with an extra secure parameter, can be useful when you deal with decorated characters

var decodeEntities=(function(){

    var el=document.createElement('div');
    return function(str, safeEscape){

        if(str && typeof str === 'string'){

            str=str.replace(/\</g, '&lt;');

            el.innerHTML=str;
            if(el.innerText){

                str=el.innerText;
                el.innerText='';
            }
            else if(el.textContent){

                str=el.textContent;
                el.textContent='';
            }

            if(safeEscape)
                str=str.replace(/\</g, '&lt;');
        }
        return str;
    }
})();

And it's usable like:

var label='safe <b> character &eacute;ntity</b>';
var safehtml='<div title="'+decodeEntities(label)+'">'+decodeEntities(label, true)+'</div>';

element.innerText also does the trick.


To unescape HTML entities* in JavaScript you can use the small library html-escaper: npm install html-escaper

import {unescape} from 'html-escaper';

unescape('escaped string');

Or use the unescape function from Lodash or Underscore, if you are already using one of them.


*) Please note that these functions don't cover all HTML entities, only the most common ones, i.e. &, <, >, ', ". To unescape all HTML entities you can use the he library.


var htmlEnDeCode = (function() {
    var charToEntityRegex,
        entityToCharRegex,
        charToEntity,
        entityToChar;

    function resetCharacterEntities() {
        charToEntity = {};
        entityToChar = {};
        // add the default set
        addCharacterEntities({
            '&amp;'     :   '&',
            '&gt;'      :   '>',
            '&lt;'      :   '<',
            '&quot;'    :   '"',
            '&#39;'     :   "'"
        });
    }

    function addCharacterEntities(newEntities) {
        var charKeys = [],
            entityKeys = [],
            key, echar;
        for (key in newEntities) {
            echar = newEntities[key];
            entityToChar[key] = echar;
            charToEntity[echar] = key;
            charKeys.push(echar);
            entityKeys.push(key);
        }
        charToEntityRegex = new RegExp('(' + charKeys.join('|') + ')', 'g');
        entityToCharRegex = new RegExp('(' + entityKeys.join('|') + '|&#[0-9]{1,5};' + ')', 'g');
    }

    function htmlEncode(value){
        var htmlEncodeReplaceFn = function(match, capture) {
            return charToEntity[capture];
        };

        return (!value) ? value : String(value).replace(charToEntityRegex, htmlEncodeReplaceFn);
    }

    function htmlDecode(value) {
        var htmlDecodeReplaceFn = function(match, capture) {
            return (capture in entityToChar) ? entityToChar[capture] : String.fromCharCode(parseInt(capture.substr(2), 10));
        };

        return (!value) ? value : String(value).replace(entityToCharRegex, htmlDecodeReplaceFn);
    }

    resetCharacterEntities();

    return {
        htmlEncode: htmlEncode,
        htmlDecode: htmlDecode
    };
})();

This is from ExtJS source code.


CMS' answer works fine unless the HTML you want to unescape is longer than 65536 characters. In that case Chrome splits the inner HTML into many child nodes, each at most 65536 characters long, and you need to concatenate them. This function also works for very long strings:

function unencodeHtmlContent(escapedHtml) {
  var elem = document.createElement('div');
  elem.innerHTML = escapedHtml;
  var result = '';
  // Chrome splits innerHTML into many child nodes, each one at most 65536.
  // Whereas FF creates just one single huge child node.
  for (var i = 0; i < elem.childNodes.length; ++i) {
    result = result + elem.childNodes[i].nodeValue;
  }
  return result;
}

See this answer about innerHTML max length for more info: https://stackoverflow.com/a/27545633/694469


For one-line guys:

const htmlDecode = innerHTML => Object.assign(document.createElement('textarea'), {innerHTML}).value;

console.log(htmlDecode('Complicated - Dimitri Vegas &amp; Like Mike'));

You can use Lodash unescape / escape function https://lodash.com/docs/4.17.5#unescape

import unescape from 'lodash/unescape';

const str = unescape('fred, barney, &amp; pebbles');

str will become 'fred, barney, & pebbles'


Closures can avoid creating unnecessary objects.

const decodingHandler = (() => {
  const element = document.createElement('div');
  return text => {
    element.innerHTML = text;
    return element.textContent;
  };
})();

A more concise way

const decodingHandler = (() => {
  const element = document.createElement('div');
  return text => ((element.innerHTML = text), element.textContent);
})();

function decodeHTMLContent(htmlText) {
  var txt = document.createElement("span");
  txt.innerHTML = htmlText;
  return txt.innerText;
}

var result = decodeHTMLContent('One &amp; two &amp; three');
console.log(result);


There is a variant that is 80% as fast as the answers at the very top.

See the benchmark: https://jsperf.com/decode-html12345678/1

console.log(decodeEntities('test: &gt'));

function decodeEntities(str) {
  // reuse one textarea across calls to avoid the overhead of creating it each time
  const el = decodeEntities.element ||
    (decodeEntities.element = document.createElement('textarea'));

  // strip script/html tags
  el.innerHTML = str
    .replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '')
    .replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');

  return el.value;
}

If you need to leave tags, then remove the two .replace(...) calls (you can leave the first one if you do not need scripts).
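To see what the two .replace(...) calls do on their own, here is a sketch that applies the same patterns to a plain string, without touching the DOM. The function name stripTags is made up for illustration. Regex-based stripping like this is best treated as a convenience filter rather than as a sanitizer.

```javascript
// The two patterns from the snippet above, applied to a plain string.
function stripTags(str) {
  return str
    // drop <script> blocks together with their contents
    .replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '')
    // drop any remaining tags but keep the text between them
    .replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');
}

console.log(stripTags('<p title="a>b">Fish &amp; <b>chips</b></p><script>alert(1)</script>'));
// Fish &amp; chips
```

Entities are left untouched at this stage; in the function above, they are decoded afterwards by the textarea.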


Chris' answer is nice and elegant, but it fails if value is undefined. A simple improvement makes it solid:

function htmlDecode(value) {
   return (typeof value === 'undefined') ? '' : $('<div/>').html(value).text();
}

The trick is to use the power of the browser to decode the special HTML characters, but not allow the browser to execute the results as if it was actual html... This function uses a regex to identify and replace encoded HTML characters, one character at a time.

function unescapeHtml(html) {
    var el = document.createElement('div');
    return html.replace(/\&[#0-9a-z]+;/gi, function (enc) {
        el.innerHTML = enc;
        return el.innerText
    });
}

The question doesn't specify the origin of x but it makes sense to defend, if we can, against malicious (or just unexpected, from our own application) input. For example, suppose x has a value of &amp; <script>alert('hello');</script>. A safe and simple way to handle this in jQuery is:

var x    = "&amp; <script>alert('hello');</script>";
var safe = $('<div />').html(x).text();

// => "& alert('hello');"

Found via https://gist.github.com/jmblog/3222899. I can't see many reasons to avoid using this solution given it is at least as short, if not shorter than some alternatives and provides defence against XSS.

(I originally posted this as a comment, but am adding it as an answer since a subsequent comment in the same thread requested that I do so).


First create a <span id="decodeIt" style="display:none;"></span> somewhere in the body

Next, assign the string to be decoded as innerHTML to this:

document.getElementById("decodeIt").innerHTML=stringtodecode

Finally,

stringtodecode=document.getElementById("decodeIt").innerText

Here is the overall code:

var stringtodecode="<B>Hello</B> world<br>";
document.getElementById("decodeIt").innerHTML=stringtodecode;
stringtodecode=document.getElementById("decodeIt").innerText

I was crazy enough to go through and make this function that should be pretty, if not completely, exhaustive:

function removeEncoding(string) {
    // Keys are entity names or numeric references, without the leading "&" and trailing ";".
    // References 128-159 use the Windows-1252 characters that browsers display for them.
    var entities = {
        Agrave: "À", Aacute: "Á", Acirc: "Â", Atilde: "Ã", Auml: "Ä", Aring: "Å",
        agrave: "à", acirc: "â", atilde: "ã", auml: "ä", aring: "å",
        AElig: "Æ", aelig: "æ", szlig: "ß", Ccedil: "Ç", ccedil: "ç",
        Egrave: "È", Eacute: "É", Ecirc: "Ê", Euml: "Ë",
        egrave: "è", eacute: "é", ecirc: "ê", euml: "ë", "#131": "ƒ",
        Igrave: "Ì", Iacute: "Í", Icirc: "Î", Iuml: "Ï",
        igrave: "ì", iacute: "í", icirc: "î", iuml: "ï",
        Ntilde: "Ñ", ntilde: "ñ",
        Ograve: "Ò", Oacute: "Ó", Ocirc: "Ô", Otilde: "Õ", Ouml: "Ö",
        ograve: "ò", oacute: "ó", ocirc: "ô", otilde: "õ", ouml: "ö",
        Oslash: "Ø", oslash: "ø", "#140": "Œ", "#156": "œ", "#138": "Š", "#154": "š",
        Ugrave: "Ù", Uacute: "Ú", Ucirc: "Û", Uuml: "Ü",
        ugrave: "ù", uacute: "ú", ucirc: "û", uuml: "ü",
        "#181": "µ", "#215": "×", Yacute: "Ý", "#159": "Ÿ", yacute: "ý", yuml: "ÿ",
        "#176": "°", "#134": "†", "#135": "‡", lt: "<", gt: ">",
        "#177": "±", "#171": "«", "#187": "»", "#191": "¿", "#161": "¡",
        "#183": "·", "#149": "•", "#153": "™", copy: "©", reg: "®", "#167": "§", "#182": "¶",
        Alpha: "Α", Beta: "Β", Gamma: "Γ", Delta: "Δ", Epsilon: "Ε", Zeta: "Ζ",
        Eta: "Η", Theta: "Θ", Iota: "Ι", Kappa: "Κ", Lambda: "Λ", Mu: "Μ",
        Nu: "Ν", Xi: "Ξ", Omicron: "Ο", Pi: "Π", Rho: "Ρ", Sigma: "Σ",
        Tau: "Τ", Upsilon: "Υ", Phi: "Φ", Chi: "Χ", Psi: "Ψ", Omega: "Ω",
        alpha: "α", beta: "β", gamma: "γ", delta: "δ", epsilon: "ε", zeta: "ζ",
        eta: "η", theta: "θ", iota: "ι", kappa: "κ", lambda: "λ", mu: "μ",
        nu: "ν", xi: "ξ", omicron: "ο", pi: "π", rho: "ρ", sigmaf: "ς", sigma: "σ",
        tau: "τ", phi: "φ", chi: "χ", psi: "ψ", omega: "ω",
        bull: "•", hellip: "…", prime: "′", Prime: "″", oline: "‾", frasl: "⁄",
        weierp: "℘", image: "ℑ", real: "ℜ", trade: "™", alefsym: "ℵ",
        larr: "←", uarr: "↑", rarr: "→", darr: "↓", harr: "↔", crarr: "↵",
        lArr: "⇐", uArr: "⇑", rArr: "⇒", dArr: "⇓", hArr: "⇔",
        forall: "∀", part: "∂", exist: "∃", empty: "∅", nabla: "∇",
        isin: "∈", notin: "∉", ni: "∋", prod: "∏", sum: "∑",
        minus: "−", lowast: "∗", radic: "√", prop: "∝", infin: "∞",
        OElig: "Œ", oelig: "œ", Yuml: "Ÿ",
        spades: "♠", clubs: "♣", hearts: "♥", diams: "♦",
        thetasym: "ϑ", upsih: "ϒ", piv: "ϖ", Scaron: "Š", scaron: "š",
        ang: "∠", and: "∧", or: "∨", cap: "∩", cup: "∪", int: "∫", there4: "∴",
        sim: "∼", cong: "≅", asymp: "≈", ne: "≠", equiv: "≡", le: "≤", ge: "≥",
        sub: "⊂", sup: "⊃", nsub: "⊄", sube: "⊆", supe: "⊇",
        oplus: "⊕", otimes: "⊗", perp: "⊥", sdot: "⋅",
        lceil: "⌈", rceil: "⌉", lfloor: "⌊", rfloor: "⌋", lang: "⟨", rang: "⟩", loz: "◊",
        "#039": "'", amp: "&", quot: "\""
    };
    // One pass over the string; entities that are not in the table are left unchanged.
    return string.replace(/&(#?\w+);/g, function (match, name) {
        return Object.prototype.hasOwnProperty.call(entities, name) ? entities[name] : match;
    });
}

Used like so:

let decodedText = removeEncoding("Ich hei&szlig;e David");
console.log(decodedText);

Prints: Ich heiße David

P.S. this took like an hour and a half to make.


You're welcome...just a messenger...full credit goes to ourcodeworld.com, link below.

window.htmlentities = {
        /**
         * Converts a string to its html characters completely.
         *
         * @param {String} str String with unescaped HTML characters
         **/
        encode : function(str) {
            var buf = [];

            for (var i=str.length-1;i>=0;i--) {
                buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
            }

            return buf.join('');
        },
        /**
         * Converts an html characterSet into its original character.
         *
         * @param {String} str htmlSet entities
         **/
        decode : function(str) {
            return str.replace(/&#(\d+);/g, function(match, dec) {
                return String.fromCharCode(dec);
            });
        }
    };

Full Credit: https://ourcodeworld.com/articles/read/188/encode-and-decode-html-entities-using-pure-javascript
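The decode function above handles decimal references only and relies on String.fromCharCode, which cannot produce characters above U+FFFF in one call. As a sketch of how that could be extended (the function name is made up, and this is not part of the linked article), the following also accepts hexadecimal references and uses String.fromCodePoint:

```javascript
// Sketch: decode decimal (&#233;) and hexadecimal (&#xE9;) character references.
function decodeNumericEntities(str) {
  return str.replace(/&#(x[0-9a-f]+|\d+);/gi, (match, ref) => {
    const isHex = ref[0] === 'x' || ref[0] === 'X';
    const codePoint = parseInt(isHex ? ref.slice(1) : ref, isHex ? 16 : 10);
    // Leave references outside the Unicode range untouched rather than throwing.
    if (codePoint > 0x10FFFF) return match;
    // fromCodePoint (unlike fromCharCode) handles code points above 0xFFFF.
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeNumericEntities('J&#246;rg &#xFC;ber &#128512;'));
// Jörg über 😀
```

Unlike a browser, this sketch does not remap references in the 128-159 range to their Windows-1252 characters, and it ignores named entities entirely.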


All of the other answers here have problems.

The document.createElement('div') methods (including those using jQuery) execute any javascript passed into it (a security issue) and the DOMParser.parseFromString() method trims whitespace. Here is a pure javascript solution that has neither problem:

function htmlDecode(html) {
    var textarea = document.createElement("textarea");
    html= html.replace(/\r/g, String.fromCharCode(0xe000)); // Replace "\r" with reserved unicode character.
    textarea.innerHTML = html;
    var result = textarea.value;
    return result.replace(new RegExp(String.fromCharCode(0xe000), 'g'), '\r');
}

TextArea is used specifically to avoid executing JS code. It passes these:

htmlDecode('&lt;&amp;&nbsp;&gt;'); // returns "<& >" with non-breaking space.
htmlDecode('  '); // returns "  "
htmlDecode('<img src="dummy" onerror="alert(\'xss\')">'); // Does not execute alert()
htmlDecode('\r\n') // returns "\r\n", doesn't lose the \r like other solutions.

Most answers given here have a huge disadvantage: if the string you are trying to convert isn't trusted then you will end up with a Cross-Site Scripting (XSS) vulnerability. For the function in the accepted answer, consider the following:

htmlDecode("<img src='dummy' onerror='alert(/xss/)'>");

The string here contains an unescaped HTML tag, so instead of decoding anything the htmlDecode function will actually run JavaScript code specified inside the string.

This can be avoided by using DOMParser which is supported in all modern browsers:

function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(  htmlDecode("&lt;img src='myimage.jpg'&gt;")  )
// "<img src='myimage.jpg'>"

console.log(  htmlDecode("<img src='dummy' onerror='alert(/xss/)'>")  )
// ""

This function is guaranteed to not run any JavaScript code as a side-effect. Any HTML tags will be ignored, only text content will be returned.

Compatibility note: Parsing HTML with DOMParser requires at least Chrome 30, Firefox 12, Opera 17, Internet Explorer 10, Safari 7.1 or Microsoft Edge. So all browsers without support are way past their EOL and as of 2017 the only ones that can still be seen in the wild occasionally are older Internet Explorer and Safari versions (usually these still aren't numerous enough to bother).


Mathias Bynens has a library for this: https://github.com/mathiasbynens/he

Example:

console.log(
    he.decode("J&#246;rg &amp J&#xFC;rgen rocked to &amp; fro ")
);
// Logs "Jörg & Jürgen rocked to & fro"

I suggest favouring it over hacks involving setting an element's HTML content and then reading back its text content. Such approaches can work, but are deceptively dangerous and present XSS opportunities if used on untrusted user input.

If you really can't bear to load in a library, you can use the textarea hack described in this answer to a near-duplicate question, which, unlike various similar approaches that have been suggested, has no security holes that I know of:

function decodeEntities(encodedString) {
    var textArea = document.createElement('textarea');
    textArea.innerHTML = encodedString;
    return textArea.value;
}

console.log(decodeEntities('1 &amp; 2')); // '1 & 2'

But take note of the security issues, affecting similar approaches to this one, that I list in the linked answer! This approach is a hack, and future changes to the permissible content of a textarea (or bugs in particular browsers) could lead to code that relies upon it suddenly having an XSS hole one day.


Not a direct response to your question, but wouldn't it be better for your RPC to return some structure (be it XML or JSON or whatever) with those image data (urls in your example) inside that structure?

Then you could just parse it in your javascript and build the <img> using javascript itself.

The structure you receive from RPC could look like:

{"img" : ["myimage.jpg", "myimage2.jpg"]}

I think it's better this way, as injecting code that comes from an external source into your page doesn't look very secure. Imagine someone hijacking your XML-RPC script and putting something you wouldn't want in there (even some JavaScript...)
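A minimal sketch of that idea, assuming the response has the shape shown above. The function name buildImages and the container id "gallery" are made up for illustration, and the document object is passed in as a parameter so the function does not depend on a global.

```javascript
// Sketch: turn the JSON structure above into <img> elements.
// `doc` is a document-like object; in a browser, pass `document`.
function buildImages(json, doc) {
  const data = JSON.parse(json);
  return data.img.map(url => {
    const img = doc.createElement('img');
    // Setting the attribute treats the URL as data; it is never parsed as HTML.
    img.setAttribute('src', url);
    return img;
  });
}

// In a page (assuming a container element with id "gallery" exists):
// buildImages(responseText, document).forEach(img =>
//   document.getElementById('gallery').appendChild(img));
```

With this approach no markup crosses the RPC channel, so there is nothing to unescape on the client.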


jQuery will encode and decode for you. However, you need to use a textarea tag, not a div.

var str1 = 'One & two & three';
var str2 = "One &amp; two &amp; three";

$(document).ready(function() {
   $("#encoded").text(htmlEncode(str1));
   $("#decoded").text(htmlDecode(str2));
});

function htmlDecode(value) {
  return $("<textarea/>").html(value).text();
}

function htmlEncode(value) {
  return $('<textarea/>').text(value).html();
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>

<div id="encoded"></div>
<div id="decoded"></div>


This is the most comprehensive solution I've tried so far:

const STANDARD_HTML_ENTITIES = {
    nbsp: String.fromCharCode(160),
    amp: "&",
    quot: '"',
    lt: "<",
    gt: ">"
};

const replaceHtmlEntities = plainTextString => {
    return plainTextString
        .replace(/&#(\d+);/g, (match, dec) => String.fromCharCode(dec))
        .replace(
            /&(nbsp|amp|quot|lt|gt);/g,
            (a, b) => STANDARD_HTML_ENTITIES[b]
        );
};

In case you're looking for it, like me: in the meantime there is a nice and safe jQuery method.

https://api.jquery.com/jquery.parsehtml/

For example, you can type this in your console:

var x = "test &amp;";
> undefined
$.parseHTML(x)[0].textContent
> "test &"

So $.parseHTML(x) returns an array, and if you have HTML markup within your text, the array.length will be greater than 1.


Do you need to decode all encoded HTML entities or just &amp; itself?

If you only need to handle &amp; then you can do this:

var decoded = encoded.replace(/&amp;/g, '&');

If you need to decode all HTML entities then you can do it without jQuery:

var elem = document.createElement('textarea');
elem.innerHTML = encoded;
var decoded = elem.value;

Please take note of Mark's comments below which highlight security holes in an earlier version of this answer and recommend using textarea rather than div to mitigate against potential XSS vulnerabilities. These vulnerabilities exist whether you use jQuery or plain JavaScript.
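If the single-entity replace above is extended by chaining more .replace() calls, the order starts to matter: decoding &amp; first can create new entities that a later call then decodes a second time. A small sketch of the difference (both function names are made up for the example):

```javascript
// Decoding "&amp;" before the others decodes some input twice.
function decodeAmpFirst(s) {
  return s.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>');
}

// Decoding "&amp;" last keeps every entity to a single round of decoding.
function decodeAmpLast(s) {
  return s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');
}

const encoded = 'Write &amp;lt; to show a less-than sign';
console.log(decodeAmpFirst(encoded)); // Write < to show a less-than sign
console.log(decodeAmpLast(encoded));  // Write &lt; to show a less-than sign
```

The textarea approach does not have this problem, because the browser's parser decodes each reference exactly once.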


If you're using jQuery:

function htmlDecode(value){ 
  return $('<div/>').html(value).text(); 
}

Otherwise, use Strictly Software's Encoder Object, which has an excellent htmlDecode() function.


A more modern option for interpreting HTML (text and otherwise) from JavaScript is the HTML support in the DOMParser API (see here in MDN). This allows you to use the browser's native HTML parser to convert a string to an HTML document. It has been supported in new versions of all major browsers since late 2014.

If we just want to decode some text content, we can put it as the sole content in a document body, parse the document, and pull out its .body.textContent.

var encodedStr = 'hello &amp; world';

var parser = new DOMParser;
var dom = parser.parseFromString(
    '<!doctype html><body>' + encodedStr,
    'text/html');
var decodedString = dom.body.textContent;

console.log(decodedString);

We can see in the draft specification for DOMParser that JavaScript is not enabled for the parsed document, so we can perform this text conversion without security concerns.

The parseFromString(str, type) method must run these steps, depending on type:

  • "text/html"

    Parse str with an HTML parser, and return the newly created Document.

    The scripting flag must be set to "disabled".

    NOTE

    script elements get marked unexecutable and the contents of noscript get parsed as markup.

It's beyond the scope of this question, but please note that if you're taking the parsed DOM nodes themselves (not just their text content) and moving them to the live document DOM, it's possible that their scripting would be reenabled, and there could be security concerns. I haven't researched it, so please exercise caution.

