Say I get some JSON back from a service request that looks like this:
{
"message": "We're unable to complete your request at this time."
}
I'm not sure why that apostraphe is encoded like that ('
); all I know is that I want to decode it.
Here's one approach using jQuery that popped into my head:
function decodeHtml(html) {
return $('<div>').html(html).text();
}
That seems (very) hacky, though. What's a better way? Is there a "right" way?
This question is related to
javascript
jquery
html-entities
This is so good answer. You can use this with angular like this:
moduleDefinitions.filter('sanitize', ['$sce', function($sce) {
return function(htmlCode) {
var txt = document.createElement("textarea");
txt.innerHTML = htmlCode;
return $sce.trustAsHtml(txt.value);
}
}]);
Don’t use the DOM to do this. Using the DOM to decode HTML entities (as suggested in the currently accepted answer) leads to differences in cross-browser results.
For a robust & deterministic solution that decodes character references according to the algorithm in the HTML Standard, use the he library. From its README:
he (for “HTML entities”) is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would, has an extensive test suite, and — contrary to many other JavaScript solutions — he handles astral Unicode symbols just fine. An online demo is available.
Here’s how you’d use it:
he.decode("We're unable to complete your request at this time.");
? "We're unable to complete your request at this time."
Disclaimer: I'm the author of the he library.
See this Stack Overflow answer for some more info.
jQuery will encode and decode for you.
function htmlDecode(value) {_x000D_
return $("<textarea/>").html(value).text();_x000D_
}_x000D_
_x000D_
function htmlEncode(value) {_x000D_
return $('<textarea/>').text(value).html();_x000D_
}
_x000D_
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>_x000D_
<script>_x000D_
$(document).ready(function() {_x000D_
$("#encoded")_x000D_
.text(htmlEncode("<img src onerror='alert(0)'>"));_x000D_
$("#decoded")_x000D_
.text(htmlDecode("<img src onerror='alert(0)'>"));_x000D_
});_x000D_
</script>_x000D_
_x000D_
<span>htmlEncode() result:</span><br/>_x000D_
<div id="encoded"></div>_x000D_
<br/>_x000D_
<span>htmlDecode() result:</span><br/>_x000D_
<div id="decoded"></div>
_x000D_
_.unescape
does what you're looking for
There's JS function to deal with &#xxxx styled entities:
function at GitHub
// encode(decode) html text into html entity
var decodeHtmlEntity = function(str) {
return str.replace(/&#(\d+);/g, function(match, dec) {
return String.fromCharCode(dec);
});
};
var encodeHtmlEntity = function(str) {
var buf = [];
for (var i=str.length-1;i>=0;i--) {
buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
}
return buf.join('');
};
var entity = '高级程序设计';
var str = '??????';
console.log(decodeHtmlEntity(entity) === str);
console.log(encodeHtmlEntity(str) === entity);
// output:
// true
// true
If you don't want to use html/dom, you could use regex. I haven't tested this; but something along the lines of:
function parseHtmlEntities(str) {
return str.replace(/&#([0-9]{1,3});/gi, function(match, numStr) {
var num = parseInt(numStr, 10); // read num as normal number
return String.fromCharCode(num);
});
}
Note: this would only work for numeric html-entities, and not stuff like &oring;.
Fixed the function (some typos), test here: http://jsfiddle.net/Be2Bd/1/
Source: Stackoverflow.com