HTML-encoding lost when attribute read from input field

Question

I'm using JavaScript to pull a value out from a hidden field and display it in a textbox. The value in the hidden field is encoded.

For example,

    <input id='hiddenId' type='hidden' value='chalk &amp; cheese' />

gets pulled into

    <input type='text' value='chalk &amp; cheese' />

via some jQuery to get the value from the hidden field (it's at this point that I lose the encoding):

    $('#hiddenId').attr('value')

The problem is that when I read chalk &amp; cheese from the hidden field, JavaScript seems to lose the encoding. I do not want the value to be chalk & cheese. I want the literal amp; to be retained.

Is there a JavaScript library or a jQuery method that will HTML-encode a string?

User · Answer

My pure-JS function:

    /**
     * HTML entities encode
     *
     * @param {string} str Input text
     * @return {string} Filtered text
     */
    function htmlencode (str) {
        var div = document.createElement('div');
        div.appendChild(document.createTextNode(str));
        return div.innerHTML;
    }

JavaScript HTML Entities Encode & Decode

User · Answer

    // HtmlEncodes the given value
    var htmlEncodeContainer = $('<div />');
    function htmlEncode(value) {
        if (value) {
            return htmlEncodeContainer.text(value).html();
        } else {
            return '';
        }
    }

User · Answer

    <script>
    String.prototype.htmlEncode = function () {
        return String(this)
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
    };

    // The closing tag is written as <\/script> so that it does not
    // terminate the surrounding script element early.
    var aString = '<script>alert("I hack your site")<\/script>';
    console.log(aString.htmlEncode());
    </script>

Will output: &lt;script&gt;alert(&quot;I hack your site&quot;)&lt;/script&gt;

htmlEncode() will be accessible on all strings once defined.

User · Answer

    var htmlEnDeCode = (function() {
        var charToEntityRegex,
            entityToCharRegex,
            charToEntity,
            entityToChar;

        function resetCharacterEntities() {
            charToEntity = {};
            entityToChar = {};
            // add the default set
            addCharacterEntities({
                '&amp;'  : '&',
                '&gt;'   : '>',
                '&lt;'   : '<',
                '&quot;' : '"',
                '&#39;'  : "'"
            });
        }

        function addCharacterEntities(newEntities) {
            var charKeys = [],
                entityKeys = [],
                key, echar;
            for (key in newEntities) {
                echar = newEntities[key];
                entityToChar[key] = echar;
                charToEntity[echar] = key;
                charKeys.push(echar);
                entityKeys.push(key);
            }
            charToEntityRegex = new RegExp('(' + charKeys.join('|') + ')', 'g');
            entityToCharRegex = new RegExp('(' + entityKeys.join('|') + '|&#[0-9]{1,5};' + ')', 'g');
        }

        function htmlEncode(value) {
            var htmlEncodeReplaceFn = function(match, capture) {
                return charToEntity[capture];
            };

            return (!value) ? value : String(value).replace(charToEntityRegex, htmlEncodeReplaceFn);
        }

        function htmlDecode(value) {
            var htmlDecodeReplaceFn = function(match, capture) {
                return (capture in entityToChar) ? entityToChar[capture] : String.fromCharCode(parseInt(capture.substr(2), 10));
            };

            return (!value) ? value : String(value).replace(entityToCharRegex, htmlDecodeReplaceFn);
        }

        resetCharacterEntities();

        return {
            htmlEncode: htmlEncode,
            htmlDecode: htmlDecode
        };
    })();

This is from ExtJS source code.

User · Answer

The jQuery trick doesn't encode quote marks and in IE it will strip your whitespace.

Based on the escape templatetag in Django, which I guess is heavily used/tested already, I made this function which does what's needed.

It's arguably simpler (and possibly faster) than any of the workarounds for the whitespace-stripping issue - and it encodes quote marks, which is essential if you're going to use the result inside an attribute value for example.

    function htmlEscape(str) {
        return str
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
    }

    // I needed the opposite function today, so adding here too:
    function htmlUnescape(str) {
        return str
            .replace(/&quot;/g, '"')
            .replace(/&#39;/g, "'")
            .replace(/&lt;/g, '<')
            .replace(/&gt;/g, '>')
            .replace(/&amp;/g, '&');
    }

Update 2013-06-17: In the search for the fastest escaping I have found this implementation of a replaceAll method: http://dumpsite.com/forum/index.php?topic=4.msg29#msg29 (also referenced here: Fastest method to replace all instances of a character in a string). Some performance results here: http://jsperf.com/htmlencoderegex/25

It gives identical result string to the builtin replace chains above. I'd be very happy if someone could explain why it's faster!?

Update 2015-03-04: I just noticed that AngularJS are using exactly the method above: https://github.com/angular/angular.js/blob/v1.3.14/src/ngSanitize/sanitize.js#L435

They add a couple of refinements - they appear to be handling an obscure Unicode issue as well as converting all non-alphanumeric characters to entities. I was under the impression the latter was not necessary as long as you have a UTF-8 charset specified for your document.

I will note that (4 years later) Django still does not do either of these things, so I'm not sure how important they are: https://github.com/django/django/blob/1.8b1/django/utils/html.py#L44

Update 2016-04-06: You may also wish to escape forward-slash /. This is not required for correct HTML encoding, however it is recommended by OWASP as an anti-XSS safety measure. (Thanks to @JNF for suggesting this in comments.)

            .replace(/\//g, '&#x2F;');
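
The order of the replacements matters in both directions: the ampersand is escaped first and unescaped last. Here is a minimal sketch of why, assuming the same five-entity mapping as in the answer. The names escapeHtml, unescapeHtml and unescapeHtmlWrongOrder are illustrative only and do not come from any library.

```javascript
// Illustrative sketch; all function names here are hypothetical.
function escapeHtml(str) {
    return String(str)
        .replace(/&/g, '&amp;')   // runs first, so entities added below are not re-escaped
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

function unescapeHtml(str) {
    return String(str)
        .replace(/&quot;/g, '"')
        .replace(/&#39;/g, "'")
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&amp;/g, '&');  // runs last, so "&amp;lt;" is decoded only once
}

// Deliberately wrong ordering, to show the failure mode.
function unescapeHtmlWrongOrder(str) {
    return String(str)
        .replace(/&amp;/g, '&')
        .replace(/&lt;/g, '<');
}

var original = 'chalk & cheese <b>"x"</b> &lt;';
var escaped = escapeHtml(original);
var roundTrip = unescapeHtml(escaped);

console.log(escaped);
console.log(roundTrip === original);
console.log(unescapeHtml('&amp;lt;'));           // one level of decoding
console.log(unescapeHtmlWrongOrder('&amp;lt;')); // decodes twice by accident
```

With the wrong order, text that was itself an entity (such as a literal "&lt;" typed by a user) does not survive a round trip.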

User · Answer

I had a similar problem and solved it using the JavaScript function encodeURIComponent (documentation).

For example, in your case if you use:

    <input id='hiddenId' type='hidden' value='chalk & cheese' />

and

    encodeURIComponent($('#hiddenId').attr('value'))

you will get chalk%20%26%20cheese. Even spaces are kept.

In my case, I had to encode one slash and this code works perfectly:

    encodeURIComponent('name/surname')

and I got name%2Fsurname.
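
Note that encodeURIComponent produces percent-encoding meant for URLs, which is a different scheme from HTML entities. The sketch below contrasts the two. The escapeHtml helper is a hypothetical, minimal five-character escaper written only for this comparison.

```javascript
// Hypothetical minimal HTML escaper, for comparison only.
function escapeHtml(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

var value = 'chalk & cheese';

var urlEncoded = encodeURIComponent(value);  // percent-encoding, for URLs
var htmlEncoded = escapeHtml(value);         // entity encoding, for markup

console.log(urlEncoded);                     // chalk%20%26%20cheese
console.log(htmlEncoded);                    // chalk &amp; cheese
console.log(decodeURIComponent(urlEncoded) === value);
console.log(encodeURIComponent('name/surname')); // name%2Fsurname
```

A browser will not turn %26 back into an ampersand when it appears in an attribute value or text node, so the percent-encoded form is only useful if the receiving side calls decodeURIComponent.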

User · Answer

As far as I know there isn't any straightforward HTML encode/decode method in JavaScript.

However, what you can do is use JS to create an arbitrary element, set its inner text, then read it back using innerHTML.

Let's say, with jQuery, this should work:

    // Build the element first, then set the text; passing the raw string
    // straight to $() would treat it as a selector.
    var helper = $('<div>').text('chalk & cheese').hide().appendTo('body');
    var htmled = helper.html();
    helper.remove();

Or something along these lines.

User · Answer

For those who prefer plain JavaScript, here is the method I have used successfully:

    function escapeHTML (str) {
        var div = document.createElement('div');
        var text = document.createTextNode(str);
        div.appendChild(text);
        return div.innerHTML;
    }

User · Answer

Using some of the other answers here I made a version that replaces all the pertinent characters in one pass irrespective of the number of distinct encoded characters (only one call to replace()), so it will be faster for larger strings.

It doesn't rely on the DOM API to exist or on other libraries.

    window.encodeHTML = (function () {
        // Escape characters that are special inside a regular expression.
        function escapeRegex(s) {
            return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
        }
        var encodings = {
            '&'  : '&amp;',
            '"'  : '&quot;',
            '\'' : '&#39;',
            '<'  : '&lt;',
            '>'  : '&gt;',
            '/'  : '&#x2F;'
        };
        function encode(what) { return encodings[what]; }
        var specialChars = new RegExp('[' +
            escapeRegex(Object.keys(encodings).join('')) +
        ']', 'g');
        return function (text) { return text.replace(specialChars, encode); };
    })();

Having run that once, you can now call

    encodeHTML('<>&"\'')

to get &lt;&gt;&amp;&quot;&#39;

User · Answer

I ran into some issues with backslash in my Domain\User string.

I added this to the other escapes from Anentropic's answer:

    .replace(/\\/g, '&#92;')

Which I found here: How to escape backslash in JavaScript?
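
As a sketch of how that extra replacement slots into the chain, assuming the five-character escape from the earlier answer (htmlEscapeWithBackslash is a hypothetical name chosen for this example):

```javascript
// Hypothetical helper: the usual five escapes plus the backslash.
function htmlEscapeWithBackslash(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/\\/g, '&#92;'); // 92 is the character code of the backslash
}

// In a JavaScript string literal the backslash itself must be doubled.
var account = 'DOMAIN\\user';
console.log(htmlEscapeWithBackslash(account)); // DOMAIN&#92;user
```

Because the ampersand replacement runs first, the "&" introduced by "&#92;" is not escaped a second time.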

User · Answer

    function encodeHTML(str) {
        return document.createElement("a").appendChild(
            document.createTextNode(str)).parentNode.innerHTML;
    }

    function decodeHTML(str) {
        var element = document.createElement("a");
        element.innerHTML = str;
        return element.textContent;
    }

    var str = "<";
    var enc = encodeHTML(str);
    var dec = decodeHTML(enc);
    console.log("str: " + str, "\nenc: " + enc, "\ndec: " + dec);

User · Answer

Here is a simple JavaScript solution. It extends the String object with a method HTMLEncode, which can be used on an object without a parameter, or with a parameter.

    String.prototype.HTMLEncode = function (str) {
        var result = "";
        // Encode the argument if one was passed, otherwise the string itself.
        var str = (arguments.length === 1) ? str : this;
        for (var i = 0; i < str.length; i++) {
            var chrcode = str.charCodeAt(i);
            // Characters above code 128 become numeric entities.
            result += (chrcode > 128) ? "&#" + chrcode + ";" : str.substr(i, 1);
        }
        return result;
    };

    // TEST
    console.log("stetaewteaw".HTMLEncode());
    console.log("stetaewteaw".HTMLEncode("stetaewteaw"));

I have made a gist: HTMLEncode method for javascript

User · Answer

FWIW, the encoding is not being lost. The encoding is used by the markup parser (the browser) during page load. Once the source has been read and parsed and the browser has the DOM loaded into memory, each entity has been turned into the character it represents. So by the time your JS executes and reads anything from memory, the character it gets is what the encoding represented.

I may be operating strictly on semantics here, but I wanted you to understand the purpose of encoding. The word "lost" makes it sound like something isn't working like it should.
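
To get the literal entity text back, the already-decoded value has to be encoded again. Here is a minimal sketch that runs outside the browser. It assumes domValue stands in for whatever $('#hiddenId').attr('value') returns after parsing; both domValue and reEncode are hypothetical names.

```javascript
// What the markup contained:       value='chalk &amp; cheese'
// What the parser keeps in memory: chalk & cheese
var domValue = 'chalk & cheese'; // stand-in for the value read from the DOM

// Hypothetical helper that re-applies the encoding the parser removed.
function reEncode(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

console.log(domValue);           // chalk & cheese
console.log(reEncode(domValue)); // chalk &amp; cheese
```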

User · Answer

Here's a non-jQuery version that is considerably faster than both the jQuery .html() version and the .replace() version. This preserves all whitespace, but like the jQuery version, doesn't handle quotes.

    function htmlEncode( html ) {
        return document.createElement( 'a' ).appendChild(
            document.createTextNode( html ) ).parentNode.innerHTML;
    };

Speed: http://jsperf.com/htmlencoderegex/17

Script:

    function htmlEncode( html ) {
        return document.createElement( 'a' ).appendChild(
            document.createTextNode( html ) ).parentNode.innerHTML;
    };

    function htmlDecode( html ) {
        var a = document.createElement( 'a' ); a.innerHTML = html;
        return a.textContent;
    };

    document.getElementById( 'text' ).value = htmlEncode( document.getElementById( 'hidden' ).value );

    // sanity check
    var html = '<div>   &amp; hello</div>';
    document.getElementById( 'same' ).textContent =
          'html === htmlDecode( htmlEncode( html ) ): '
        + ( html === htmlDecode( htmlEncode( html ) ) );

HTML:

    <input id="hidden" type="hidden" value="chalk    &amp; cheese" />
    <input id="text" value="" />
    <div id="same"></div>

User · Answer

If you want to use jQuery, I found this:

http://www.jquerysdk.com/api/jQuery.htmlspecialchars

(part of the jquery.string plugin offered by jQuery SDK)

The problem with Prototype, I believe, is that it extends base objects in JavaScript and will be incompatible with any jQuery you may have used. Of course, if you are already using Prototype and not jQuery, it won't be a problem.

EDIT: Also there is this, which is a port of Prototype's string utilities for jQuery:

http://stilldesigning.com/dotstring/

User · Answer

Good answer. Note that if the value to encode is undefined or null with jQuery 1.4.2 you might get errors such as:

    jQuery("<div/>").text(value).html is not a function

OR

    Uncaught TypeError: Object has no method 'html'

The solution is to modify the function to check for an actual value:

    function htmlEncode(value) {
        if (value) {
            return jQuery('<div/>').text(value).html();
        } else {
            return '';
        }
    }

User · Answer

Prototype has it built into the String class. So if you are using or plan to use Prototype, it does something like:

    '<div class="article">This is an article</div>'.escapeHTML();
    // -> "&lt;div class="article"&gt;This is an article&lt;/div&gt;"

User · Answer

You shouldn't have to escape/encode values in order to shuttle them from one input field to another.

    <form>
     <input id="button" type="button" value="Click me">
     <input type="hidden" id="hiddenId" name="hiddenId" value="I like cheese">
     <input type="text" id="output" name="output">
    </form>
    <script>
        $(document).ready(function(e) {
            $('#button').click(function(e) {
                $('#output').val($('#hiddenId').val());
            });
        });
    </script>

JS doesn't go inserting raw HTML or anything; it just tells the DOM to set the value property (or attribute; not sure). Either way, the DOM handles any encoding issues for you. Unless you're doing something odd like using document.write or eval, HTML-encoding will be effectively transparent.

If you're talking about generating a new textbox to hold the result, it's still as easy. Just pass the static part of the HTML to jQuery, and then set the rest of the properties/attributes on the object it returns to you.

    $box = $('<input type="text" name="whatever">').val($('#hiddenId').val());

User · Answer

Underscore provides _.escape() and _.unescape() methods that do this.

    > _.unescape( "chalk &amp; cheese" );
      "chalk & cheese"

    > _.escape( "chalk & cheese" );
      "chalk &amp; cheese"

User · Answer

This picks out what escapeHTML() does in prototype.js. Adding this script gives you escapeHTML:

    String.prototype.escapeHTML = function() {
        return this.replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;')
    }

Now you can call the escapeHTML method on strings in your script, like:

    var escapedString = "<h1>this is HTML</h1>".escapeHTML();
    // gives: "&lt;h1&gt;this is HTML&lt;/h1&gt;"

Hope it helps anyone looking for a simple solution without having to include the entire prototype.js.

User · Answer

Faster without jQuery. You can encode every character in your string:

    function encode(e){return e.replace(/[^]/g,function(e){return"&#"+e.charCodeAt(0)+";"})}

Or just target the main characters to worry about (&, linebreaks, <, >, " and ') like:

    function encode(r){
    return r.replace(/[\x26\x0A\<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"})
    }

    test.value=encode('Encode HTML entities\n\nSafe "escape" <script id=\'\'> & useful in <pre> tags');

    testing.innerHTML=test.value;

    /*************
    * \x26 is &ampersand (it has to be first),
    * \x0A is newline,
    *************/

    <textarea id=test rows="9" cols="55"></textarea>

    <div id="testing">www.WHAK.com</div>

User · Answer

I know this is an old one, but I wanted to post a variation of the accepted answer that will work in IE without removing lines:

    function multiLineHtmlEncode(value) {
        var lines = value.split(/\r\n|\r|\n/);
        for (var i = 0; i < lines.length; i++) {
            lines[i] = htmlEncode(lines[i]);
        }
        return lines.join('\r\n');
    }

    function htmlEncode(value) {
        return $('<div/>').text(value).html();
    }

User · Answer

EDIT: This answer was posted long ago, and the htmlDecode function introduced an XSS vulnerability. It has been modified, changing the temporary element from a div to a textarea to reduce the XSS risk. But nowadays, I would encourage you to use the DOMParser API as suggested in another answer.

I use these functions:

    function htmlEncode(value){
      // Create an in-memory element, set its inner text (which is automatically encoded).
      // Then grab the encoded contents back out. The element never exists on the DOM.
      return $('<textarea/>').text(value).html();
    }

    function htmlDecode(value){
      return $('<textarea/>').html(value).text();
    }

Basically a textarea element is created in memory, but it is never appended to the document.

In the htmlEncode function I set the innerText of the element and retrieve the encoded innerHTML; in the htmlDecode function I set the innerHTML value of the element and the innerText is retrieved.

User · Answer

Here's a little bit that emulates the Server.HTMLEncode function from Microsoft's ASP, written in pure JavaScript:

    function htmlEncode(s) {
      var ntable = {
        "&": "amp",
        "<": "lt",
        ">": "gt",
        "\"": "quot"
      };
      s = s.replace(/[&<>"]/g, function(ch) {
        return "&" + ntable[ch] + ";";
      })
      s = s.replace(/[^ -\x7e]/g, function(ch) {
        return "&#" + ch.charCodeAt(0).toString() + ";";
      });
      return s;
    }

The result does not encode apostrophes, but encodes the other HTML specials and any character outside the 0x20-0x7e range.

User · Answer

Based on angular's sanitize... (es6 module syntax)

    // ref: https://github.com/angular/angular.js/blob/v1.3.14/src/ngSanitize/sanitize.js
    const SURROGATE_PAIR_REGEXP = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
    const NON_ALPHANUMERIC_REGEXP = /([^\#-~| |!])/g;

    const decodeElem = document.createElement('pre');

    /**
     * Decodes html encoded text, so that the actual string may
     * be used.
     * @param value
     * @returns {string} decoded text
     */
    export function decode(value) {
      if (!value) return '';
      decodeElem.innerHTML = value.replace(/</g, '&lt;');
      return decodeElem.textContent;
    }

    /**
     * Encodes all potentially dangerous characters, so that the
     * resulting string can be safely inserted into attribute or
     * element text.
     * @param value
     * @returns {string} encoded text
     */
    export function encode(value) {
      if (value === null || value === undefined) return '';
      return String(value).
        replace(/&/g, '&amp;').
        replace(SURROGATE_PAIR_REGEXP, value => {
          var hi = value.charCodeAt(0);
          var low = value.charCodeAt(1);
          return '&#' + (((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000) + ';';
        }).
        replace(NON_ALPHANUMERIC_REGEXP, value => {
          return '&#' + value.charCodeAt(0) + ';';
        }).
        replace(/</g, '&lt;').
        replace(/>/g, '&gt;');
    }

    export default {encode, decode};
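
The surrogate-pair arithmetic in encode can be checked against the built-in String.prototype.codePointAt. Below is a small sketch, where surrogatePairToCodePoint is a hypothetical helper name that isolates just that formula:

```javascript
// Hypothetical helper: combines a UTF-16 surrogate pair into one code point.
function surrogatePairToCodePoint(pair) {
    var hi = pair.charCodeAt(0);   // high surrogate, 0xD800-0xDBFF
    var low = pair.charCodeAt(1);  // low surrogate, 0xDC00-0xDFFF
    return (hi - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
}

var smiley = '\uD83D\uDE00'; // U+1F600, stored as two UTF-16 code units

var codePoint = surrogatePairToCodePoint(smiley);
console.log(codePoint);                          // 128512
console.log(codePoint === smiley.codePointAt(0)); // true
console.log('&#' + codePoint + ';');             // &#128512;
```

Without this step, each half of the pair would be emitted as its own numeric entity, and neither half is a valid character on its own.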
