[javascript] Preventing HTML and Script injections in Javascript

Assume I have a page with an input box. The user types something into the input box and hits a button. The button triggers a function that picks up the value typed into the text box and outputs it onto the page beneath the text box for whatever reason.

Now this has been disturbingly difficult to find a definitive answer on or I wouldn't be asking but how would you go about outputting this string:

<script>alert("hello")</script> <h1> Hello World </h1>

So that neither the script is executed nor the HTML element is displayed?

What I'm really asking here is if there is a standard method of avoiding both HTML and Script injection in Javascript. Everyone seems to have a different way of doing it (I'm using jQuery so I know I can simply output the string to the text element rather than the html element for instance, that's not the point though).

This question is related to javascript html javascript-injection html-injections

The answer is


From here

var string="<script>...</script>";
string=encodeURIComponent(string); // %3Cscript%3E...%3C/script%3

Try this method to convert a 'string that could potentially contain html code' to 'text format':

$msg = "<div></div>";
$safe_msg = htmlspecialchars($msg, ENT_QUOTES);
echo $safe_msg;

Hope this helps!


myDiv.textContent = arbitraryHtmlString 

as @Dan pointed out, do not use innerHTML, even in nodes you don't append to the document because deffered callbacks and scripts are always executed. You can check this https://gomakethings.com/preventing-cross-site-scripting-attacks-when-using-innerhtml-in-vanilla-javascript/ for more info.


Use this,

function restrict(elem){
  var tf = _(elem);
  var rx = new RegExp;
  if(elem == "email"){
       rx = /[ '"]/gi;
  }else if(elem == "search" || elem == "comment"){
    rx = /[^a-z 0-9.,?]/gi;
  }else{
      rx =  /[^a-z0-9]/gi;
  }
  tf.value = tf.value.replace(rx , "" );
}

On the backend, for java , Try using StringUtils class or a custom script.

public static String HTMLEncode(String aTagFragment) {
        final StringBuffer result = new StringBuffer();
        final StringCharacterIterator iterator = new
                StringCharacterIterator(aTagFragment);
        char character = iterator.current();
        while (character != StringCharacterIterator.DONE )
        {
            if (character == '<')
                result.append("&lt;");
            else if (character == '>')
                result.append("&gt;");
            else if (character == '\"')
                result.append("&quot;");
            else if (character == '\'')
                result.append("&#039;");
            else if (character == '\\')
                result.append("&#092;");
            else if (character == '&')
                result.append("&amp;");
            else {
            //the char is not a special one
            //add it to the result as is
                result.append(character);
            }
            character = iterator.next();
        }
        return result.toString();
    }

A one-liner:

var encodedMsg = $('<div />').text(message).html();

See it work:

https://jsfiddle.net/TimothyKanski/wnt8o12j/


I use this function htmlentities($string):

$msg = "<script>alert("hello")</script> <h1> Hello World </h1>"
$msg = htmlentities($msg);
echo $msg;