Remove HTML Tags in Javascript with Regex

Question

I am trying to remove all the HTML tags from a string in JavaScript. Here's what I have. I can't figure out why it's not working. Does anyone know what I am doing wrong?

    <script type="text/javascript">
    var regex = "/<(.|\n)*?>/";
    var body = "<p>test</p>";
    var result = body.replace(regex, "");
    alert(result);
    </script>

Thanks a lot!

User · Accepted Answer

Try this, noting that the grammar of HTML is too complex for regular expressions to be correct 100% of the time:

    var regex = /(<([^>]+)>)/ig,
        body = "<p>test</p>",
        result = body.replace(regex, "");

    console.log(result);

If you're willing to use a library such as jQuery, you could simply do this:

    console.log($('<p>test</p>').text());
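
As for why the code in the question does nothing: its pattern is wrapped in quotes, so `replace` receives an ordinary string and searches for that literal text instead of treating it as a regular expression. A minimal sketch of the difference, using the question's own input (the variable names `asString` and `asRegex` are only illustrative):

```javascript
// Same input as in the question.
const body = "<p>test</p>";

// A quoted pattern is an ordinary string: replace() looks for that literal text,
// finds nothing, and returns the input unchanged.
const asString = body.replace("/<(.|\n)*?>/", "");

// A regex literal with the g flag removes every match.
const asRegex = body.replace(/<[^>]+>/g, "");

console.log(asString); // "<p>test</p>"
console.log(asRegex);  // "test"
```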

User · Answer

This is a solution for HTML tags and entities such as &nbsp;. You can add or remove alternatives in the pattern to get the text without HTML, and you can replace the matches with anything you like:

    function convertHtmlToText(passHtmlBlock) {
        // Coerce the input to a string before matching
        var str = passHtmlBlock.toString();
        return str.replace(/<[^>]*(>|$)|&nbsp;|&zwnj;|&raquo;|&laquo;|&gt;/g, 'ReplaceIfYouWantOtherWiseKeepItEmpty');
    }
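
A variation on this idea, as a minimal sketch: instead of replacing every match with the same text, map each listed entity to the character it stands for and drop only the tags. The helper name `htmlToPlainText` and the lookup table `ENTITY_TEXT` are hypothetical, and the table covers only the entities named in this answer, so this is not a complete entity decoder.

```javascript
// Hypothetical lookup table: only the entities named in the answer above.
const ENTITY_TEXT = {
  "&nbsp;": " ",
  "&zwnj;": "",        // zero-width non-joiner: nothing visible to keep
  "&raquo;": "\u00BB", // right-pointing double angle quotation mark
  "&laquo;": "\u00AB", // left-pointing double angle quotation mark
  "&gt;": ">",
};

// Hypothetical helper: drop tags, translate the known entities.
function htmlToPlainText(html) {
  return String(html).replace(
    /<[^>]*(>|$)|&nbsp;|&zwnj;|&raquo;|&laquo;|&gt;/g,
    (match) =>
      Object.prototype.hasOwnProperty.call(ENTITY_TEXT, match)
        ? ENTITY_TEXT[match]
        : ""
  );
}

console.log(htmlToPlainText("<p>1&nbsp;&gt;&nbsp;0</p>")); // "1 > 0"
```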

User · Answer

Here is how TextAngular (WYSIWYG editor) is doing it. I also found this to be the most consistent answer, which is NO REGEX.

    /*
     @license textAngular
     Author : Austin Anderson
     License : 2013 MIT
     Version 1.5.16
    */
    // turn html into pure text that shows visibility
    function stripHtmlToText(html) {
        var tmp = document.createElement("DIV");
        tmp.innerHTML = html;
        var res = tmp.textContent || tmp.innerText || '';
        // remove zero width spaces; replace() returns a new string, so keep the result
        res = res.replace(/\u200B/g, '');
        res = res.trim();
        return res;
    }

User · Answer

This is an old question, but I stumbled across it and thought I'd share the method I used:

    var body = '<div id="anid">some <a href="link">text</a></div> and some more text';
    var temp = document.createElement("div");
    temp.innerHTML = body;
    var sanitized = temp.textContent || temp.innerText;
    // sanitized will now contain: "some text and some more text"

Simple, no jQuery needed, and it shouldn't let you down even in more complex cases.

User · Answer

    <html>
    <head>
    <script type="text/javascript">
    function striptag() {
        var html = /(<([^>]+)>)/gi;
        // Strip tags from the value of every form field passed in
        for (var i = 0; i < arguments.length; i++)
            arguments[i].value = arguments[i].value.replace(html, "");
    }
    </script>
    </head>
    <body>
    <form name="myform">
    <textarea class="comment" title="comment" name="comment" rows="4" cols="40"></textarea><br>
    <input type="button" value="Remove HTML Tags" onClick="striptag(this.form.comment)">
    </form>
    </body>
    </html>

User · Answer

The way I do it is practically a one-liner.

The function creates a Range object and then creates a DocumentFragment in the Range with the string as the child content.

Then it grabs the text of the fragment, removes any "invisible"/zero-width characters, and trims it of any leading/trailing white space.

I realize this question is old, I just thought my solution was unique and wanted to share.

    function getTextFromString(htmlString) {
        return document
            .createRange()
            // Creates a fragment and turns the supplied string into HTML nodes
            .createContextualFragment(htmlString)
            // Gets the text from the fragment
            .textContent
            // Removes the Zero-Width Space, Zero-Width Non-Joiner, Zero-Width Joiner,
            // Zero-Width No-Break Space, Left-To-Right Mark, and Right-To-Left Mark characters
            .replace(/[\u200B-\u200D\uFEFF\u200E\u200F]/g, '')
            // Trims off any extra space on either end of the string
            .trim();
    }

    var cleanString = getTextFromString('<p>Hello world! I <em>love</em> <strong>JavaScript</strong>!!!</p>');

    alert(cleanString);
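
The cleanup step at the end of that chain does not depend on the DOM, so it can be tried on its own. A minimal sketch, assuming a hypothetical helper name `removeInvisible` and a made-up sample string:

```javascript
// Hypothetical helper: the same character class and trim as in the answer above.
function removeInvisible(text) {
  return text.replace(/[\u200B-\u200D\uFEFF\u200E\u200F]/g, "").trim();
}

// Sample text with a zero-width space, a zero-width no-break space,
// and padding on both ends.
const raw = "  Hello\u200B world\uFEFF  ";

console.log(removeInvisible(raw));        // "Hello world"
console.log(raw.length);                  // 17
console.log(removeInvisible(raw).length); // 11
```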

User · Answer

Like others have stated, regex will not work. Take a moment to read my article about why you cannot and should not try to parse HTML with regex, which is what you're doing when you're attempting to strip HTML from your source string.
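
One concrete failure case, as a sketch: a tag-stripping pattern such as the one in the accepted answer stops at the first `>` it sees, even when that character sits inside a quoted attribute value. The sample markup below is made up for illustration.

```javascript
// The pattern from the accepted answer.
const tagPattern = /(<([^>]+)>)/ig;

// A ">" inside the title attribute ends the first match too early,
// so part of the tag is left behind in the output.
const html = '<a title="a > b">link</a>';
const stripped = html.replace(tagPattern, "");

console.log(stripped); // ' b">link'
```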

User · Answer

You can use a powerful string-handling library, underscore.string.js:

    _('a <a href="#">link</a>').stripTags()
    => 'a link'

    _('a <a href="#">link</a><script>alert("hello world!")</script>').stripTags()
    => 'a linkalert("hello world!")'

Don't forget to import the library as follows:

    <script src="underscore.js" type="text/javascript"></script>
    <script src="underscore.string.js" type="text/javascript"></script>
    <script type="text/javascript"> _.mixin(_.str.exports())</script>

User · Answer

For a proper HTML sanitizer in JS, see http://code.google.com/p/google-caja/wiki/JsHtmlSanitizer

User · Answer

The selected answer doesn't always ensure that HTML is stripped, as it's still possible to construct an invalid HTML string through it by crafting a string like the following:

    "<<h1>h1>foo<<//</h1>h1/>"

This input will ensure that the stripping assembles a set of tags for you and will result in:

    "<h1>foo</h1>"

Additionally, jQuery's text function will strip text not surrounded by tags.

Here's a function that uses jQuery but should be more robust against both of these cases:

    var stripHTML = function(s) {
        var lastString;

        do {
            s = $('<div>').html(lastString = s).text();
        } while (lastString !== s)

        return s;
    };
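
The repeat-until-stable idea in that function can be sketched without jQuery. Two assumptions here: the `TAG` pattern is a stand-in stripper that removes only well-formed-looking tags (it is neither the accepted answer's pattern nor what jQuery does), and the helper name `stripUntilStable` and the crafted input are hypothetical.

```javascript
// Stand-in stripper (an assumption): removes only well-formed-looking tags.
const TAG = /<\/?[a-z][^<>]*>/gi;

// Hypothetical helper: strip repeatedly until a pass changes nothing.
function stripUntilStable(input) {
  let current = String(input);
  let previous;
  do {
    previous = current;
    current = current.replace(TAG, "");
  } while (current !== previous);
  return current;
}

// Removing the inner <b> and </b> reassembles a script element.
const crafted = "<<b>script>alert(1)<</b>/script>";

console.log(crafted.replace(TAG, ""));  // "<script>alert(1)</script>"
console.log(stripUntilStable(crafted)); // "alert(1)"
```

A single pass is not enough for this input because the removal itself creates new tags; looping to a fixed point closes that gap for this particular stripper.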

User · Answer

My simple JavaScript library called FuncJS has a function called "strip_tags()" which does the task for you, without requiring you to enter any regular expressions.

For example, say that you want to remove tags from a sentence. With this function, you can do it simply like this:

    strip_tags("This string <em>contains</em> <strong>a lot</strong> of tags!");

This will produce "This string contains a lot of tags!".

For a better understanding, please do read the documentation at GitHub FuncJS.

Additionally, if you'd like, please provide some feedback through the form. It would be very helpful to me!

User · Answer

This worked for me:

    var regex = /(&nbsp;|<([^>]+)>)/ig,
        body = tt,
        result = body.replace(regex, "");

    alert(result);
