Is there any easy way to remove all HTML tags or ANYTHING HTML related from a string?
For example:
string title = "<b> Hulk Hogan's Celebrity Championship Wrestling <font color=\"#228b22\">[Proj # 206010]</font></b> (Reality Series, )"
The above should really be:
"Hulk Hogan's Celebrity Championship Wrestling [Proj # 206010] (Reality Series)"
You can parse the string using Html Agility pack and get the InnerText.
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(@"<b> Hulk Hogan's Celebrity Championship Wrestling <font color=\"#228b22\">[Proj # 206010]</font></b> (Reality Series, )");
string result = htmlDoc.DocumentNode.InnerText;
You can use the below code on your string and you will get the complete string without html part.
string title = "<b> Hulk Hogan's Celebrity Championship Wrestling <font color=\"#228b22\">[Proj # 206010]</font></b> (Reality Series, )".Replace(" ",string.Empty);
string s = Regex.Replace(title, "<.*?>", String.Empty);
Source: Stackoverflow.com